Pathways, Networks and Systems Biology OR “what do I do with my gene list?” BMI 705


Page 1: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Pathways, Networks and Systems Biology

OR “what do I do with my gene list?”

BMI 705 Kun Huang

Department of Biomedical Informatics, Ohio State University

Page 2: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Gene Enrichment Analysis
• Gene Ontology / Pathways / Networks
• Databases and Resources

Gene Regulation (cis-)Networks

Challenges in systems biology
• New computation and modeling methods
• Kinetics vs. dynamics

Scale-Free Network and Network Motifs

Page 3: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Where do I get the gene list?

• Comparative study
e.g., microarray experiments between two types of samples or two disease states (can also be from RT-PCR, proteomics, …)

• Clustering / classification of genes
e.g., co-expressed genes

• Homologue analysis
e.g., genes from BLAST

• Other sources

Page 4: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

What do I do with the gene list?

• Find commonality among the genes
Common biological functions
Common molecular processes
Common cellular components
Common pathways
Interact with common genes
Common sequences / molecular structures
Regulated by common Transcription Factors
Involved in the same disease
…

• Generate new hypotheses based on the commonality

Page 5: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

How do I find commonality from my gene list?

• Using a priori knowledge (e.g., gene ontology, pathway, annotation, etc.)

• Fisher’s exact test (based on the hypergeometric distribution; the chi-square test is a large-sample approximation)

• Other statistical methods

• Good news – most of the time you can use software to do it

How significant is the intersection?
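The significance of the intersection can be sketched with a one-sided Fisher's exact test, which reduces to a hypergeometric tail probability. This is a minimal illustration using only the Python standard library; the function name, argument names, and the gene counts in the usage lines are hypothetical and not taken from the lecture.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of genes in the background (e.g., all genes on the array)
    annotated:  number of background genes carrying the annotation (e.g., a GO term)
    selected:   size of the gene list
    overlap:    number of genes in the list that carry the annotation
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability of seeing `overlap` or more annotated genes by chance.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20000 background genes, 200 in a pathway,
# a list of 100 genes, 8 of which fall in the pathway.
p = enrichment_p_value(20000, 200, 100, 8)
print(f"enrichment p-value: {p:.3g}")
```

When many annotation terms are tested at once, the resulting p-values need a multiple-testing correction, which tools such as DAVID report alongside the raw value.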

Page 6: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

What software is available?
• Many
• DAVID (http://david.abcc.ncifcrf.gov/)
• Cytoscape
• GOTerm
• BiNGO
• GSEA
• GenMAPP (Free)
• Pathway Architect (Commercial)
• Pathway Studio (Commercial)
• Ingenuity Pathway Analysis (Commercial)
• Manually curated
• On-demand computation

Page 7: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Genes Functions, pathways and networks

Page 8: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Pathway – What’s out there?

240

Page 9: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705
Page 10: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705
Page 11: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Ingenuity Pathway Analysis (IPA)

Page 12: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705
Page 13: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Gene Enrichment Analysis
• Gene Ontology / Pathways / Networks
• Databases and Resources

Gene Regulation (cis-)Networks

Challenges in systems biology
• New computation and modeling methods
• Kinetics vs. dynamics

Scale-Free Network and Network Motifs

Page 14: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Transcription in higher eukaryotes

Adapted from Wasserman & Sandelin, 2004, Nature Rev. Genetics

TFBS: Transcription Factor Binding Sites

proximal promoter region

distal promoter region

Gene Expression

1. Chromatin structure

2. Initiation of transcription

3. Processing of transcripts

4. Transport to cytoplasm

5. mRNA translation

6. mRNA stability

7. Protein activity and stability

Page 15: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Characterization of transcriptional regulation

Annotating regulatory regions (TSS and Promoter)

Identifying cis-regulatory modules

Deciphering logic of regulatory networks

Page 16: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Transcriptional regulatory module

• cis-regulatory elements are sequence-specific regions to which transcription factors (TFs) bind

Example binding sequences (from figure): AGGCTA, CGGTTAAG, GCTAACGC

• TFs associate combinatorially with each other to form modules and regulate their target genes

Page 17: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Gene regulatory network

Page 18: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Identify Cis-Regulatory Elements

• TFs bind to cis-acting regulatory elements (CAREs).
• CAREs are DNA motifs of length 5 – 20 (e.g., 5’ CGGnnnnnnnnnnnCCG 3’, the binding site for the yeast TF Gal4).
• Most CAREs are in the 5’ vicinity of the gene (promoter), but some have been identified downstream.
• Algorithms focus on identifying common motifs.
  • Word counts.
  • Probabilistic methods (weight matrix, combined with EM search).
  • Phylogenetic footprinting.
• Other features: CpG islands.
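The simplest word-based search can be sketched as a regular-expression scan for the Gal4 pattern quoted above, assuming the spacer is eleven unconstrained bases. The function name and the test sequence are hypothetical. Because the pattern CGG-N11-CCG reads the same on the reverse complement strand, scanning one strand is enough for this particular site.

```python
import re

# CGG, eleven arbitrary bases, CCG. The lookahead lets overlapping sites be reported.
GAL4_SITE = re.compile(r"(?=(CGG[ACGT]{11}CCG))")


def find_sites(sequence):
    """Return (0-based start, matched site) for every match, including overlapping ones."""
    return [(m.start(), m.group(1)) for m in GAL4_SITE.finditer(sequence.upper())]


# Hypothetical promoter fragment containing one site.
promoter = "ttagCGGattagaagccgCCGagcg"
for start, site in find_sites(promoter):
    print(start, site)
```

An exact pattern like this misses degenerate sites, which is why the probabilistic (weight matrix) methods listed above are usually preferred for real binding-site prediction.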

Page 19: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Example – from JASPAR Database

• AGL3

A [ 0  3 79 40 66 48 65 11 65  0 ]
C [94 75  4  3  1  2  5  2  3  3 ]
G [ 1  0  3  4  1  0  5  3 28 88 ]
T [ 2 19 11 50 29 47 22 81  1  6 ]
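A count matrix like the AGL3 example is typically turned into a log-odds weight matrix and slid along a sequence. The sketch below shows one common way to do that; the pseudocount of 1, the uniform background of 0.25, and all function names are assumptions made for illustration, not settings taken from JASPAR or from the lecture.

```python
from math import log2

# AGL3 counts as listed on the slide (rows A, C, G, T; ten positions).
COUNTS = {
    "A": [0, 3, 79, 40, 66, 48, 65, 11, 65, 0],
    "C": [94, 75, 4, 3, 1, 2, 5, 2, 3, 3],
    "G": [1, 0, 3, 4, 1, 0, 5, 3, 28, 88],
    "T": [2, 19, 11, 50, 29, 47, 22, 81, 1, 6],
}
WIDTH = len(COUNTS["A"])


def consensus(counts):
    """Most frequent base at each position."""
    return "".join(max("ACGT", key=lambda b: counts[b][i]) for i in range(WIDTH))


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Per-position log2(frequency / background), with a pseudocount to avoid log(0)."""
    matrix = []
    for i in range(WIDTH):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append(
            {b: log2((counts[b][i] + pseudocount) / total / background) for b in "ACGT"}
        )
    return matrix


def score(matrix, site):
    """Sum of position-specific log-odds for a site of length WIDTH."""
    return sum(column[base] for column, base in zip(matrix, site))


def best_hit(matrix, sequence):
    """(score, 0-based start) of the best-scoring window; sequence must be >= WIDTH long."""
    windows = range(len(sequence) - WIDTH + 1)
    return max((score(matrix, sequence[i:i + WIDTH]), i) for i in windows)


pwm = log_odds_matrix(COUNTS)
print(consensus(COUNTS))
print(best_hit(pwm, "GGGCCATAAATAGTTT"))
```

In practice a score threshold is chosen (for example from the score distribution on background sequence) and every window above it is reported, rather than only the single best hit.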

Page 20: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Example Workflow

i candidate Motifs

Screen against TRANSFAC

n final known and novel Motifs

Gene list

Ab Initio Motifs Discovery Programs (Weeder and MEME)

Question : How do you extract upstream sequences for genes?

Extract promoter sequences

Multiple sequence alignment

Manual selection
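One possible answer to the question of extracting upstream sequences, as a minimal sketch: given a chromosome sequence and the position and strand of a transcription start site (TSS), take the bases immediately 5' of the TSS, reverse-complementing for genes on the minus strand. The function names, the 0-based coordinate convention, and the toy sequence are assumptions; real coordinates would come from a genome annotation.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A<->T, C<->G)."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_sequence(chromosome, tss, strand, length=1000):
    """Return up to `length` bases 5' of the TSS, read 5'->3' on the gene's strand.

    chromosome: plus-strand sequence as a string
    tss:        0-based index of the transcription start site on the plus strand
    strand:     '+' or '-'
    """
    if strand == "+":
        start = max(0, tss - length)          # clip at the chromosome start
        return chromosome[start:tss]
    if strand == "-":
        end = min(len(chromosome), tss + 1 + length)  # clip at the chromosome end
        return reverse_complement(chromosome[tss + 1:end])
    raise ValueError("strand must be '+' or '-'")


# Toy example on a ten-base "chromosome".
toy = "ACGTTGCAAT"
print(upstream_sequence(toy, 4, "+", length=3))
print(upstream_sequence(toy, 4, "-", length=3))
```

The extracted promoter sequences for every gene in the list are then the input to the ab initio motif discovery step.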

Page 21: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

ChIPMotifs (from Dr. Victor Jin)

k Top Level Sequences

Ab Initio Motifs Discovery Programs (Weeder and MEME)

i candidate Motifs

Bootstrap re-sampling approach to determine optimal cutoff of Motifs and screen against non-enrichment sequences

m final statistically significant candidate Motifs

Screen against TRANSFAC

n final known and novel Motifs

k>i>m>n

Question : How do you extract upstream sequences for genes?

Page 22: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Gene Enrichment Analysis
• Gene Ontology / Pathways / Networks
• Databases and Resources

Gene Regulation (cis-)Networks

Challenges in systems biology
• New computation and modeling methods
• Kinetics vs. dynamics

Scale-Free Network and Network Motifs

Page 23: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Biology
Domain knowledge
• Hypothesis testing
Experimental work
• Genetic manipulation
• Quantitative measurement
• Validation

System Sciences
Theory
Analysis
Modeling
• Synthesis/prediction
• Simulation
• Hypothesis generation

Informatics
Data management
• Database
Computational infrastructure
• Modeling tools
• High performance computing
Visualization

System Biology

Understanding! Prediction!

Page 24: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

“A key element of the GTL program is an integrated computing and technology infrastructure, which is essential for timely and affordable progress in research and in the development of biotechnological solutions. In fact, the new era of biology is as much about computing as it is about biology. Because of this synergism, GTL is a partnership between our two offices within DOE’s Office of Science—the Offices of Biological and Environmental Research and Advanced Scientific Computing Research.

Only with sophisticated computational power and information management can we apply new technologies and the wealth of emerging data to a comprehensive analysis of the intricacies and interactions that underlie biology. Genome sequences furnish the blueprints, technologies can produce the data, and computing can relate enormous data sets to models linking genome sequence to biological processes and function.”

Page 25: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Taniguchi et al. Nature Reviews Molecular Cell Biology 7, 85–96 (February 2006) | doi:10.1038/nrm1837

Page 26: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Challenges in systems biology
• Large data
• Kinetics vs. dynamics
• Multiple (temporal) scale
• New computation and modeling methods
• New mathematics or new physics laws

Page 27: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705
Page 28: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705


Oscillation

Maeda et al., Science, 304(5672):875-878, 2004

Page 29: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Simple Two Nodes Pattern

Bistable dynamics in a two-gene system with cross-regulation. A. Gene regulatory circuit diagram. Blunt arrows indicate mutual inhibition of genes X and Y. Dashed arrows indicate a basal synthesis (affected by the inhibition) and an independent first-order degradation of the factors. B. Two-dimensional XY phase plane representing the typical dynamics of the circuit. Every point (X, Y) represents a momentary state defined by the values of the pair X, Y. Red arrows are gradient vectors indicating the direction and extent that the system will move to within a unit time at each of the (X, Y) positions. Collectively, the vector field gives rise to a "potential landscape", visualized by the colored contour lines (numerical approximation). In this "epigenetic landscape", the stable states (attractors) are in the lowest points in the valleys: a (X>>Y) and b (Y>>X) (gray dots). C. Schematic representation of the epigenetic landscape as a section through a and b in which every red dot represents a cell. Experimentally, this bistability is manifested as a bimodal distribution in flow cytometry histograms in which the stable states a and b appear as peaks at the respective level of marker expression (e.g., Y).

Chang et al., Multistable and multistep dynamics in neutrophil differentiation, BMC Cell Biology 2006, 7:11
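The bistability described in the caption can be reproduced with a toy model of two mutually inhibiting genes, each with inhibited synthesis and first-order degradation. The Hill-type inhibition term, the parameter values, and the forward-Euler integration below are assumptions chosen for illustration; they are not the equations or parameters of Chang et al.

```python
def simulate_toggle(x0, y0, synthesis=10.0, hill=2, dt=0.01, steps=5000):
    """Integrate dX/dt = s/(1+Y^n) - X and dY/dt = s/(1+X^n) - Y with forward Euler.

    Returns the state (X, Y) after `steps` steps of size `dt`.
    """
    x, y = x0, y0
    for _ in range(steps):
        dx = synthesis / (1.0 + y ** hill) - x   # synthesis inhibited by Y, degradation of X
        dy = synthesis / (1.0 + x ** hill) - y   # synthesis inhibited by X, degradation of Y
        x, y = x + dt * dx, y + dt * dy
    return x, y


# Two nearby kinds of starting point end up in different attractors.
print(simulate_toggle(3.0, 1.0))   # starts with X > Y
print(simulate_toggle(1.0, 3.0))   # starts with Y > X
```

Starting with X above Y drives the system to the X>>Y state, and the mirror-image start drives it to Y>>X, which corresponds to the two valleys a and b of the landscape in the caption.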

Page 30: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Marlovits et al., Biophysical Chemistry, Vol:72, p.169-184

Page 31: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Pomerening et al., Cell, Vol:122(4), p.565-578

Page 32: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

New system biology

Kinetics vs. Dynamics

Compartmentalization (Spatial and Temporal)

Hybrid Systems and System Abstraction
• Hierarchical/multiscale description
• Discrete Event System
• New System Theory

Graph Theory and Network Theory / New Mathematics and New Physics

Page 33: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Gene Enrichment Analysis
• Gene Ontology / Pathways / Networks
• Databases and Resources

Gene Regulation (cis-)Networks

Challenges in systems biology
• New computation and modeling methods
• Kinetics vs. dynamics

Scale-Free Network and Network Motifs

Page 34: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

A Tale of Two Groups

A.-L. Barabási at University of Notre Dame

Ten Most Cited Publications:

Albert-László Barabási and Réka Albert, Emergence of scaling in random networks, Science 286, 509-512 (1999). [cond-mat/9910332]

Réka Albert and Albert-László Barabási, Statistical mechanics of complex networks, Reviews of Modern Physics 74, 47-97 (2002). [cond-mat/0106096]

H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.-L. Barabási, The large-scale organization of metabolic networks, Nature 407, 651-654 (2000). [cond-mat/0010278]

R. Albert, H. Jeong, and A.-L. Barabási, Error and attack tolerance in complex networks, Nature 406, 378 (2000). [cond-mat/0008064]

R. Albert, H. Jeong, and A.-L. Barabási, Diameter of the World Wide Web, Nature 401, 130-131 (1999). [cond-mat/9907038]

H. Jeong, S. Mason, A.-L. Barabási and Zoltan N. Oltvai, Lethality and centrality in protein networks, Nature 411, 41-42 (2001).

E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barabási, Hierarchical organization of modularity in metabolic networks, Science 297, 1551-1555 (2002). [cond-mat/0209244]

A.-L. Barabási, R. Albert, and H. Jeong, Mean-field theory for scale-free random networks, Physica A 272, 173-187 (1999). [cond-mat/9907068]

Réka Albert and Albert-László Barabási, Topology of evolving networks: Local events and universality, Physical Review Letters 85, 5234 (2000). [cond-mat/0005085]

Albert-László Barabási and Zoltán N. Oltvai, Network Biology: Understanding the cell's functional organization, Nature Reviews Genetics 5, 101-113 (2004).

Page 35: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

A Tale of Two Groups

Uri Alon at Weizmann Institute

Selected Publications:

R Milo, S Itzkovitz, N Kashtan, R Levitt, S Shen-Orr, I Ayzenshtat, M Sheffer & U Alon, Superfamilies of designed and evolved networks, Science, 303:1538-42 (2004).

R Milo, S Shen-Orr, S Itzkovitz, N Kashtan, D Chklovskii & U Alon, Network Motifs: Simple Building Blocks of Complex Networks, Science, 298:824-827 (2002).

S Shen-Orr, R Milo, S Mangan & U Alon, Network motifs in the transcriptional regulation network of Escherichia coli, Nature Genetics, 31:64-68 (2002).

S. Mangan, S. Itzkovitz, A. Zaslaver and U. Alon, The Incoherent Feed-forward Loop Accelerates the Response-time of the gal System of Escherichia coli, JMB, Vol 356 pp 1073-81 (2006).

S Mangan & U Alon, Structure and function of the feed-forward loop network motif, PNAS, 100:11980-11985 (2003).

S. Mangan, A. Zaslaver and U. Alon, The Coherent Feedforward Loop Serves as a Sign-sensitive Delay Element in Transcription Networks, JMB, Vol 334/2 pp 197-204 (2003).

Guy Shinar, Erez Dekel, Tsvi Tlusty & Uri Alon, Rules for biological regulation based on error minimization, PNAS, 103(11), 3999-4004 (2006).

Alon Zaslaver, Avi E Mayo, Revital Rosenberg, Pnina Bashkin, Hila Sberro, Miri Tsalyuk, Michael G Surette & Uri Alon, Just-in-time transcription program in metabolic pathways, Nature Genetics 36, 486-491 (2004).

U. Alon, M.G. Surette, N. Barkai, S. Leibler, Robustness in Bacterial Chemotaxis, Nature 397, 168-171 (1999).

M Ronen, R Rosenberg, B Shraiman & U Alon, Assigning numbers to the arrows: Parameterizing a gene regulation network by using accurate expression kinetics, PNAS, 99:10555–10560 (2002).

N Rosenfeld, M Elowitz & U Alon, Negative Autoregulation Speeds the Response Times of Transcription Networks, JMB, 323:785-793 (2002).

N Rosenfeld & U Alon, Response Delays and the Structure of Transcription Networks, JMB, 329:645–654 (2003).

S. Kalir, J. McClure, K. Pabbaraju, C. Southward, M. Ronen, S. Leibler, M.G. Surette, U. Alon, Ordering genes in a flagella pathway by analysis of expression kinetics from living bacteria, Science, 292:2080-2083 (2001).

Y. Setty, A. E. Mayo, M. G. Surette, and U. Alon, Detailed map of a cis-regulatory input function, PNAS, 100:7702-7707 (2003).

Shiraz Kalir and Uri Alon, Using a Quantitative Blueprint to Reprogram the Dynamics of the Flagella Gene Network, Cell, 117:713–720 (2004).

Page 36: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705
Page 37: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Small world phenomena (http://smallworld.columbia.edu)

$P(k) \sim k^{-\gamma}$

[Figure labels: Expected, Found]

R. Albert, H. Jeong, A-L Barabasi, Nature, 401 130 (1999).

Page 38: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Other Observations:

• Scientific citations
• Paper coauthorship/collaboration
• Organization structure
• Social structure
• Actor joint casting in movies
• Online communities
• Websites linkage
• …
• Protein networks
• Gene networks
• Cell function networks
• …

Page 39: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Scale-Free Networks

Page 40: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Metabolic network

The metabolic networks of organisms from all three domains of life are scale-free!

H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 407 651 (2000)

Archaea Bacteria Eukaryotes

Page 41: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Power Law Small World

Rich Get Richer (preferential attachment)

Self-similarity

HUBS!
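How "rich get richer" growth produces hubs can be sketched with a small preferential-attachment simulation in the style of the Barabási-Albert model. The function names, the fully connected starting core, and the parameter values are assumptions for illustration; this is not code from the cited papers.

```python
import random
from collections import Counter


def preferential_attachment_graph(n_nodes, m_links, seed=0):
    """Grow an undirected network: each new node links to m_links existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a small fully connected core of m_links + 1 nodes.
    edges = [(i, j) for i in range(m_links + 1) for j in range(i)]
    # Every node appears in `stubs` once per link end, so a uniform draw
    # from `stubs` picks a node with probability proportional to its degree.
    stubs = [node for edge in edges for node in edge]
    for new in range(m_links + 1, n_nodes):
        targets = set()
        while len(targets) < m_links:
            targets.add(rng.choice(stubs))
        for target in sorted(targets):
            edges.append((new, target))
            stubs.extend((new, target))
    return edges


def degrees(edges):
    """Degree k of every node."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


deg = degrees(preferential_attachment_graph(2000, 2))
print("five largest degrees:", sorted(deg.values(), reverse=True)[:5])
print("median degree:", sorted(deg.values())[len(deg) // 2])
```

A handful of early nodes accumulate far more links than the typical node, which is the heavy-tailed degree distribution that a power law describes.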

Page 42: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Preferential attachment in protein interaction networks

$\Delta k$ vs. $k$, where $\Delta k$ is the increase in the number of links in a unit time

No PA: $\Delta k$ is independent of $k$

PA: $\Delta k \sim k$

$\Pi(k_i) = \dfrac{k_i}{\sum_j k_j}$

Eisenberg E, Levanon EY, Phys. Rev. Lett. 2003

Jeong, Neda, A.-L.B, Europhys. Lett. 2003

Page 43: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Nature Biotechnology 18, 1257 - 1261 (2000) doi:10.1038/82360 A network of protein−protein interactions in yeast

Benno Schwikowski, Peter Uetz & Stanley Fields

Page 44: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Nature Biotechnology  18, 1257 - 1261 (2000) doi:10.1038/82360 A network of protein−protein interactions in yeast

Benno Schwikowski, Peter Uetz & Stanley Fields

Page 45: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

C. elegans

Li et al. Science 2004

D. melanogaster

Giot et al. Science 2003

Page 46: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Nature 408 307 (2000)

“One way to understand the p53 network is to compare it to the Internet. The cell, like the Internet, appears to be a ‘scale-free network’.”

Consequence 1 : Hubs and Robustness

Page 47: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Consequence 1: Hubs and Robustness

Complex systems maintain their basic functions even under errors and failures (cell mutations; Internet router breakdowns)

[Figure: node failure. Relative size S (0 to 1) vs. fraction of removed nodes f (0 to 1), with threshold f_c]

Page 48: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Hubs and Robustness

Complex systems maintain their basic functions even under errors and failures (cell mutations; Internet router breakdowns)

R. Albert, H. Jeong, A.L. Barabasi, Nature 406 378 (2000)

Page 49: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Achilles’ Heel of complex networks

Internet: failure vs. attack

R. Albert, H. Jeong, A.L. Barabasi, Nature 406 378 (2000)
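The contrast between random failure and targeted attack can be sketched by removing the same number of nodes from a preferential-attachment network in two ways and comparing the largest connected cluster that remains. Everything here is an illustrative assumption: the network generator, the sizes, the removed fraction, and the simplification that the attack removes the highest-degree nodes in one step based on their initial degrees.

```python
import random
from collections import defaultdict, deque


def scale_free_adjacency(n_nodes, m_links, rng):
    """Preferential-attachment network as an adjacency dict {node: set(neighbors)}."""
    adj = defaultdict(set)
    stubs = []  # each node listed once per link end (degree-proportional sampling)
    for i in range(m_links + 1):
        for j in range(i):
            adj[i].add(j)
            adj[j].add(i)
            stubs += [i, j]
    for new in range(m_links + 1, n_nodes):
        targets = set()
        while len(targets) < m_links:
            targets.add(rng.choice(stubs))
        for target in targets:
            adj[new].add(target)
            adj[target].add(new)
            stubs += [new, target]
    return adj


def largest_cluster_fraction(adj, removed):
    """Size of the largest connected cluster after removal, relative to the original size."""
    alive = set(adj) - removed
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search over surviving nodes
            node = queue.popleft()
            size += 1
            for neighbor in adj[node]:
                if neighbor in alive and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best / len(adj)


def compare(n_nodes=1000, m_links=2, fraction=0.1, seed=1):
    """Return (S after random failure, S after hub attack) for the same removed fraction."""
    rng = random.Random(seed)
    adj = scale_free_adjacency(n_nodes, m_links, rng)
    k = int(fraction * n_nodes)
    failed = set(rng.sample(sorted(adj), k))
    hubs = set(sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k])
    return largest_cluster_fraction(adj, failed), largest_cluster_fraction(adj, hubs)


s_failure, s_attack = compare()
print(f"S after random failure: {s_failure:.2f}")
print(f"S after hub attack:     {s_attack:.2f}")
```

Random removal mostly hits low-degree nodes and leaves the network connected, while removing the hubs fragments it much faster, which is the "Achilles' heel" of the slide title.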

Page 50: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Yeast protein network: lethality and topological position

Highly connected proteins are more essential (lethal)...

H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42 (2001)

Page 51: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

Subgraphs

• Subgraph: a connected graph consisting of a subset of the nodes and links of a network

• Subgraph properties:
n: number of nodes
m: number of links

Examples (from figure): (n=3,m=3), (n=3,m=2), (n=4,m=4), (n=4,m=5)
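Counting subgraphs by their (n, m) signature can be sketched for the n=3 case: every set of three nodes joined by two links is an open chain (n=3, m=2), and every set joined by three links is a triangle (n=3, m=3). The function name and the toy network are hypothetical, and the brute-force enumeration over all node triples is only practical for small networks.

```python
from itertools import combinations


def three_node_subgraph_counts(edges):
    """Count connected 3-node subgraphs of an undirected network by number of links m."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    counts = {2: 0, 3: 0}
    for trio in combinations(sorted(adj), 3):
        m = sum(1 for u, v in combinations(trio, 2) if v in adj[u])
        if m >= 2:  # three nodes joined by at least two links are always connected
            counts[m] += 1
    return counts


# Toy network: a triangle 1-2-3 with a pendant node 4 attached to node 3.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(three_node_subgraph_counts(toy_edges))
```

Motif analysis goes one step further and compares such counts with the counts in randomized networks that keep the same degree sequence, so that only over-represented subgraphs are called motifs.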

Page 52: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

R Milo et al., Science 298, 824-827 (2002).

Page 53: Pathways, Networks and Systems Biology OR  “what do I do with my gene list?” BMI 705

System biology

• Integration

• Computation

• Theory

Prediction!!!