k-PathA: k-shortest Path Algorithm (pronounce as “Qué Pasa”). Alexander Ullrich and Christian V. Forst. Chair for Bioinformatics, University of Leipzig; Department of Clinical Sciences, University of Texas Southwestern Medical Center. HiBi 2009, Trento, October 14-16.

Page 1

k-PathA: k-shortest Path Algorithm

(pronounced “Qué Pasa”)

Alexander Ullrich and Christian V. Forst

Chair for Bioinformatics, University of Leipzig

Department of Clinical Sciences, University of Texas Southwestern Medical Center

HiBi 2009, Trento, October 14-16

Page 2

The Situation

• Situation: Tons of available data on different levels

• Challenge: Derive useful information about the structure and behavior of biological systems

Page 3

The Systems-Biology Approach

• Before SB: extensive research on single components

• SB says: interactions between components are more important

• SB looks beyond single genes, proteins, etc.

• SB looks at networks because they can give more insights

• SB combines data from different levels, components, and conditions

Page 4

Gene Expression

• Cell response to external conditions

• Expression Change = Control vs Experiment

• Biomarkers = Genes with high Expression Change

• Correlated genes are likely to share function

• Gene expression profiling and gene clustering show some success

• But ...

Page 5

Response Networks

Page 6

Gene Clusters vs Response Networks

• Genes in Response networks do not have to be active under all conditions

• Response networks are subject to the constraints of the molecular interaction network

• Non-correlated Genes can also be included if they connect correlated genes

• Response networks of different conditions can be combined (response networks of new drugs can be compared to those of known drugs)

Page 7

Response Networks - Computation

• Generated Sub-Networks have to be scored and compared against random Sub-Networks

• Computationally costly, hence Heuristics

• Still costly for large Networks, hence parallel Heuristics
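One way to read the comparison against random Sub-Networks is as an empirical significance test. The following is a minimal sketch under stated assumptions, not the method from the talk: the score (`mean_weight`) is a hypothetical stand-in, and the random Sub-Networks are drawn as plain node sets of the same size, ignoring the connectivity that a real sub-network would have to respect.

```python
import random


def mean_weight(nodes, weights):
    """Hypothetical score: average node weight of a node set."""
    return sum(weights[n] for n in nodes) / len(nodes)


def empirical_p_value(candidate, weights, n_random=1000, seed=0):
    """Fraction of random node sets scoring at least as high as the candidate.

    Assumption: random sub-networks are same-size node samples drawn
    without regard to network connectivity.
    """
    rng = random.Random(seed)
    observed = mean_weight(candidate, weights)
    all_nodes = sorted(weights)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_nodes, len(candidate))
        if mean_weight(sample, weights) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_random + 1)
```

Every random draw requires a full scoring pass, which is one reason the procedure gets expensive on large networks.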

Page 8

k-PathA Approach - Overview

• Construction of Network

• Choice of Seednodes

• Generation of Sub-Network by spanning k-shortest Paths

• Scoring Sub-Networks

• Optimizing - Iterating

Page 9

Network Construction

Nodes and Edges + Edges + Node- and Edge-Weights = Network

Page 10

Sub-Networks - Preparation

• Choice of the Sub-Network borders = Seed Nodes

• Option 1: User Input

• Option 2: Randomly

• Option 3: Set of Nodes with highest weights

• In the Optimization step the set is changed
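The three seed options can be sketched as a single selection function. This is an illustration only: the function name `choose_seeds`, its parameters, and the representation of node weights as a dictionary are assumptions, not taken from the k-PathA implementation.

```python
import random


def choose_seeds(weights, n_seeds, mode="top", user_seeds=None, rng=None):
    """Pick seed nodes for a sub-network (hypothetical helper).

    weights: dict mapping node name -> node weight
    mode:    "user"   -> Option 1, seeds supplied by the user
             "random" -> Option 2, uniform random choice
             "top"    -> Option 3, nodes with the highest weights
    """
    if mode == "user":
        if not user_seeds:
            raise ValueError("mode='user' needs a non-empty user_seeds list")
        return list(user_seeds)
    if mode == "random":
        rng = rng or random.Random()
        return rng.sample(sorted(weights), n_seeds)
    if mode == "top":
        return sorted(weights, key=weights.get, reverse=True)[:n_seeds]
    raise ValueError(f"unknown mode: {mode!r}")
```

Whatever the initial choice, the set is only a starting point, since the Optimization step replaces seeds in later iterations.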

Page 11

Sub-Networks - Generation

Definition: The k-th shortest simple path problem consists of the determination of a set of simple (loopless) paths between two nodes, $P_k = \{p_1, p_2, \ldots, p_k\} \subseteq P$, such that $\forall p \in P \setminus P_k \wedge p_k \in P_k : \mathrm{cost}(p_k) \le \mathrm{cost}(p)$.

• Spanning k-shortest simple (loopless) Paths

• Forward and Backward Chaining Approach = Search from both Seeds

• Combination of Paths from both sides
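To make the definition concrete, here is a minimal sketch that enumerates the k cheapest simple paths between two nodes. It deliberately swaps in a plain best-first enumeration from one side rather than the forward and backward chaining described on the slide, it assumes non-negative edge costs, and it can take exponential time on dense graphs. The function name and the nested-dictionary graph format are assumptions made for the example.

```python
import heapq
from itertools import count


def k_shortest_simple_paths(graph, source, target, k):
    """Return up to k (cost, path) pairs in nondecreasing cost order.

    graph: dict mapping node -> dict of neighbor -> non-negative edge cost
    """
    tie = count()  # tie-breaker so the heap never has to compare paths
    heap = [(0.0, next(tie), [source])]
    found = []
    while heap and len(found) < k:
        cost, _, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for neighbor, edge_cost in graph.get(node, {}).items():
            if neighbor not in path:  # keep the path simple (loopless)
                heapq.heappush(
                    heap, (cost + edge_cost, next(tie), path + [neighbor])
                )
    return found
```

Because partial paths leave the heap in order of cost and edge costs are non-negative, complete paths are reported cheapest first, so the first k results form the set $P_k$ of the definition.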

Page 12

Sub-Networks - Generation

Page 13

Sub-Networks - Scoring

• Expression values are normalized

• Node weights are ≥ 0

• < 1 down-regulated

• > 1 up-regulated

• $\mathrm{LinkCost}(l_{x,y}) = \log(|x + y|)$

• $\mathrm{PathwayCost}(p) = \dfrac{\sum_{l \in L_p} \mathrm{LinkCost}(l)}{|L|^{k}}$
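The two cost formulas on this slide can be sketched in a few lines. Several readings here are assumptions: x and y are taken to be the weights of the two nodes joined by a link, |L| is taken to be the number of links on the path, and the exponent is exposed as a hypothetical `exponent` parameter because the slide does not say how it is chosen.

```python
import math


def link_cost(x, y):
    """LinkCost for a link whose end nodes have weights x and y.

    Raises ValueError when x + y == 0, where the logarithm is undefined.
    """
    return math.log(abs(x + y))


def pathway_cost(node_weights, exponent=1.0):
    """Sum of link costs along a path, divided by (number of links) ** exponent.

    node_weights: weights of the nodes in path order (assumed representation).
    """
    links = list(zip(node_weights, node_weights[1:]))
    if not links:
        raise ValueError("a path needs at least two nodes")
    total = sum(link_cost(x, y) for x, y in links)
    return total / len(links) ** exponent
```

With `exponent=1.0` the result is the mean link cost, which keeps long and short paths comparable; larger exponents penalize long paths more strongly.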

Page 14

Sub-Networks - Combination

• Combine Sub-Networks for different Conditions

• Can show Similarities of Drug Responses

• $\mathrm{PC}(X, Y) = \dfrac{E(XY) - E(X)E(Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$

• $\mathrm{FPC}(X, Y) = \mathrm{PC}(X_{(2,n)}, Y_{(1,n-1)})$

• $\mathrm{BPC}(X, Y) = \mathrm{PC}(Y_{(2,n)}, X_{(1,n-1)})$
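The three correlation measures can be sketched as follows. The reading of the index notation is an assumption: $X_{(2,n)}$ is taken to mean elements 2 through n of the series X and $Y_{(1,n-1)}$ elements 1 through n-1 of Y, so FPC and BPC become Pearson correlations with a shift of one position in either direction.

```python
import math


def pc(xs, ys):
    """Pearson correlation: (E(XY) - E(X)E(Y)) / sqrt(Var(X) Var(Y)).

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(x * x for x in xs) / n - mean_x * mean_x
    var_y = sum(y * y for y in ys) / n - mean_y * mean_y
    return cov / math.sqrt(var_x * var_y)


def fpc(xs, ys):
    """Forward variant: X shifted one position ahead of Y."""
    return pc(xs[1:], ys[:-1])


def bpc(xs, ys):
    """Backward variant: Y shifted one position ahead of X."""
    return pc(ys[1:], xs[:-1])
```

Note that `fpc(xs, ys)` equals `bpc(ys, xs)` by construction, so the two measures differ only in which series is treated as leading.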

Page 15

Sub-Networks - Optimization

• Change of Seed nodes in every iteration

• Heuristic search: e.g. genetic algorithm or hill climbing
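As an illustration of changing the Seed nodes in every iteration, here is a minimal hill-climbing sketch. It is not the optimizer from the talk: the function name, the single-swap move, and the convention that a higher score is better are all assumptions, and `score` stands for whatever routine generates and scores the Sub-Network for a given seed set.

```python
import random


def hill_climb_seeds(candidates, start, score, iterations=200, seed=0):
    """Swap one seed per iteration and keep the swap only if the score improves.

    candidates: all nodes that may serve as seeds
    start:      initial seed list
    score:      callable taking a seed list, higher is better (assumed)
    """
    rng = random.Random(seed)
    best = list(start)
    best_score = score(best)
    for _ in range(iterations):
        outside = [c for c in candidates if c not in best]
        if not outside:
            break
        trial = list(best)
        trial[rng.randrange(len(trial))] = rng.choice(outside)
        trial_score = score(trial)
        if trial_score > best_score:
            best, best_score = trial, trial_score
    return best, best_score
```

In a real run each call to `score` would trigger a full k-shortest-path search, so the number of iterations directly drives the total cost.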

Page 16

Parallel Implementation

• Communication through message passing

• No shared Memory necessary

• 3 types of Nodes

• Master - Sub-Network Handling

• Forward - Path search and Common Node Checking

• Backward - Path search and Combination of Paths

Page 17

Parallel Implementation Behavior

[Figure: computation time in s (0 to 1000) versus # processors (3 to 15), with curves for l = 8, l = 10, and l = 12]

• For larger Networks or high k and l: it scales linearly

• For small Networks: communication makes up a large part of the computation time

Page 18

Application

• Human-hybrid Network: 45,041 nodes + 438,567 edges

• Human bronchial cells

• Influenza (H5N1) infection (24h)

• Control: Mock infection (24h)

• k = 3, l = 13

Page 19

Application

• Two well-known processes affected by viral infections are present in the response network

• Cell-Cycle Genes (cdk2 and cdc’s) are up-regulated

• Transcription factors known as early responders (fos and jun) are down-regulated

Page 20

Application

• Comparison with another viral infection (LCMV, Djavani et al.)

• Cell Cycle Process is modulated to induce apoptosis in the same way

• Immune Response is targeted similarly

• Moderate differences in the Host Response

Page 21

Conclusions

• Computationally feasible through Parallel Implementation

• No shared Memory needed

• Flexible because new kinds of interactions can be introduced

• Flexible because of different scoring functions

• Response Networks for different conditions

• Comparison with other drug or viral responses possible

Page 22

Acknowledgements

Christian V. Forst

Electra Sutton

Lawrence Cabusora

Money: Volkswagen Stiftung

For your Attention
