NetBioSIG2012 eriksonnhammer

Post on 10-May-2015


Erik Sonnhammer

Stockholm Bioinformatics Centre

Science for Life Laboratory

Dept. Biochemistry and Biophysics

Stockholm University

Comparative Interactomics with FunCoup 2.0

How to map the human interactome?

• Genes: ~22000

• Interactions: 100000-300000?

• Known direct interactions: ~74000 (IntAct)

• Experiments have high false negative and false positive rates.

• → Most interactions need to be inferred combinatorially

FunCoup: Predicting Functional Coupling Between Genes/Proteins Using Genomics Data and Orthology

• Alexeyenko et al., NAR 40:D821 (2012)

• Alexeyenko & Sonnhammer, Genome Research 19:1107 (2009)

FunCoup integrates:

• Protein-protein interactions

• Co-expression patterns

• Phylogenetic profiles

• Domain interactions

• Shared transcription factor binding

• Orthology (other organisms)

• Shared miRNA targeting

• Subcellular co-localisation

• Genetic interactions

Naïve Bayesian training

Continuous variable

Discrete categories

Extract links

Test against positive and "negative" reference datasets

Calculate enrichment as likelihood ratio = P(+) / P(-)
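The training steps listed on this slide can be sketched as follows. This is a minimal illustration, not the FunCoup implementation: the function names, the bin edges, the pseudocount and the use of the natural logarithm are all assumptions made for the example.

```python
import math
from bisect import bisect_right

def bin_index(value, edges):
    """Map a continuous score to a discrete bin, given sorted inner bin edges."""
    return bisect_right(edges, value)

def bin_llrs(pos_scores, neg_scores, edges, pseudocount=1.0):
    """Per-bin log likelihood ratio log(P(bin | +) / P(bin | -)).

    pos_scores / neg_scores are raw scores of gene pairs found in the
    positive and "negative" reference sets. The pseudocount (an assumption
    of this sketch) keeps empty bins from producing a division by zero.
    """
    n_bins = len(edges) + 1
    pos_counts = [pseudocount] * n_bins
    neg_counts = [pseudocount] * n_bins
    for s in pos_scores:
        pos_counts[bin_index(s, edges)] += 1
    for s in neg_scores:
        neg_counts[bin_index(s, edges)] += 1
    pos_total = sum(pos_counts)
    neg_total = sum(neg_counts)
    return [
        math.log((p / pos_total) / (n / neg_total))
        for p, n in zip(pos_counts, neg_counts)
    ]

# Hypothetical co-expression correlations, split into three bins.
llrs = bin_llrs(
    pos_scores=[0.7, 0.8, 0.9, 0.1],
    neg_scores=[-0.5, -0.2, 0.1, 0.3],
    edges=[0.0, 0.6],
)
print(llrs)
```

Bins where positive pairs are over-represented get a positive score, and bins dominated by negative pairs get a negative one.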


FunCoup prediction of 1 link

• For each evidence dataset: Raw data → Bayesian LLR score

• Sum of LLR scores

• Confidence value pfc

Naïve Bayesian training

• Training:

– Learn log likelihood ratios (LLRs) for each individual evidence bin

– When predicting, sum all the LLRs to a full Bayesian score (FBS).

$$FBS(\varepsilon) = \sum_{i=1}^{|\varepsilon|} \log \frac{P(E_{ij} \mid FC)}{P(E_{ij})}$$

FC: Functional coupling

ε: Set of evidences

Eij: Evidence i, bin j
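Summing the LLRs into an FBS amounts to one table lookup per evidence type followed by an addition. The sketch below assumes a hypothetical table layout (`LLR_TABLES`, with made-up bin edges and LLR values); only the idea of summing one LLR per evidence comes from the slide.

```python
from bisect import bisect_right

# Hypothetical trained tables: inner bin edges and one LLR per bin.
LLR_TABLES = {
    "MEX": {"edges": [0.0, 0.5], "llrs": [-0.4, 0.2, 1.2]},
    "PPI": {"edges": [0.1], "llrs": [0.0, 6.3]},
}

def full_bayesian_score(raw_scores, tables):
    """Sum the LLR of the bin that each raw evidence score falls into."""
    total = 0.0
    for evidence, value in raw_scores.items():
        table = tables[evidence]
        j = bisect_right(table["edges"], value)  # bin j for evidence i
        total += table["llrs"][j]
    return total

# A gene pair with a co-expression score of 0.6 and a PPI score of 0.17.
print(full_bayesian_score({"MEX": 0.6, "PPI": 0.17}, LLR_TABLES))
```

Evidence types with no data for a gene pair are simply absent from `raw_scores` and contribute nothing to the sum.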

4 training datasets → 4 different types of functional coupling

• Metabolic pathway (KEGG)

• Signalling pathway (KEGG)

• Physical protein-protein interaction

• Complex member

FunCoup training

[Figure with three panels. INPUT DATA: amount of data per species (Human, Mouse, Rat, Fly, Worm, Yeast, Plant) and evidence type (MEX, MIR, SCL, PPI, PEX, PHP, TFB, DOM), axis from 10^3 to 10^7. TRAINING SETS: size per species for FC-PI, FC-CM, FC-ML and FC-SL, axis from 5000 to 25000. BAYESIAN FRAMEWORK.]

Raw data metrics on CDC2 – KPNB1:

• Fly MEX (Li and White, 2003) PLC=0.42

• Rat MEX (Di Giovanni et al., 2004) PLC=0.48

• Mouse SCL (UniProt, ESLDB) WMI=0.04

• Mouse MEX (Zapala et al., 2005) PLC=0.70

• Mouse MEX (Su et al., 2004) PLC=-0.01

• Mouse MEX (Siddiqui et al., 2005) PLC=0.56

• Mouse MEX (Hutton et al., 2004) PLC=0.61

• Human PPI (IntAct, HPRD, BIND) PPI score=0.17

• Human MEX (Su et al., 2004) PLC=0.60

• …

Models: FC-SL, FC-ML, FC-PI, FC-CM (pfc scores)

FC-PI model: FBSPI = 0+0-0.6+1.2-0.4+0.2+1.2+6.3+1.4… = 11.2

Other sums shown on the slide:

ΣSL = 0+0-0.6+1.2-0.4+0.2+1.2+6.8+1.4 = 7.9

ΣSL = 0+0-0.6+1.2-0.4+0.2+1.2+6.8+1.4 = 5.8

ΣSL = 0+0-0.6+1.2-0.4+0.2+1.2+6.8+1.4 = 5.5

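FBS score and pfc confidence

$$pfc(\varepsilon) = \frac{P(FC)\prod_{i=1}^{|\varepsilon|} P(E_{ij} \mid FC)}{P(FC)\prod_{i=1}^{|\varepsilon|} P(E_{ij} \mid FC) + \prod_{i=1}^{|\varepsilon|} P(E_{ij})}$$

$$FBS(\varepsilon) = \sum_{i=1}^{|\varepsilon|} \log \frac{P(E_{ij} \mid FC)}{P(E_{ij})}$$

FC: Functional coupling

ε: Set of evidences

Eij: Evidence i, bin j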
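If the logarithm in the FBS is taken to be natural (an assumption), dividing the numerator and denominator of the pfc expression by the product of the P(Eij) terms gives pfc = P(FC)·exp(FBS) / (P(FC)·exp(FBS) + 1). The sketch below applies that rearrangement. The prior value used here is hypothetical and is not taken from the slides.

```python
import math

def pfc_from_fbs(fbs, prior_fc):
    """Convert a full Bayesian score into a pfc confidence value.

    Assumes FBS was computed with natural logarithms, so that
    exp(FBS) equals the product of P(Eij | FC) / P(Eij) over all evidences.
    """
    weighted = prior_fc * math.exp(fbs)
    return weighted / (weighted + 1.0)

# FBS total quoted on the slide for the FC-PI model; the prior is made up.
HYPOTHETICAL_PRIOR = 0.001
print(pfc_from_fbs(11.2, HYPOTHETICAL_PRIOR))
```

The mapping is monotonic, so ranking links by FBS or by pfc gives the same order within one model. The prior only shifts where a given FBS lands on the 0 to 1 scale.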

The total human FunCoup 2.0 network

[Figure: Nr of links (axis from 0 to 5,000,000) at confidence cutoffs 0.1, 0.25 and 0.75.]

Nr of links at pfc cutoffs

[Figure: # links (axis from 0 to 10,000,000) against pfc cutoff (0 to 0.95) for H. sapiens, M. musculus, R. norvegicus, C. familiaris, D. rerio, C. intestinalis, D. melanogaster, C. elegans, G. gallus and A. thaliana.]

Comparison to STRING

• FunCoup on average 75% larger (based on all links)

[Figure: Nr of links per species in FunCoup 2.0 and STRING 9.0 (axis from 0 to 5,000,000) for A. thaliana, C. elegans, C. familiaris, C. intestinalis, D. melanogaster, D. rerio, G. gallus, H. sapiens, M. musculus, R. norvegicus and S. cerevisiae.]

Support from species and evidence type

MEX: mRNA co-expression

PHP: phylogenetic profile similarity

PPI: protein–protein interaction

SCL: sub-cellular co-localization

MIR: co-miRNA regulation by shared miRNA targeting

DOM: domain interactions

PEX: protein co-expression

TFB: shared transcription factor binding

GIN: genetic interaction profile similarity

Validation: Recovering cancer pathways

• 36 signalling links in RTK/RAS/PI(3)K, p53, and RB signalling pathways (TCGARN, Science 2008).

• FunCoup predicted 29 of 36 links.

• 25 more links found.

Independent validation: Recovering tumour mutation sets

• Lists of genes co-mutated in glioblastoma tumours (The Cancer Genome Atlas).

• 6 of 9 lists (≥ 10 genes) enriched (p < 10⁻³) with internal FunCoup connections compared to random networks (preserving degree distribution).
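One common way to run this kind of test is to count the links inside a gene list and compare that count with the same count in degree-preserving randomised networks. The sketch below uses double edge swaps for the randomisation. It is an illustration under assumptions, not the procedure used in the talk: the function names, the number of swaps and the empirical p-value estimate are all choices made for the example.

```python
import random

def internal_links(edges, genes):
    """Count links whose two endpoints are both in the gene list."""
    genes = set(genes)
    return sum(1 for a, b in edges if a in genes and b in genes)

def rewire(edges, n_swaps, rng):
    """Randomise a network by double edge swaps, keeping every node's degree."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # try both orientations of the second link
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # would create a duplicate link
        present.discard(frozenset((a, b)))
        present.discard(frozenset((c, d)))
        present.update((new1, new2))
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def enrichment_p(edges, genes, n_random=1000, seed=0):
    """Empirical p-value: fraction of random networks with at least as many internal links."""
    rng = random.Random(seed)
    observed = internal_links(edges, genes)
    hits = 0
    for _ in range(n_random):
        shuffled = rewire(edges, 10 * len(edges), rng)
        if internal_links(shuffled, genes) >= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)
```

With this estimator, resolving p < 10⁻³ requires more than 1000 randomised networks, since the smallest reportable value is 1 / (n_random + 1).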

FunCoup applications

• Cross-talk between groups

• Find novel interactions

• Find network modules

• Extend pathways

• Find novel disease genes

http://FunCoup.sbc.su.se

ASPM - Abnormal spindle-like microcephaly-associated protein


Data details

Klammer M, Roopra S, Sonnhammer EL. "jSquid: a Java applet for graphical on-line network exploration". Bioinformatics 24:1467 (2008)

Comparative interactomics

New in FunCoup 2.0 – ensures true conservation

Human presenilin in worm

RNA-polymerase II subunits: yeast-all

Comparative interactomics: Applications

• Hypothesis testing

– Is a given pathway/complex conserved in another species?

• New discoveries

– Finding ortholog pairs with conserved functional coupling – very strong evidence for functional conservation

– Can also find conservation that is not strictly 4-way: