Human: 78 tissues (Su et al, 2004) Statistical significance P. falciparum: intra-erythrocytic...


Human: 78 tissues (Su et al, 2004)

Statistical significance

P. falciparum: intra-erythrocytic development cycle

Yeast: 78 co-expression clusters From k-mers to motifs

Statistical significance

What is FIRE?

FIRE (for Finding Informative Regulatory Elements) is a highly sensitive approach for motif discovery from expression data, based on mutual information. It has the following characteristics:

• highly sensitive, with very few false positive predictions, if any,

• applicable to any type of expression data,

• avoids the assumptions and parameter tuning often required by existing methods,

• simultaneously finds DNA and RNA motifs and explores their functional relationships,

• scales well to mammalian genomes,

• highlights the biological role of predicted motifs, their inter-species conservation, and their spatial and orientation biases,

• characterizes motif interactions and co-localizations,

• displays the results in a user-friendly graphical format.

FIRE uses mutual information to discover and characterize motifs
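
As an illustration of the statement above, here is a minimal sketch of how mutual information between motif presence and a discrete expression profile (a cluster index per gene) could be computed. It is not FIRE's code: the function names, the toy sequences, the exact-match k-mer test, and the choice of log base 2 are all assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Return I(X;Y) in bits for two equal-length sequences of discrete labels."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # P(x,y) / (P(x) P(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def motif_presence(sequences, kmer):
    """1 if the k-mer occurs in the sequence, else 0 (exact match, one strand only)."""
    return [1 if kmer in seq else 0 for seq in sequences]


# Hypothetical toy data: six upstream regions and the cluster index of each gene.
upstream = [
    "ACGCGATGAGATGAGCT",
    "TTGCGATGAGCTA",
    "AAAATTTTCCCC",
    "GGGTTTAAACC",
    "CCGATGAGCTTT",
    "TTTTAAAAGGGG",
]
clusters = [1, 1, 0, 0, 1, 0]

presence = motif_presence(upstream, "GATGAG")
mi = mutual_information(presence, clusters)
print(f"presence={presence}  I(motif; cluster)={mi:.3f} bits")
```

In this toy case the k-mer is present in exactly the genes of cluster 1, so the mutual information equals the entropy of the cluster labels (1 bit). A k-mer scattered independently of the clusters would score close to 0.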

Systematic exploration of cis-regulation using a generic computational framework

Olivier Elemento*, Noam Slonim* (*equal contribution) and Saeed Tavazoie

Lewis-Sigler Institute for Integrative Genomics, Princeton University

[Figure: types of expression data accepted by FIRE. Panels: Discrete (cluster index per gene), Continuous (log-ratio per gene), Position bias, Co-occurrence. Axis labels: 5’ upstream region, cluster index, log-ratio.]

Down-regulated / Up-regulated (Cy3/Cy5 log-ratios)

Motifs: PAC, Rpn4, Yap1, Puf3

Experiment: H2O2 treatment in ΔMsn2/ΔMsn4 background

[Figure: ~2,700 periodically expressed genes. Axis labels: Phase (-π to +π), Time (0h to 48h). Scale label: change.]

Similarity to ChIP-chip RAP1 motif (Lee et al, 2002)

Mutual information

Real mutual information value

Maximum of 10,000 expression-shuffled mutual information values
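
The comparison named in the two lines above (the real mutual information value against the maximum over 10,000 expression-shuffled values) can be sketched as follows. This is an illustrative reading of the poster, not FIRE's implementation: the function names, the fixed seed, and the toy data are assumptions, and the pass criterion (real value strictly greater than every shuffled value) is inferred from the figure labels.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Return I(X;Y) in bits for two equal-length sequences of discrete labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


def shuffle_test(presence, expression, n_shuffles=10_000, seed=0):
    """Compare the real MI with the maximum MI over shuffled expression labels.

    Returns (real_mi, max_shuffled_mi, passed), where passed is True when the
    real value is strictly greater than every shuffled value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    real = mutual_information(presence, expression)
    shuffled = list(expression)  # copy, so the caller's list is not modified
    max_shuffled = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break the gene-to-expression association
        max_shuffled = max(max_shuffled, mutual_information(presence, shuffled))
    return real, max_shuffled, real > max_shuffled


# Hypothetical toy data: 40 genes, motif present in exactly the genes of cluster 1.
presence = [1] * 20 + [0] * 20
clusters = [1] * 20 + [0] * 20

real, max_shuffled, passed = shuffle_test(presence, clusters)
print(f"real={real:.3f}  max shuffled={max_shuffled:.3f}  passed={passed}")
```

Shuffling keeps the number of genes per cluster fixed and only breaks the link between each gene and its expression label, so the shuffled values estimate how much mutual information arises by chance for a motif with the same number of occurrences.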

17 motifs in 5’ upstream regions

6 motifs in 3’UTRs

0 “motifs” when shuffling the gene labels of the clustering partition

1129 motifs when applying AlignACE (with default parameters) to each cluster independently

880 “motifs” when applying AlignACE to the same shuffled clusters as above

All 23 motifs are highly conserved with S. bayanus

> 50% of our predicted motifs have a non-random spatial distribution

$$ I(X;Y) = \sum_{x \in X} \sum_{y \in Y} P(x,y) \, \log \frac{P(x,y)}{P(x)\,P(y)} $$

Mutual information

21 motifs in 5’ upstream regions

0 motifs in 3’UTRs

0 “motifs” when shuffling the gene labels of the phase profile

71% highly conserved with P. yoelii

DNA replication, p<1e-4

plastid, p<0.01

ribosome, p<0.001

Bozdech, Llinás, et al, 2003

Motifs informative about the phase?

Yeast: single microarray

Biological insights

• Importance of RNA motifs in shaping transcriptomes (~30% of the yeast, worm, human, and Arabidopsis motifs we found are RNA motifs)

• In worm/human/mouse, several RNA motifs match miRNA targets

• “Cooperation” between DNA and RNA motifs

• Avoidance of joint-presence for certain motifs

• Under-representation of certain motifs

Practical aspects

Unix command line:

perl fire.pl --expfile=human_clusters.txt --exptype=discrete --species=human


Human gene expression atlas (clustered)

PAC and the Msn2/4 binding site tend to avoid being in the same promoters

PAC and RRPE tend to co-localize on the DNA
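
The avoidance statement above (two motifs that tend not to share a promoter) can be illustrated with the same mutual information measure applied to two motif presence profiles. This is a sketch under assumptions, not FIRE's procedure: the function names and toy profiles are hypothetical, and the direction of the association is read off by comparing the observed number of shared promoters with the number expected under independence. It covers co-occurrence and avoidance only, not spatial co-localization, which would also need motif positions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Return I(X;Y) in bits for two equal-length sequences of discrete labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


def cooccurrence(presence_a, presence_b):
    """Summarize how two motifs are distributed across the same promoters.

    Returns (mi, observed, expected, tendency), where observed is the number of
    promoters containing both motifs and expected is that number under
    independence.
    """
    n = len(presence_a)
    observed = sum(1 for a, b in zip(presence_a, presence_b) if a and b)
    expected = sum(presence_a) * sum(presence_b) / n
    mi = mutual_information(presence_a, presence_b)
    if observed > expected:
        tendency = "co-occur"
    elif observed < expected:
        tendency = "avoid"
    else:
        tendency = "independent"
    return mi, observed, expected, tendency


# Hypothetical presence profiles over eight promoters (1 = motif present).
motif_a = [1, 1, 1, 1, 0, 0, 0, 0]
motif_b = [0, 0, 0, 0, 1, 1, 1, 1]

mi, observed, expected, tendency = cooccurrence(motif_a, motif_b)
print(f"I={mi:.3f} bits  observed={observed}  expected={expected:.1f}  {tendency}")
```

Mutual information alone is high both for motifs that always appear together and for motifs that never do, which is why the sketch reports the observed and expected counts alongside it. In practice the score would also need a significance test before being interpreted.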

(data from Gasch et al, 2000)

Motif conservation with S. bayanus

The RAP1 binding site has a position and orientation bias

Motifs: PAC, RRPE, PUF4, PUF3, MSN2/4, RAP1, RPN4, REB1, MBP1, HAP4, XBP1, BAS1, CBF1, SWI4

73 motifs in 5’ upstream regions

42 motifs in 3’UTRs

0 “motifs” when shuffling the gene labels of the clustering partition

Motifs: ELK4, Sp1, miR-525/miR-526c, bZIP911, NF-Y, E2F1, miR-200b/miR-429, TCF11-MafG, Pax2, E2F, CHOP-C/EBPα, TCF11-MafG

(data from Su et al, 2004)

Several 3’UTR motifs match the 5’ extremity of microRNAs

(Data from Bozdech, Llinás, et al, 2003)
