
Use and Complexity of existing RNA-tools

M. Marz

University of Leipzig

Tianjin, China, 09.11.2009

M. Marz (University of Leipzig) Use and Complexity of existing RNA-tools Tianjin, China 09.11.2009 1 / 30


Evolution of most important ncRNAs in biological networks

[Figure: tree of life rooted at LUCA, annotated with the ncRNA families discussed in the talk.

Clade labels: Proteobacteria, Chlamydia, Actinobacteria, Cyanobacteria, Firmicutes; Nanoarchaeota, Crenarchaeota, Euryarchaeota; Choanoflagellata, Animalia (Vertebrata, Urochordata, Cephalochordata, Echinodermata, Hemichordata, Nematoda, Arthropoda, Platyhelminthes, Annelida, Mollusca, Cnidaria, Porifera); Fungi (Taphrinomycotina, Saccharomycotina, Pezizomycotina, Basidiomycota, Glomeromycota, Chytridiomycota, Microsporidia); Amoebozoa; Plantae (Angiosperms, Coniferales, Bryophyta, Charales, Chlorophyta); Rhodophyta, Heterokonta, Apicomplexa, Ciliates, Kinetoplastida, Euglenozoa, Metamonada.

ncRNA labels: tRNA, rRNA, RNase P, SRP, tmRNA, 6S, Yfr1, RNase MRP, snoRNAs, major snRNAs, minor snRNAs, SL RNA (?), telomerase RNA, RNAi, microRNA mechanism, miRNAs (several lineages), U7, 7SK, Y RNA, vault, gRNAs, SmY.]

Protein-coding Genes

[Figure: expression of a protein-coding gene in three stages (Transcription, Processing, Translation). In the nucleus, DNA on the chromosome (histones, enhancer, TATA box) is transcribed by Pol II into capped (CAP) and polyadenylated (AAA) pre-mRNA with exons and introns. The spliceosome (U1, U2, U4, U5, U6; SL) processes it into mRNA, which is exported to the cytoplasm and translated by the ribosome (rRNA, tRNA) into protein (residues labelled AS). Further ncRNA labels: 7SK, U7, miRNA.]

Non-(protein)-coding Genes

[Figure: expression of a non-protein-coding gene in two stages (Transcription, Processing). In the nucleus, the chromosome (histones; optional enhancer and TATA box) is transcribed by Pol II or Pol III into pre-ncRNA, optionally capped (CAP) and polyadenylated (AAA), which is processed into the mature ncRNA. The label "action" appears four times, marking where the ncRNA functions in nucleus and cytoplasm.]


Programs for Homology Search

How to choose from 86 programs?


Pipeline for Homology Search

[Figure: flowchart of the homology-search pipeline. The yes/no decision edges are not recoverable from the extraction; the groupings below are inferred from label proximity.]

Genome-wide ncRNA search

General homology search (known RNA needed): sequence-conserved search, structure-conserved search, pattern and structure search. Tools: Blast, GotohScan, Infernal, RNAmotif, Hypa, rnabob, fragrep, RNAfold -C

Specific programs (no RNA input): tRNAscan-SE, SRP-scan, Bcheck, RNAmicro, snoReport

Synteny conservation: Ensembl Compara, Biofuice, Synblast

Decision points: Maybe absent? Multiple copies?

Remove duplicates (pseudogenes, assembly copies): blastclust (upstream/downstream)

Promoter search: rnabob (known promoter), MEME (unknown promoter)

Multiple alignment: Clustalw, Locarnate

Manual analysis: Ralee mode, RNAsubopt, RNAduplex

Target prediction: Snoplex, RNAup, RNAduplex (RIP)


BLAST [1]

Sequence-based local alignments (blastn, blastp, blastx, tblastn, PSI-BLAST)

Index-based databases (NCBI, Rfam, Noncode, ...)

Heuristic Smith-Waterman algorithm

$$
F_{i,j} = \max
\begin{cases}
0, \\
F_{i-1,j-1} + \sigma(p_i, q_j), \\
F_{i-1,j} - d, \\
F_{i,j-1} - d
\end{cases}
$$

Seed: 11 nt (blastn), 28 nt (megablast), 3 aa (other programs)

Insertions/deletions: constant costs per nucleotide/amino acid

[1] Altschul et al. (1990)
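The Smith-Waterman recurrence on the BLAST slide can be written out directly. The following is a minimal sketch of the exact (non-heuristic) dynamic program, not of BLAST itself, which only approximates it by extending seeds. The function name and the scoring values (`match`, `mismatch`, `d`) are assumptions chosen for illustration.

```python
def smith_waterman(p, q, match=1, mismatch=-1, d=2):
    """Best local alignment score of p and q with a constant gap cost d.

    Implements F[i][j] = max(0, F[i-1][j-1] + sigma, F[i-1][j] - d, F[i][j-1] - d).
    The scoring values are illustrative assumptions, not BLAST defaults.
    """
    def sigma(a, b):
        return match if a == b else mismatch

    n, m = len(p), len(q)
    # Row 0 and column 0 stay at 0: a local alignment may start anywhere.
    F = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                0,
                F[i - 1][j - 1] + sigma(p[i - 1], q[j - 1]),
                F[i - 1][j] - d,
                F[i][j - 1] - d,
            )
            best = max(best, F[i][j])
    return best
```

The table has (n+1) × (m+1) cells, which is why the full algorithm is too slow for database-scale search and BLAST restricts it to the neighbourhood of seed hits.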


GotohScan [2]

Full dynamic programming approach

Semi-global alignment

Affine gap costs for long insertions/deletions

$$
\begin{aligned}
D_{i,j} &= \max\{\, S_{i-1,j} + \gamma_o,\; D_{i-1,j} + \gamma_e \,\} \\
F_{i,j} &= \max\{\, S_{i,j-1} + \gamma_o,\; F_{i,j-1} + \gamma_e \,\} \\
S_{i,j} &= \max\{\, D_{i,j},\; F_{i,j},\; S_{i-1,j-1} + \sigma(p_i, q_j) \,\}
\end{aligned}
$$

[Figure: three histograms of log(# alignments) against alignment score, for U4atac, U17 snoRNA, and RNase MRP.]

Slow: O(n × m) time and memory

[2] Hertel et al. (2009)
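The three GotohScan recurrences (D, F, S) translate into a short dynamic program. This is a sketch under stated assumptions, not the GotohScan implementation: the boundary conditions (free end gaps in the target `q`, full-length query `p`) are one common reading of "semi-global", and the function name and scoring values are hypothetical.

```python
NEG_INF = float("-inf")

def gotoh_semiglobal(p, q, match=2, mismatch=-1, gamma_o=-4, gamma_e=-1):
    """Score of the best alignment of the whole query p to any stretch of q.

    gamma_o is added for the first gap position, gamma_e for each further one
    (both negative). All scoring values are illustrative assumptions.
    """
    n, m = len(p), len(q)
    S = [[0] * (m + 1) for _ in range(n + 1)]        # best score ending at (i, j)
    D = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # ends with p[i] against a gap
    F = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # ends with q[j] against a gap

    # Column 0: the query prefix can only be aligned to gaps.
    for i in range(1, n + 1):
        D[i][0] = max(S[i - 1][0] + gamma_o, D[i - 1][0] + gamma_e)
        S[i][0] = D[i][0]
    # Row 0 stays 0: the alignment may start anywhere in q at no cost.

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = max(S[i - 1][j] + gamma_o, D[i - 1][j] + gamma_e)
            F[i][j] = max(S[i][j - 1] + gamma_o, F[i][j - 1] + gamma_e)
            sub = match if p[i - 1] == q[j - 1] else mismatch
            S[i][j] = max(D[i][j], F[i][j], S[i - 1][j - 1] + sub)

    # The alignment may end anywhere in q, so take the best cell of the last row.
    return max(S[n])
```

The three full matrices make the O(n × m) time and memory cost stated on the slide explicit. With these example scores, a three-nucleotide insertion costs -4 - 1 - 1 = -6 rather than three times the opening penalty, which is the point of affine gap costs.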


Genomic Context

Genome browsers (Ensembl, UCSC, FlyBase, WormBase, ...)

[Figure: Ensembl Homo sapiens version 53.36o (NCBI36), chromosome 17: 1,531,851 - 2,531,850 (band p13.3, 1.00 Mb, forward strand). Tracks: contigs (AC130689.8, AC090617.16, AC015799.23); Ensembl/Havana genes (TLCD2, C17orf91, RTN4RL1, DPH1, HIC1, SMG6, TSR1, SGSM2, METT10D, PAFAH1B1, PRPF8, WDR81, SERPINF2, SERPINF1, SMYD4, RPA1, OVCA2, SRR, MNT); ncRNA genes (hsa-mir-22, hsa-mir-132, hsa-mir-212, SNORD91, SRP_euk_arch).]

Information from closely related species

Alignment: ClustalW [3] / ClustalX

Synblast [4], other special synteny programs

[Figure: synteny example relating loci labelled cel 10, cel 11, cel 13 to cre 12, cre 13, cre 23, cre 27, cre 29.]

[3] Thompson et al. (1994)

[4] Lehmann et al. (2008)


Flanking region/Promoter and TFBS search

Motif search: rnabob [5]

New motif: MEME [6]

Enhancer elements: Tracker [7]

[Figure: schematic with elements labelled A, B, C, D, E.]

TFBS: Transfac (commercial)

Polymerase II/III transcript

Identification of assembly artefacts

Prediction of pseudogenes (!)

[5] Eddy (1992)

[6] Bailey & Elkan (1994)

[7] Prohaska, in prep.


Sequence vs. Structure

Example: U12 snRNA of C. capitata and X. tropicalis (nt 25-78)

[Figure: three secondary-structure drawings. C. capitata U12 snRNA and X. tropicalis U12 snRNA, each folded with RNAfold [a]; alignment of C. capitata and X. tropicalis U12 snRNA, folded with RNAalifold [b].]

[a] Hofacker (2003)

[b] Hofacker (2003)


Precast Specific RNA Finders

tRNAscan-SE [8]

BRUCE [9] (tmRNAs)

Bcheck [10] (RNase P)

SRPRNA [11]

No query needed

Usually for whole genomes

[8] Lowe & Eddy (1997)

[9] Laslett et al. (2002)

[10] Yusuf et al. (in prep.)

[11] Regalia et al. (2002)


Structure-Based Search Programs

Erpin [12]

Infernal [13]

[Figure: Infernal alignment output for the U3 snoRNA model, bit score 123.10 [6,218], with the structure annotation line, model consensus, match line, and target sequence. Caption: Trichoplax adhaerens U3 snoRNA, bitscore 123.10.]

Query dependent

No information about structure as input

[12] Gautheret & Lambert (2001)

[13] Nawrocki et al. (2009)


Support Vector Machines: SnoReport [14]

MFE

z-score

GC-content

Box scores and distances

Stems and lengths

Loops and lengths

[Figure: SnoReport workflow. Input: single sequences. 1. find and score motifs (continue if score > threshold); 2. truncate sequence; 3. create constraint; fold; 4. check structure (otherwise reject); 5. extract features (mfe, z-score, GC-content; box scores + distances; stem and loop length(s)); SVM; 6. if (P > 0.5): putative CD / HACA snoRNA. Model: positive set (+) CD / HACA snoRNAs; negative set (−) other ncRNAs, HACA / CD snoRNAs. HACA: SE = 78%, SP = 89%. CD: SE = 87%, SP = 95%.]

Input: single sequences; whole genomes possible

[14] Hertel & Stadler (2008)
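Several of the SVM features named on the SnoReport slide (GC-content, stem lengths, loop lengths) can be read off a sequence and its dot-bracket structure. The sketch below illustrates that kind of feature extraction only; it is not SnoReport's code, the function names are hypothetical, and MFE and z-score are left out because they need a folding program such as RNAfold.

```python
def gc_content(seq):
    """Fraction of G and C nucleotides in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def pair_table(structure):
    """Map each paired position of a dot-bracket string to its partner."""
    stack, pairs = [], {}
    for i, char in enumerate(structure):
        if char == "(":
            stack.append(i)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise ValueError("unbalanced structure")
    return pairs

def hairpins(structure):
    """List (stem_length, loop_length) for every hairpin, in 5' to 3' order.

    The stem length counts directly stacked pairs outward from the pair
    that closes the loop and stops at the first bulge or interior loop.
    """
    pairs = pair_table(structure)
    found = []
    for i in sorted(pairs):
        j = pairs[i]
        if i > j:
            continue
        # A hairpin-closing pair encloses only unpaired positions.
        if any(k in pairs for k in range(i + 1, j)):
            continue
        stem, a, b = 1, i, j
        while pairs.get(a - 1) == b + 1:
            a, b, stem = a - 1, b + 1, stem + 1
        found.append((stem, j - i - 1))
    return found
```

For example, `hairpins("((..))..(((....)))")` yields one hairpin with a 2-pair stem and 2-nt loop and one with a 3-pair stem and 4-nt loop. Values like these would be combined into a feature vector for the classifier.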


Support Vector Machines: RNAmicro [15]

MFE

z-score

GC-content

Best 23 nt block

Stems and lengths

Loops and lengths

[Figure: RNAmicro workflow. Input: aligned sequences. alifold; 1. check structure (otherwise reject); 2. extract features; SVM; 3. if (P > 0.5): putative miRNA precursor. Model: positive set (+) miRNA alignments; negative set (−) other ncRNA alignments, shuffled miRNA alignments. SE = 84%, SP = 99%.]

Input: multiple sequence alignments; multiple genomes also possible

[15] Hertel et al. (2006)

Support Vector Machines: RNAz16

SCI

Meanwise pairwise identity

Number of sequences

Average z-Score

16Washietl et al. (2005)M. Marz (University of Leipzig) Use and Complexity of existing RNA-tools Tianjin, China 09.11.2009 16 / 30

Page 43: Use and Complexity of existing RNA-toolsmanja/talks/0911tianjin.pdf · Use and Complexity of existing RNA-tools M. Marz University of Leipzig Tianjin, China 09.11.2009 M. Marz (University

Support Vector Machines: RNAz16

SCI

Meanwise pairwise identity

Number of sequences

Average z-Score

u3sc01 sc03 mir5StRNA 1384 1249

Ciona intestinalis – known and new predicted ncRNAs by RNAz.

16Washietl et al. (2005)M. Marz (University of Leipzig) Use and Complexity of existing RNA-tools Tianjin, China 09.11.2009 16 / 30

Support Vector Machines: RNAz 16

SCI (structure conservation index)
Mean pairwise identity
Number of sequences
Average z-score

[Figure: Ciona intestinalis, known and newly predicted ncRNAs by RNAz. Labels: u3, sc01, sc03, mir, 5S, tRNA; counts: 1384, 1249]

Mainly alignment dependent
Many false positives

16 Washietl et al. (2005)
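Two of the four RNAz features can be computed directly from an alignment and a few folding energies. The sketch below assumes the usual definition of the SCI as the consensus folding energy divided by the mean single-sequence MFE, and it assumes that columns where both sequences are gapped are ignored for the identity; the energies must be supplied by the caller (for example from a folding program), and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Percent identical columns, ignoring columns where both rows are gaps."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_pairwise_identity(alignment):
    """Average of pairwise_identity over all pairs of rows (the MPI)."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


def structure_conservation_index(consensus_energy, single_mfes):
    """SCI = consensus folding energy / mean of the single-sequence MFEs."""
    mean_single = sum(single_mfes) / len(single_mfes)
    if mean_single == 0:
        raise ValueError("mean single-sequence MFE is zero")
    return consensus_energy / mean_single
```

An SCI near 0 indicates that no common structure is found, while values near 1 indicate that the consensus fold is about as stable as the individual folds.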

[Figure: clustering of candidate loci (ci_ identifiers, many marked ***; ci_555970 labelled 5S, ci_557837 labelled sc19), scale bar 0.1, with annotated entries mir-7 candidate, mir-126 candidate, let-7, mir-124-b and mir-124-a. Consensus secondary structure plots (alidot.ps) are shown for seven clusters:]

cluster107  N=12  MPI=21.10  SCI=0.29
cluster115  N=9   MPI=42.30  SCI=0.71
cluster127  N=13  MPI=21.34  SCI=0.18
cluster134  N=8   MPI=22.71  SCI=0.39
cluster139  N=6   MPI=25.90  SCI=0.34
cluster144  N=4   MPI=28.11  SCI=0.87
cluster152  N=6   MPI=26.40  SCI=0.42


Hand made secondary structures

rnabob 17, Fragrep 18, RNAmotif 19, Vienna RNA Package

Tetrapoda       CCCTCCCGAAGCTGCGC----------GCTCGG-TCG
Teleostei       CCCTCCCGAAGCYCRGC----------GCTCGG-TGG
Mustelus        CCCTCCCGAAGCTCAGC----------GCTCGG-TCG
Lampetra        CCCTCCCGATGCTCTGC----------GCTCGG-TGG
Myxine          CCTCGCCGATGCCCCGC----------GCTCGGATCG
Branchiostoma   CTCTCCCGACGCCTCGC----------GCTCGG-TCG
Ciona intest.   ---TCCCGATGCTTGCG---------CGCTCGG-TTG
Ciona savignyi  ----CCCGATGCCATGC----------GCTCGG-TCG
Saccoglossus    CTCTCCCGATGCTTAGC----------GCTCGG-TCG
Lottia          -TCTCCCGCTGCCTCGTC---------GCACGG-TAG
Helix           ---TCCCGCTGCACCCCCGGGGA---CGCACGG-TCG
Aplysia         AGCTCTCGATGCACTGGCGGGTC----GCACGG-TCG
Capitella       AGGCGCCGATGCACCCGTCGAGGGCCCGCTCGG-CCG
Helobdella      GCAACGGCATGCACTTCCACCTGTC--GCTGGC-CAG
STRUCTURE       -----<<<<-<<--------------->>>>>>----

Mammalia         TCCAAATGAGGCGCTGC-ATGTG-GCAGTCTGCCTTTCTTT
Gallus           TCCAAGTGAGGCACTGC-ATGGG-GCAGTCTGCCATTGTTT
Anolis           TCCAAGTCAGGCGCTGC-ACGGG-GCAGTCTGCCATTCTTT
Xenopus          TCCAAGTGTGGCGCTGC-ATGTG-GCAGTGTGCCTTTCTTT
Oryzias          TCCAACTGCGGCGCTGC-ACGTG-GCAGTCTGCCTTCCTTT
Gasterosteus     TCCAAATGAGGCGCTGC-ACGTG-GCAGTCTGCCTTCCTTT
Fugu             TCCAATTGCGGCGCTGC-ACGTG-GCAGTCTGCCTTACTTT
Tetraodon        TCCAATTGCGGCGCTGC-ACGTG-GCAGTCTGCCTTCCTTT
Danio            TCCAAATGAGGCACTGC-ATGTG-GCAGTCTGCCTTTCTTT
Gadus            TCCAAATGAGGCGCTGC-ACGTG-GCAGTCTGCCGTAATTT
Mustelus         TCCAAGTCAGGCACTGC-ACGTG-GCAGTCTGCCGTTCTTT
Lampetra         TCCAGATC-GGCACTGC-ACGTG-GCAGTCTGCCTGT-TTT
Petromyzon       TCCAGATC-GGCGCTGC-ACGTG-GCAGTTCGCCTGT-TTT
Myxine           TCCAAC-ACGGCGCTGC-ACGTG-GCAGTTTGCCTT--GTT
Ciona_int        TCCATA-TAGGCACTGC-ACGGG-GCAGTATGCCTTCATTT
Ciona_sav        TCCATA-TAGGCACTGC-ACGGG-GCAGTATGCCTTCATTT
Branchiostoma_l  TCCAAT-ACGGCGCTGCCACGCGGGCAGCCTGCCAT---TT
Branchiostoma_f  TCCAAT-ACGGCGCTGCCACGCAGGCGGCCTGCCATT-TTT
Saccoglossus     TCCATC-ATGGCGCTGCCTTG-GGGTAGCTTGCCTTCACTT
Lottia           TCCAAT-ACGGCACTAC-AAGTG-GTAGTTTGCCTTCCTTT
Helix            TCCATTGGAGGCATTAC-ACGTG-GTAATCTGCCTTTCTTT
Capitella        TCCACA-CTGGCACCGC-ATGTG-GTGGTATGCCATTGTTT
STRUCTURE        ---------<<<<<<<<<----->>>>>>-->>>-------

[Figure, panels (a) to (e): consensus structure drawings with 5' stem, stem A, stem B (Vertebrata only) and 3' stem; schematic on a position axis from −100 to 300 comparing Vertebrata, basal Deuterostomes and Lophotrochozoa, each with PSE and TATA elements]

Structure information necessary

17 Eddy (1992); 18 Mosig et al. (2006); 19 Macke et al. (2001)
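The STRUCTURE rows in the alignments above can be turned into explicit base pairs and checked against each sequence. The following is a minimal sketch that assumes `<` and `>` are nested like brackets and that G-U wobble pairs count as valid; the function names are hypothetical, and this is not how rnabob, Fragrep or RNAmotif are implemented.

```python
# Canonical pairs, including the G-U wobble pair (an assumption of this sketch).
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Turn a '<' / '>' annotation into a sorted list of (i, j) column pairs."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "<":
            stack.append(j)
        elif symbol == ">":
            if not stack:
                raise ValueError("unbalanced structure annotation")
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure annotation")
    return sorted(pairs)


def supported_pairs(sequence, pairs):
    """Count annotated pairs that form a canonical pair in this aligned row."""
    rna = sequence.upper().replace("T", "U")
    return sum((rna[i], rna[j]) in CANONICAL for i, j in pairs)


# First alignment block from the slide: STRUCTURE row and the Tetrapoda row.
STRUCTURE = "-----<<<<-<<--------------->>>>>>----"
TETRAPODA = "CCCTCCCGAAGCTGCGC----------GCTCGG-TCG"

pairs = parse_pairs(STRUCTURE)
print(len(pairs), supported_pairs(TETRAPODA, pairs))
```

Counting supported pairs per row is one simple way to see which sequences agree with a hand-made structure before it is converted into a search descriptor.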


7SK RNA

[Figure: 7SK RNA secondary structure drawings, labelled old and new, with elements M1, M2a, M2b, M2c, M3, M4, M5, M6, M7 and M8 and expansion domains; accompanying table with columns Element, Vert., Deut., Ins., Meta. and rows M1 to M8]

The 7SK-Automaton

[Figure: automaton with states 1 to 10. Transitions are labelled GAUC, "M5" and polyU; GAUC transitions carry the distance conditions d=7-30nt or (d=1-6nt || d>30nt). The outcomes are "7SK?" and "no 7SK".]
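The exact transition order of the automaton cannot be recovered from the slide text, so the sketch below covers only one ingredient: locating GAUC motifs and testing whether the distance d to the next occurrence falls inside the 7-30 nt window. It assumes that d counts the nucleotides between the end of one motif and the start of the next, which the slide does not define; the function names are hypothetical.

```python
def motif_positions(sequence, motif="GAUC"):
    """Start indices of every occurrence of `motif` (DNA input is read as RNA)."""
    rna = sequence.upper().replace("T", "U")
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)
    return positions


def classify_gaps(sequence, motif="GAUC", low=7, high=30):
    """For consecutive motif hits, return a list of (d, in_window) tuples.

    d is the number of nucleotides between the end of one hit and the start
    of the next (assumed definition). in_window is True for low <= d <= high,
    and False otherwise (the d=1-6nt || d>30nt branch on the slide).
    """
    hits = motif_positions(sequence, motif)
    gaps = []
    for left, right in zip(hits, hits[1:]):
        d = right - (left + len(motif))
        gaps.append((d, low <= d <= high))
    return gaps
```

A full scanner would feed these classifications, together with "M5" and polyU matches, into the state transitions shown in the figure.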

7SK RNA in Caenorhabditis

[Figure: secondary structure drawing with elements M1, M2a, M2b, M2c, M3 and M5; a second panel with lanes 1 and 2 labelled Ce and Hs, size markers at 163, 143 and 116 nt, and bands labelled U1, U4 and U5]


RNA:RNA interaction

snoRNA targets: snoplex a, snoScan b, SnoGPS c
a Tafer et al. (in prep); b Lowe & Eddy (1999); c Schattner et al. (2005)

miRNA targets: PicTar a, RNAhybrid b, miRanda c, ...
a Krek et al. (2005); b Rehmsmeier et al. (2004); c Betel et al. (2008)

Generally: RNAup 20, RNAduplex 21, RNAcofold 21, RIP

[Figure: base-pairing interaction between U4 and U6]

20 Muckstein et al. (2006); 21 Hofacker (2003)
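As a rough illustration of what an RNA:RNA interaction search looks for, the sketch below finds the longest stretch of perfect antiparallel Watson-Crick complementarity between two RNAs. It is a deliberately naive stand-in: it uses no energy model, no G-U pairs, no bulges and no accessibility, all of which the tools listed above take into account in their own ways; the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA (DNA input is read as RNA; only ACGU/T allowed)."""
    return "".join(COMPLEMENT[b] for b in reversed(rna.upper().replace("T", "U")))


def longest_complementary_stretch(a, b):
    """Return (length, start_in_a, start_in_b) of the longest perfect duplex.

    The stretch a[start_in_a : start_in_a + length] pairs antiparallel with
    b[start_in_b : start_in_b + length]. Implemented as a longest common
    substring search between `a` and the reverse complement of `b`.
    """
    a = a.upper().replace("T", "U")
    rc_b = reverse_complement(b)
    best = (0, 0, 0)
    previous = [0] * (len(rc_b) + 1)
    for i, x in enumerate(a, start=1):
        current = [0] * (len(rc_b) + 1)
        for k, y in enumerate(rc_b, start=1):
            if x == y:
                current[k] = previous[k - 1] + 1
                if current[k] > best[0]:
                    length = current[k]
                    # Map the end position in rc_b back to a start position in b.
                    best = (length, i - length, len(rc_b) - k)
        previous = current
    return best
```

The search runs in time proportional to the product of the two sequence lengths, which is fine for short RNAs but not a substitute for thermodynamic interaction prediction.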


ncRNA challenges in silico

Target prediction

Secondary structure prediction of highly divergent ncRNAs

Pseudoknot prediction

3-dimensional structure prediction

RNA:Protein interaction



Target Prediction: U7 RNA

processing of the 3' end of histone mRNAs

smallest RNA polymerase II transcript known to date: 57-70 nt

one stem only, many highly conserved sequences (Sm, HDE-rev-comp)

[Figure: U7 and histone H3 sequences with 5' and 3' ends marked; labelled elements: Sm, HDE, stem loop; a 33nt annotation]

Target Prediction: SL-SmY System in Nematodes?


Complex Secondary Structure Prediction: 7SK RNA

[Figure: multiple sequence alignment of 7SK RNA sequences in two blocks, with a consensus structure line (#=GC SS_cons). Row labels as they appear on the slide (truncated in the source): Homo/1331, Mus/1331/133, Anolis/1341, Xenopus/1330, Danio/1300/1, B_lanceolatu, Ciona_intes, Culex/1329, Nasonia/1304, Pediculus/12, Capitella/12, Platynereis, Myxine/1300, Lottia/1277, Helix/1303, Mytilus_gall, Helobdella, Petrolisthes, dmoj_scaffol, dmel_3R_3300. Annotated elements: M4, M5, M5 Drosophila expansion, M4', M6, M7.]

Pseudoknot: Telomerase RNA

Replication of chromosomal ends

Misregulation leads to cancer

Telomerase enzyme: telomerase RNA and TERT

RNA part: highly variable in length, 100 nt to 2 000 nt

[Figure: telomerase RNA secondary structures for yeast, ciliate, and vertebrate. Labels shown: template, pseudoknot, TB; CS1, CS2, CS3, CS4, CS5, CS5a, CS6, CS7, Ku80, S1, S2, S3; I, II, IIIa, IIIb, IV; P5, P6b, P6.1, CR4, CR5, CAB, snoRNA H/ACA]

Specific Telomerase RNA Pseudoknot Finder

Organism          Genome size (bp)   Obtained frequency

S. purpuratus          809 952 877              170 820
C. intestinalis        141 233 565               22 330
C. savignyi            255 955 828               82 776
N. crassa            1 860 657 949              342 708
N. discreta            556 883 022              183 461
N. tetrasperma         487 800 222              133 339
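
To compare the searches across inputs of very different size, the counts can be normalised by sequence length. The following is a minimal sketch, assuming the first number in each row is the searched sequence size in bp and the second is the number of candidates obtained; the names `CANDIDATES` and `hits_per_megabase` are illustrative and do not come from the talk.

```python
# Values transcribed from the table; the column meanings are an assumption:
# (searched sequence size in bp, number of candidates obtained).
CANDIDATES = {
    "S. purpuratus": (809_952_877, 170_820),
    "C. intestinalis": (141_233_565, 22_330),
    "C. savignyi": (255_955_828, 82_776),
    "N. crassa": (1_860_657_949, 342_708),
    "N. discreta": (556_883_022, 183_461),
    "N. tetrasperma": (487_800_222, 133_339),
}


def hits_per_megabase(size_bp: int, hits: int) -> float:
    """Return the number of candidates per 1 000 000 bp of searched sequence."""
    return hits / size_bp * 1_000_000


# Candidate density for every organism in the table.
density = {
    name: hits_per_megabase(size_bp, hits)
    for name, (size_bp, hits) in CANDIDATES.items()
}

# Print the organisms from highest to lowest candidate density.
for name, value in sorted(density.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<16}{value:8.1f} candidates per Mb")
```

Under this reading, every input yields on the order of a few hundred candidates per megabase, so the pattern alone is far from specific and further filtering is needed.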

[Figure: search pattern of the telomerase RNA pseudoknot finder. Panels A, B, and C show the segments s1 to s8 between the 5' and 3' ends, with paired regions (<<<<< >>>>>), variable stretches (x and .), and fixed nucleotides (G, TT, TTTC, AA, AAA)]

\[
M = \begin{pmatrix}
t \\
c \\
a \\
\mathrm{mferel} \\
\mathrm{hom} \\
\mathrm{rdrc} \\
\mathrm{haca}
\end{pmatrix}
\]

Acknowledgements

Thanks to:

Christian Reidys

Qin Jing

Peter Stadler

and the whole bioinformatics group Leipzig

Thank You!
