Searching for functional regions (coding or non-coding) in mammalian genomes
Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics...
Transcript of Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics...
![Page 1: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/1.jpg)
Long non-coding RNAs
Dominic RoseBioinformatics Group, University of Freiburg
Bled, Feb. 2011
![Page 2: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/2.jpg)
![Page 3: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/3.jpg)
Outline
De novo prediction of long non-coding RNAs (lncRNAs)Genome-wide RNA gene-findingIntrinsic properties (sequence/structure) of lncRNAs?
Paperpile
![Page 4: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/4.jpg)
Long ncRNAs: introduction
ENCODE: pervasive transcription in eukaryotes,large portion are lncRNAs.lncRNAs =̂ ncRNAs > 200ntCapped, polyadenylated, often (alternatively) spliced (just likeprotein-coding genes), but lack discernible open reading framesGene regulators: Evf-2, Xist, roX1, roX2, H19, . . .Precursor for small RNAs: miRNAs, snoRNAs, . . .Imprinting, epigenetics, disease-associated(expression correlates with viral insertion, carcinogenesis, . . . )Functionally important ncRNA classNo general computational method for their detection
Pap
erpi
le
![Page 5: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/5.jpg)
Non-coding RNA gene-finding
Eukaryotic genome organizationTaft et al., J Pathol, 2010
Challenging problem: heterogeneity, lack of featuresShort (structured) vs. long (unstructured?) ncRNAs
RNA secondary structure predictionSplice site detectionPromoter recognition. . .
Pap
erpi
le
![Page 6: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/6.jpg)
Long ncRNAs: a first (generic) gene-finding approach
Paperpile
![Page 7: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/7.jpg)
“Evolutionary signatures” of splice sites?
Novel method to capture compositional features of splice sites:evaluate substitution patternsderive log-odds substitution scores
Paperpile
![Page 8: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/8.jpg)
Long ncRNAs: experimental evidence
B: RT-PCR + sequencing confirms 10 SS, 8/9 predicted SS are true
Pap
erpi
le
![Page 9: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/9.jpg)
Long ncRNAs: further properties?
Recall: splice sites help to pin down lncRNAsAre there other features specific to lncRNAs?Do lncRNAs exhibit specific sequential or structural motifs?If so, are they conserved among species?We need sequence alignments . . .(or at least a set of orthologous lncRNAs)
Pap
erpi
le
![Page 10: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/10.jpg)
Long ncRNAs: orthologs
Use existing genome-wide alignments?(UCSC 46-way, Ensembl Compara)Maybe, but for “large regions of low sequence similarity”(=lncRNA) these automatic pipelines have serious issues
Often very fragmented, short blocks are reportedBroken synteny . . .
Paperpile
![Page 11: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/11.jpg)
Long ncRNAs: orthologs? problematic!
Paperpile
![Page 12: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/12.jpg)
Long ncRNAs: example EGO-B
EGO = eosinophil granule ontogenyEssential for the expression of major basic proteins in eosinophils(white blood cells, part of the immune system)Experimentally confirmed two-exon transcript structureLet’s try to collect orthologs . . .
(human)
Pap
erpi
le
![Page 13: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/13.jpg)
Long ncRNAs: example EGO-BBlack bars: hand curated annotation of EGO-B.Manual inspection reveals orthologs.Given annotation (here RefSeq) often wrong.
(horse)
(elefant)
Pap
erpi
le
![Page 14: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/14.jpg)
Long ncRNAs: example EGO-B
Paperpile
![Page 15: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/15.jpg)
Long ncRNAs: example EGO-B
Paperpile
![Page 16: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/16.jpg)
Long ncRNAs: local structure motifs?
Now use full spectrum of bioinformatic approachese.g. search for local secondary structure motifs
Pap
erpi
le
![Page 17: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/17.jpg)
Long ncRNAs: local structure motifs?
Now use full spectrum of bioinformatic approachese.g. search for local secondary structure motifs
Pap
erpi
le
![Page 18: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/18.jpg)
Long ncRNAs: local structure motifs?
Now use full spectrum of bioinformatic approachese.g. search for local secondary structure motifs
Pap
erpi
le
![Page 19: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/19.jpg)
RNAmotid: accuracy-based detection of local RNAelements
RNAmotid = RNA motif identificationSteffen Heyne, Essam Abdel Moaty Abdel HadyIdentifies local RNA elements on a genome-wide scaleFast sparse algorithm to predict maximum expected accuracystructuresBased on base-pairing/unpairing probabilities (RNAplfold)Relies on a novel accuracy function reflecting localityAllows genome-wide scans for structured regions that have highprobabilities of containing significant local RNA motifs
Paperpile
![Page 20: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/20.jpg)
RNAmotid: MEA-foldingP
aper
pile
![Page 21: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/21.jpg)
RNAmotid: MEA-folding
Pap
erpi
le
![Page 22: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/22.jpg)
RNAmotid: MEA-foldingP
aper
pile
![Page 23: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/23.jpg)
Long ncRNAs: orthologs?
RNAmotid scans single sequencesMight be an option if lncRNA alignments are missingHowever, recall our nice EGO-B alignment . . .
Pap
erpi
le
![Page 24: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/24.jpg)
Long ncRNAs: orthologs?
RNAmotid scans single sequencesMight be an option if lncRNA alignments are missingHowever, recall our nice EGO-B alignment . . .
Pap
erpi
le
![Page 25: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/25.jpg)
Long ncRNAs: EGO-B
Pap
erpi
le
. . . which brings us full circle with regard to our initial topic,predicting novel transcripts via conserved splice sites.
![Page 26: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/26.jpg)
Long ncRNAs: EGO-B
Pap
erpi
le
. . . which brings us full circle with regard to our initial topic,predicting novel transcripts via conserved splice sites.
![Page 27: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/27.jpg)
Long ncRNAs: EGO-B
Pap
erpi
le
. . . which brings us full circle with regard to our initial topic,predicting novel transcripts via conserved splice sites.
![Page 28: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/28.jpg)
Long ncRNAs: EGO-B
Pap
erpi
le
. . . which brings us full circle with regard to our initial topic,predicting novel transcripts via conserved splice sites.
![Page 29: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/29.jpg)
Acknowledgements
Leipzig (Peter F. Stadler)↔ Freiburg (Rolf Backofen)Michael Hiller (Stanford University)
Paperpile
![Page 30: Long non-coding RNAsdominic/stuff/bled11.pdf · Long non-coding RNAs Dominic Rose Bioinformatics Group, University of Freiburg Bled, Feb. 2011. Outline De novo prediction oflong non-coding](https://reader034.fdocuments.us/reader034/viewer/2022042318/5f0770e87e708231d41cff63/html5/thumbnails/30.jpg)
Your state-of-the-art reference manager: Paperpile
http://paperpile.com/beta/https://github.com/wash/paperpile