RNA multiple sequence alignment Craig L. Zirbel zirbel@bgsu.edu October 14, 2010.


RNA primary sequences

Laboratory techniques make it possible to extract specific RNA molecules and determine the sequence of nucleotides. Here are the (unaligned) sequences of the 5S ribosomal RNA molecule from different organisms:

UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCGGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAACCCGGUUCGCCGCCACC A H.m. (structure)
GCCUGGCGGCCGUAGCGCGGUGGUCCCACCUGACCCCAUGCCGAACUCAGAAGUGAAACGCCGUAGCGCCGAUGGUAGUGUGGGGUCUCCCCAUGCGAGAGUAGGGAACUGCCAGGC B E.coli (structure)
UCCCCCGUGCCCAUAGCGGCGUGGAACCACCCGUUCCCAUUCCGAACACGGAAGUGAAACGCGCCAGCGCCGAUGGUACUGGGCGGGCGACCGCCUGGGAGAGUAGGUCGGUGCGGGG B T.th. (structure)
AGUGGUGGCCAUAUCGGCGGGGUUCCUCCCCGUACCCAUCCUGAACACGGAAGAUAAGCCCGCCAGCGUCCGGCAAGUACUGGAGUGCGCGAGCCUCUGGGAAAUCCGGUUCGCCGCCAC A L27170.1/1-120
GUAGCGGCCACAGCGGUGGGGUUCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCGGGGAGUACUGGAGUGCGCGACCCUCUGGGAAACCGGGUUCGCCGCUAC A L27163.1/1-119
GCGGCCAGGGCGGAGGGGAAACACCCGUACCCAUUCCGAACACGGAAGUGAAGCCCUCCAGCGAACCAGCUAGUACUAGAGUGGGAGACCCUCUGGGAGCGCUGGUUCGCCGCC A L27343.1/3-116
UUUGGCGGUCAUGGCGUGGGGGUUUAUACCUGAUCUCGUUUCGAUCUCAGUAGUUAAGUCCUGCUGCGUUGUGGGUGUGUACUGCGGUUUUUUGCUGUGGGAAGCCCACUUCACUGCCAGAC A M36187.1/5-126
GUUGGCGGUCAUGGCGUGGGGUUUAUACCUGAUCUCGUUUCGAUCUCAGUAGUUAAGUCCUGCUGCGUUGUGGGUGUGUACUGCGGUUUUUUGCUGUGGGAAGCCCACUUCACUGCCAGAC A X62857.1/1-121
UUUGGCGGUCAUGGCGUGGGGGUUAUACCUGAUCUCGUUUCGAUCUCAGUAGUUAAGUCCUGCUGCGUUGUGGGUGUGUACUGCGGUGUUUUGCUGUGGGAAGCCCAUUUCACUGCCAGCC A X15364.1/6601-6721
GUCGGUGGUGUUAGCGGUGGGGUCACGCCCGGUCCCUUUCCGAACCCGGAAGCUAAGCCUGCCUGCGCCGAUGGUACUGCACCUGGGAGGGUGUGGGAGAGUAGGACCCCGCCGGCA B M16176.1/4-120
GUCGGUGGUUAUAGCGGUGGGGUCACGCCCGGUCCCAUUCCGAACCCGGAAGCUAAGCCCACCUGCGCCGAUGGUACUGCACCUGGGAGGGUGUGGGAGAGUAGGUCACCGCCGGCC B M16177.1/4-120
GUUGGUGGUUAUUGUGUCGGGGGUACGCCCGGUCCCUUUCCGAACCCGGAAGCUAAGCCCGAUUGCGCUGAUGGUACUGCACCUGGGAGGGUGUGGGAGAGUAGGUCGCUGCCAACC B X55255.1/4-120
UACGGCGGUCAAUAGCGGCAGGGAAACGCCCGGUCCCAUCCCGAACCCGGAAGCUAAGCCUGCCAGCGCCAAUGAUACUGCCCUCACCGGGUGGAAAAGUAGGACACCGCCGAAC B X55259.1/3-117
UACGGCGGUCCAUAGCGGCAGGGAAACGCCCGGUCCCAUCCCGAACCCGGAAGCUAAGCCUGCCAGCGCCGAUGAUACUACCCAUCCGGGUGGAAAAGUAGGACACCGCCGAAC B X55251.1/3-116
UACGGCGGCCACAGCGGCAGGGAAACGCCCGGUCCCAUUCCGAACCCGGAAGCUAAGCCUGCCAGCGCCGAUGAUACUGCCCCUCCGGGUGGAAAAGUAGGACACCGCCGAAC B X75601.1/91-203
UAAGGCGGCCAUAGCGGUGGGGUUACUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCGCCUGCGUUCCGGUCAGUACUGGAGUGCGCGAGCCUCUGGGAAAUCCGGUUCGCCGCCUACU A X03407.1/5927-6048
UUGGCGACCAUAGCGGCGAGUGACCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCUCGCCUGCGUUUCGGUCAGUACUGGAUUGGGCGACCCUCUGGGAAAUCUGAUUCGCCGCCACC A L27168.1/1-120
GGCGGCCAGAGCGGUGAGGUUCCACCCGUACCCAUCCCGAACACGGAAGUUAAGCUCACCUGCGUUCUGGUCAGUACUGGAGUGAGCGAUCCUCUGGGAAAUCCAGUUCGCCGCCC A X02128.1/24-139
GGGCGGCCAGAGCGGUGAGGUUCCACCCGUACCCAUCCCGAACACGGAAGUUAAGCUCGCCUGCGUUCUGGUCAGUACUGGAGUGAGCGAUCCUCUGGGAAAUCCAGUUCGCCGCCCCU A X14441.1/5-123

Watson-Crick basepairs

Watson-Crick basepairs can substitute for one another freely without changing the structure of the RNA molecule. They are said to be isosteric, and changes between these basepairs are an example of neutral variability. The two bases of each pair are held together by hydrogen bonds (dotted lines).

Superposition

RNA sequence variability

To preserve RNA helices, compensating mutations must be made; to replace a GC basepair with an AU basepair, two letters must change in distant regions of the sequence; see below. Statistically, this is called “long-range dependence.”

Compensating mutations such as this do not change the secondary or tertiary structure of the molecule.

UGCCUGGCGACCGUAGCGCGGUGGUCCCACCUGACCCCAUGCCGAACUCAGAAGUGAAACGCCGUAGCGCCGAUGGUAGUGUGGGGUCUCCCCAUGCGAGAGUAGGGAAUUGCCAGGCAU

UGCCUGGCGGCCGUAGCGCGGUGGUCCCACCUGACCCCAUGCCGAACUCAGAAGUGAAACGCCGUAGCGCCGAUGGUAGUGUGGGGUCUCCCCAUGCGAGAGUAGGGAACUGCCAGGCAU
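The two sequences above differ in exactly the way the slide describes, and a short script can confirm it. The following is a minimal sketch, not part of the original slides: the names `SEQ_AU`, `SEQ_GC`, `differences`, and `is_compensating` are made up for illustration, and the check simply assumes that the two changed positions pair with each other, which the sequences alone cannot prove.

```python
# Watson-Crick pairs as ordered (first base, second base) tuples.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def join(chunks: str) -> str:
    """Remove the spaces used below to show the sequences in blocks of 10."""
    return chunks.replace(" ", "")


# The two sequences from the slide, copied in blocks of 10 nucleotides.
SEQ_AU = join(
    "UGCCUGGCGA CCGUAGCGCG GUGGUCCCAC CUGACCCCAU GCCGAACUCA GAAGUGAAAC "
    "GCCGUAGCGC CGAUGGUAGU GUGGGGUCUC CCCAUGCGAG AGUAGGGAAU UGCCAGGCAU"
)
SEQ_GC = join(
    "UGCCUGGCGG CCGUAGCGCG GUGGUCCCAC CUGACCCCAU GCCGAACUCA GAAGUGAAAC "
    "GCCGUAGCGC CGAUGGUAGU GUGGGGUCUC CCCAUGCGAG AGUAGGGAAC UGCCAGGCAU"
)


def differences(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) wherever a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def is_compensating(diffs: list[tuple[int, str, str]]) -> bool:
    """True if exactly two positions changed and the bases at those two
    positions form a Watson-Crick pair both before and after the change.
    Assumption: the two changed positions are paired with each other."""
    if len(diffs) != 2:
        return False
    (_, old1, new1), (_, old2, new2) = diffs
    return (old1, old2) in WATSON_CRICK and (new1, new2) in WATSON_CRICK


diffs = differences(SEQ_GC, SEQ_AU)
print(diffs)                   # positions and bases that changed
print(is_compensating(diffs))  # whether the change swaps one WC pair for another
```

The two changed positions lie 100 nucleotides apart, which is what makes the dependence "long-range": a method that scores each column independently sees two unrelated substitutions rather than one neutral change.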

Comparative sequence analysis

By manually aligning similar RNA sequences and noting the pairs of columns where mainly AU, CG, GC, and UA pairs occur, one can infer the secondary structure of an RNA molecule.
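The column-pair idea can be sketched in a few lines of code. This is an illustrative sketch only: the alignment is a made-up toy example (not real 5S data), and the function names, the 0.9 threshold, and the minimum separation of 4 columns are assumptions chosen for the demonstration, not values from the slides.

```python
from itertools import combinations

WATSON_CRICK = {"AU", "UA", "GC", "CG"}

# Hypothetical toy alignment: one row per sequence, all rows the same length.
ALIGNMENT = [
    "GGCAAAGCC",
    "GACAAAGUC",
    "GCCAAAGGC",
    "GUCAAAGAC",
]


def wc_fraction(alignment: list[str], i: int, j: int) -> float:
    """Fraction of rows in which columns i and j hold a Watson-Crick pair."""
    pairs = [row[i] + row[j] for row in alignment]
    return sum(p in WATSON_CRICK for p in pairs) / len(pairs)


def candidate_pairs(
    alignment: list[str], threshold: float = 0.9, min_separation: int = 4
) -> list[tuple[int, int]]:
    """Return 0-based column pairs (i, j) that are at least min_separation
    columns apart and mostly Watson-Crick across the alignment."""
    width = len(alignment[0])
    return [
        (i, j)
        for i, j in combinations(range(width), 2)
        if j - i >= min_separation and wc_fraction(alignment, i, j) >= threshold
    ]


pairs = candidate_pairs(ALIGNMENT)
print(pairs)
```

In this toy alignment the reported column pairs nest inside one another, as the two strands of a helix would. Only the middle pair actually varies between rows while staying Watson-Crick; the other two are merely conserved, so they support the helix less strongly. A real analysis would also have to handle gaps and GU wobble pairs, which this sketch ignores.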

• This is the inferred secondary structure of the 5S RNA, with bases labeled as found in E. coli. There are five helical regions, with three “internal loops” and two “hairpin loops” separating them. Note the colors!

Fox & Woese 1975; Peattie et al. 1981; Noller 1984; Cannone et al. 2002; http://www.rna.ccbb.utexas.edu

UGCCUGGCGGCCGUAGCGCGGUGGUCCCACCUGACCCCAUGCCGAACUCAGAAGUGAAACGCCGUAGCGCCGAUGGUAGUGUGGGGUCUCCCCAUGCGAGAGUAGGGAACUGCCAGGCAU

RNA 3D structure

• Starting late in the year 2000, high-resolution atomic structures of entire ribosomes have been published. These show the bases, the backbone, the Watson-Crick basepairs, and several new types of basepairs.

E. coli 5S

The 2009 Nobel Prize in Chemistry went to Yonath, Ramakrishnan, and Steitz for their work on X-ray crystal structures of ribosomes.

Three 5S rRNA 3D structures

Haloarcula marismortui E. coli Thermus thermophilus

RNA multiple sequence alignment

The same RNA in different organisms can be presumed to have the same, or roughly the same, secondary and 3D structure.

Compensating changes far apart in the sequence make it hard to use multiple sequence alignment tools that were developed for proteins.

Two situations for RNA multiple sequence alignment

1. We have two or more sequences from the same RNA, but don’t know their common secondary structure or 3D structure.

2. We have RNA sequences and a common secondary structure or even a single 3D structure which we can assume they all share to some degree.


RNA MSA: RNA Multiple Sequence Alignment

Slides by Anton Petrov, Ph.D. student, BGSU

Why DNA and protein alignment methods don’t work for RNA

RNA sequences may look dissimilar but still fold into the same structure.

Gorodkin et al., 2010. Trends in Biotechnology
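To make the point concrete, here is a small sketch with two made-up sequences that share no identical position yet are both compatible with the same hairpin. Everything in it is hypothetical and chosen for illustration: the sequences, the structure string (in the common dot-bracket notation), and the helper names `parse_pairs`, `fits`, and `identity`. Only Watson-Crick pairs are accepted, which is a simplification.

```python
WATSON_CRICK = {"AU", "UA", "GC", "CG"}


def parse_pairs(dot_bracket: str) -> list[tuple[int, int]]:
    """Return the 0-based (i, j) base pairs encoded by a dot-bracket string."""
    stack: list[int] = []
    pairs: list[tuple[int, int]] = []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' in structure")
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return sorted(pairs)


def fits(seq: str, dot_bracket: str) -> bool:
    """True if every pair in the structure is a Watson-Crick pair in seq."""
    if len(seq) != len(dot_bracket):
        return False
    return all(seq[i] + seq[j] in WATSON_CRICK for i, j in parse_pairs(dot_bracket))


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


STRUCTURE = "((((....))))"   # hypothetical hairpin: 4 pairs around a 4-nt loop
SEQ_A = "GGCAGAAAUGCC"       # made-up sequence
SEQ_B = "CAUGUUCGCAUG"       # made-up sequence, different at every position

print(identity(SEQ_A, SEQ_B))
print(fits(SEQ_A, STRUCTURE), fits(SEQ_B, STRUCTURE))
```

A scoring scheme based only on matching letters would rate these two sequences as unrelated, even though both can form the same four basepairs. That is the gap RNA-specific alignment methods try to close by scoring structure as well as sequence.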

Example

RNA-specific alignment methods

FOLDALIGN http://foldalign.ku.dk/index.html

MAFFT http://mafft.cbrc.jp/alignment/server/

LocARNA http://rna.informatik.uni-freiburg.de:8080/LocARNA.jsp

R-Coffee http://tcoffee.vital-it.ch/cgi-bin/Tcoffee/tcoffee_cgi/index.cgi?stage1=1&daction=RCOFFEE::Regular

and many others...

RNA MSA and ncRNA discovery

Conservation is a reliable indicator of biological importance.

If an RNA fragment is conserved across multiple species, it may function as ncRNA.

ncRNA discovery programs scan multiple genomic sequences in order to detect putative ncRNA candidates.

MSA is an essential part of the ncRNA discovery pipeline.

RNA MSA and ncRNA discovery

Diagram: multiple sequence alignment and secondary structure prediction both feed into ncRNA discovery, following one of three strategies: align first, fold first, or align and fold simultaneously.

RNAz

Once you have a good MSA, you can use tools like RNAz to scan your alignment for conserved stable secondary structures, which may function as ncRNAs.

http://rna.tbi.univie.ac.at/cgi-bin/RNAz.cgi
