RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export...

47
RNA structure prediction

Transcript of RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export...

Page 1: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

RNA structure prediction

Page 2: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

RNA functions

• RNA functions as– mRNA– rRNA– tRNA– Nuclear export– Spliceosome– Regulatory molecules (RNAi)– Enzymes– Virus– Retrotransposons– Medicine

Page 3: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Base pairs

• C-G stronger than U-A• Non-standard G-U

CC

C

N

N C

O

C

CC

C

O

N C

O

N

cytosine

Uracyl

N

CC

C

NC

NC

N

N

O

N

CC

C

NC

NC

N

N

Adenine

Guanine

PYRIMIDINES PURINES

H donor acceptor

Page 4: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

• Base-pairs are usually coplanar

• are almost always stacked

• stems – continuous stacks

• 3D structure of a stack is a helix

hairpin

Stacking

Page 5: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Predictable structures

Page 6: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Hard-to-predict structures

• Pseudoknots, kissing hairpins, hairpin-bulge

Page 7: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Secondary structure notations

Page 8: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Tertiary structure

Page 9: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 10: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 11: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

RNAi

Page 12: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 13: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 14: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Structure representation

Page 15: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 16: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 17: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Main approaches to RNA secondary structure prediction

• Energy minimization – dynamic programming approach– does not require prior sequence alignment– require estimation of energy terms

contributing to secondary structure

• Comparative sequence analysis– Using sequence alignment to find conserved

residues and covariant base pairs.– most trusted

Page 18: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Dotplot

Page 19: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Think!

• Make a dotplot of an RNA molecule– Sequence : GGGAAAUCC

• What is the secondary structure?

Page 20: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Dynamic programming approach

• Nussinov algorithm

Page 21: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Dynamic programming approach

a) i,j is paired E(i,j) = E(i+1,j-1) + (ri,rj)b) i is unpaired E(i,j) = E(i+1,j) c) j is unpaired E(i,j) = E(i,j-1)d) bifurcation E(i,j) = E(i,k)+E(k+1,j)

i+1 j-1 i+1 j ji j-1i j i

ik k+1

a) b) c) d)

Let E(i,j) = minimum energy for subchain starting at i and ending at j(ri,rj) = energy of pair ri, rj (rj = base at position j)

Page 22: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

RNA secondary structure algorithm

• Given: RNA sequence x1,x2,x3,x4,x5,x6,…,xL

• Initialization: E(i, i-1) = 0 for i = 2 to LE(i, i) = 0 for i = 1 to L

Recursion: for n = 2 to L # iteration over length

E(i,j) = min { E(i+1, j), E(i, j-1), E(i+1, j-1)+ (ri,rj) ,

min i<k<j {E(i,k)+E(k+1, j)} }

• Cost: O(n3)

Page 23: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

ExampleLet (ri,rj) = -1 if ri,rj form a base pair and 0 otherwise Input : GGAAAUCC

G G A A A U C C

G 0

G 0 0

A 0 0

A 0 0

A 0 0

U 0 0

C 0 0

C 0 0

E(i,j) = lowestenergy conformation for subchain from i to j

i

j

Here we should have min energy for AAAUC

Page 24: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Example-continued

G G A A A U C C

G 0 0

G 0 0 0

A 0 0 0

A 0 0 0

A 0 0 -1

U 0 0 0

C 0 0 0

C 0 0

GGA (i=2, j=3)min { 0,

0,0+ (GA)

} = 0

AAU (i=5, j=6)min { 0,

0,0+ (AU)

} = -1

-1

0

i

j

Page 25: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Recovering the structure from the DP table

• Complexity O(n3)• Main difference to sequence alignment – we are

tracing back a tree-like structure not a single optimal path (bifurcation introduces branch points).

• Method 1: Leave pointers as you compute the table: for each element of the table store (at most two) pointers to the subsequences used in the solution.

• Method 2: Recover history based on numerical values in the table.– Stacking – check value along diagonal– Bifurcation - find k such that E(i,k)+E(k+1,j) =

E(i,j)

Page 26: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 27: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 28: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 29: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

More realistic energy function

Page 30: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Stacking energies

Page 31: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Even more realistic energy function

Loops have destabilizing effect structure (d) should have lower energy that (b).

Destabilizing contribution of loops should depend on the loop length (k).

Stacking has additional stabilizing contribution .

(k) (k)

(k)

Page 32: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

More realistic energy function requires slightly more involved

recurrenceE(i,j) = min{ E(i+1,j), E(i,j-1), min{E(i,k)+E(k+1,j), L(i,j)} where

L(i,j) = {(ri,rj) + (j-i-1) if L(i,j) is a hairpin loop;

(ri,rj) + ij-1if hairpin

mink{(ri,rj) + (k)+E(i+k+1,j-1)} if i-bulge

mink {(ri,rj) + (k)+E(i+1,j-k-1)} if j-bulge

mink1,k2{(ri,rj) + (k1+k2)+E(i+k1+1,j-k2-1)} if internal loop

}Extra “min” gives O(n4) algorithm

Page 33: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 34: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Covariance method

In a correct multiple alignment RNAs, conserved base pairs are often revealed by the presence of frequent correlated compensatory mutations.

Two boxed positions are covarying to maintain Watson-Crick complementary. This covariation implies a base pair which may then be extended in both directions.

GCCUUCGGGCGACUUCGGUCGGCUUCGGCC

Page 35: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Alignment

Page 36: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 37: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Quantities measure of pairwise sequence covariation

Mutual information Mij between two aligned columns i, j

Mij = i,j fxixj log2 (fxixj/fxi fxj)Where fxixj frequency of the pair (observed)

fxi frequency of nucleotide xi at position iObservations:

0 <= Mij <=2

i,j uncorrelated Mij = 0

Page 38: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

MI: examples

A

A

C

G

U

U

G

C

fAi = .5fCi = .25fGi = .25fUj = .5fCj = .25fGj = .25

fAU = .5fCG = .25fGC = .25

Mij = xixj fxixj log2 (fxixj/fxi fxj) =

.5 log2 (.5/(.5*.5))+2*.25 log2 (.25/(.25*.25))=

.5 *1 +.5*2 = 1.5A

A

A

A

U

U

U

UMij = 1 log 1 = 0

U

A

C

G

A

U

G

C

Mij = 4*.25 log 4 = 2

i j

Page 39: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Other methods

• HMMs• Stochastic context free grammars

Page 40: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Conclusion

• RNA secondary structure prediction– Single sequence:

• Dot-plot• Nussinov dynamic programming• Energy function

– Covariance analysis• Mutual information• Hidden Markov Models • SCFGs

Page 41: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 42: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 43: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 44: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.
Page 45: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Finding “most probable structure”

• S – structure then, E(S) free energy of S p(S) = exp(-E(S)/kT)/Q Q = x exp(-E(x)/kT) ) partition function• Problem: computing Q• Method to compute Q – dynamic

programming (similar as presented before but scores are replaced with probabilities and min energy with sum of probabilities).

Page 46: RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

tRNA