RNA Structure SS 2009
Robert Giegerich
Motivation
Lost in Folding Space
Abstraction comes to rescue
The idea of abstract shapes
The general idea
Defining shape abstractions
Properties of the shape space
Simple shape analysis
The tool RNAshapes
Complete probabilistic shape analysis
Shape Probabilities
The RNAshapes package
RNA Structure Prediction and Comparison, Session 4
Abstract Shape Analysis
Robert Giegerich
Faculty of Technology, Bielefeld University
Bielefeld, SS 2009
Robert Giegerich RNA Structure SS 2009
1 Motivation
  Lost in Folding Space
  Abstraction comes to rescue
2 The idea of abstract shapes
  The general idea
  Defining shape abstractions
  Properties of the shape space
3 Simple shape analysis
  The tool RNAshapes
4 Complete probabilistic shape analysis
  Shape Probabilities
  The RNAshapes package
RNA Gene Prediction via Structure . . .
“Is this sequence an RNA gene?” ↔ “Does it have a known functional structure?”
When sequence conservation is low or no homologs are known:
STEP 1: MFE folding (Mfold, RNAfold, pknotsRG)
STEP 2: Structure comparison against known functional structures
It is not that easy ...
Accuracy of MFE folding . . .
adequacy of thermodynamic parameters . . . ?
interaction with other molecules . . . ?
RNA sequence processing . . . ?
folding kinetics (co-transcriptional folding) . . . ?
physical properties of the folding space . . . !
Recent Mfold Evaluation by Gutell Lab
Doshi KJ, Cannone JJ, Cobaugh CW, Gutell RR: Evaluation of the suitability of free-energy minimization using nearest-neighbor energy parameters for RNA secondary structure prediction. BMC Bioinformatics 2004 Aug 5;5:105.
Compares MFE foldings to structures derived by comparative analysis and proven by experimental techniques.
Findings:
base pair accuracy of about 20% - 71%
no improvement from recently updated thermodynamic parameters
note: did not check for good near-optimal solutions
Lost in Folding Space (1)
The folding space of a given sequence is LARGE:
number of foldings is exponential in sequence length
number of near-optimal foldings is exponential in energy window
Look at the 111 “best” structures for a tRNA (using the tool RNAmovies).
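To make the growth of the folding space concrete, here is a minimal Python sketch that counts all secondary structures of a sequence with the usual unambiguous recurrence: the first base is either unpaired or paired with some later base. It assumes canonical and G-U pairs and a minimum hairpin loop of 3 nt; the name count_structures and the example sequences are illustrative and not part of the lecture.

```python
from functools import lru_cache

# Assumption: Watson-Crick and G-U wobble pairs are allowed.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_structures(seq, min_loop=3):
    """Count all pseudoknot-free secondary structures of seq."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of structures on the half-open interval seq[i:j].
        if j - i <= min_loop + 1:
            return 1  # too short for any base pair: only the open chain
        total = count(i + 1, j)  # case 1: seq[i] stays unpaired
        for k in range(i + min_loop + 1, j):  # case 2: seq[i] pairs with seq[k]
            if (seq[i], seq[k]) in PAIRS:
                total += count(i + 1, k) * count(k + 1, j)
        return total

    return count(0, len(seq))


if __name__ == "__main__":
    for n in (10, 20, 30, 40):
        print(n, count_structures("GC" * (n // 2)))
```

Because each structure is generated exactly once, the same recurrence scheme can later carry energies or probabilities instead of plain counts.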
Lost in Folding Space (2)
What we observe from RNAmovies:
LARGE number of close-to-optimal foldings
FEW structural classes holding many similar foldings
Can we reduce the folding space to the representatives of these classes?
RNA structure prediction based on thermodynamics
Even with the best possible model parameters:
MFE structure is often wrong
Some near-optimal structure is always right
The number of near-optimals is exponential
Most are similar, but some quite distinct
[Figure: two secondary-structure drawings, annotated with the element types Multiple Loop, Stacking Region, Hairpin Loop, Internal Loop, Bulge Loop (left), Bulge Loop (right).]
Is there a shape LIKE this ... or NOT like this ...?
Formalizing the notion of (abstract) shape
Shape abstraction retains nesting and adjacency of stems
Shape abstraction disregards all sizes (of stems, loops, ...)
Shape abstraction may retain or disregard presence and type of bulges and internal loops
Levels of abstraction
Level 0: Full structure
Level 1: All types of loops
Level 3: All helix interruptions
Level 4: Multi- and internal loops, no bulges
Level 5: Stem arrangement only
Shape abstraction mathematics
General:
tree-like domains of structures $F$ and shapes $P$
tree homomorphism $\pi : F \to P$
For each sequence $s$:
folding space of sequence $s$: $F(s)$
shape space of sequence $s$: $P(s) = \pi(F(s))$
shape class of $p$ in $F(s)$: $f(s, p) = \{x \mid x \in F(s),\ \pi(x) = p\}$
shape representative structure: shrep = class member of minimal free energy, formally
$\mathrm{shrep}(s, p) = \operatorname{argmin}\{E_x \mid x \in f(s, p)\}$
Shape Abstraction – Informal
(((((...((((...))..))(((((...)))))...(((.((..))...))).)))))
Weaker abstraction – retaining bulges:
[ [ [ ] _ ] [ ] [ _ [ ] _ ] ]
Strongest abstraction – no bulges:
[ [ ] [ ] [ ] ]
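The informal rule can be written down in a few lines. The following Python sketch maps a dot-bracket string to a shape string at two of the levels used in this session: level 5 (stem arrangement only) and level 3 (every helix interruption opens a new bracket, unpaired bases are not shown). The names pair_table and shape are hypothetical; this is not the RNAshapes implementation, and the intermediate notation with underscores is not covered.

```python
def pair_table(db):
    """Map each opening position of a dot-bracket string to its partner."""
    stack, partner = [], {}
    for pos, ch in enumerate(db):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            partner[stack.pop()] = pos
    return partner


def shape(db, level=5):
    """Shape string of a dot-bracket structure (level 3 or level 5 only)."""
    partner = pair_table(db)

    def children(i, j):
        # Base pairs directly enclosed between positions i and j (exclusive).
        kids, k = [], i + 1
        while k < j:
            if k in partner:
                kids.append((k, partner[k]))
                k = partner[k] + 1
            else:
                k += 1
        return kids

    def render(i, j):
        kids = children(i, j)
        # Walk down the helix while exactly one pair is enclosed.
        while len(kids) == 1:
            k, l = kids[0]
            stacked = (k == i + 1 and l == j - 1)
            if level == 3 and not stacked:
                break  # level 3 keeps bulges and internal loops as nesting
            i, j = k, l
            kids = children(i, j)
        return "[" + "".join(render(k, l) for k, l in kids) + "]"

    return "".join(render(k, l) for k, l in children(-1, len(db)))


if __name__ == "__main__":
    example = "((((.(((...)))((...(...)))))))"
    print(shape(example, level=3))
    print(shape(example, level=5))
```

Since stem and loop sizes never enter the result, structures of different sequences and lengths can map to the same shape string.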
Shape trees and shape strings
[Figure: tree representations of one structure at three abstraction levels, with the corresponding strings.]
Level 0: ((((.(((...)))((...(...)))))))
Level 3: [ [ ] [ [ ] ] ]
Level 5: [ [ ] [ ] ]
Shape algorithmics
Implementation of shape analysis:
shape abstractions are tree homomorphisms
integrate well with DP algorithms
allows for a priori rather than a posteriori analysis
compute shapes in parallel with energy
perform analyses on per-shape basis
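As an illustration of computing shapes in parallel with the score, the sketch below runs a small classified dynamic program: every table entry holds one best score per level-5 shape instead of a single optimum. To stay short it assumes a toy energy model (maximize the number of base pairs, minimum hairpin loop of 3 nt) in place of the thermodynamic model; all names are hypothetical and this is not the RNAshapes code.

```python
from functools import lru_cache

# Assumption: Watson-Crick and G-U wobble pairs are allowed.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def single_component(q):
    """True if shape string q consists of exactly one bracketed component."""
    depth = 0
    for idx, ch in enumerate(q):
        depth += 1 if ch == "[" else -1
        if depth == 0:
            return idx == len(q) - 1
    return False  # empty string


def shape_table(seq, min_loop=3):
    """Map each level-5 shape of seq to the best score (number of base pairs)."""

    @lru_cache(maxsize=None)
    def struct(i, j):
        # Structures on seq[i:j]: shape string -> best score.
        if j - i <= 0:
            return {"": 0}
        best = dict(struct(i + 1, j))  # seq[i] unpaired
        for k in range(i + min_loop + 1, j):  # seq[i] pairs with seq[k]
            if (seq[i], seq[k]) not in PAIRS:
                continue
            for p, a in closed(i, k).items():
                for q, b in struct(k + 1, j).items():
                    if a + b > best.get(p + q, -1):
                        best[p + q] = a + b
        return best

    @lru_cache(maxsize=None)
    def closed(i, k):
        # Structures on seq[i:k+1] in which seq[i] pairs with seq[k].
        out = {}
        for q, v in struct(i + 1, k).items():
            # A single enclosed helix is absorbed into this one (level 5);
            # a hairpin or a multiloop opens a new bracket.
            p = q if single_component(q) else "[" + q + "]"
            if v + 1 > out.get(p, -1):
                out[p] = v + 1
        return out

    return struct(0, len(seq))


if __name__ == "__main__":
    for shp, score in sorted(shape_table("GGGAAACCCGGGAAACCC").items()):
        print(repr(shp), score)
```

The shape is built inside the recurrence (a priori), so no structures have to be enumerated and classified afterwards; the price is one table entry per shape rather than one per subsequence.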
Properties of Shapes and shreps
shape classes are disjoint
shreps are interesting
shapes have sequence-independent representation
shapes are meaningful across different sequences (of different length)
shapes and shreps can be computed efficiently
Simple shape Analysis with RNAshapes
The three top shreps of the aforementioned tRNA:
Shape     GGGCCCAUAGCUCAGUGGUAGAGUGCCUCCUUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGGGUCCA
[]        (((((((((((((((.((((.....(((((((...))))))).))))))))))).........)))))))).  -35.9 kcal/mol
[[][]]    ((((((((.....((.((((.....(((((((...))))))).))))))(((.......))).)))))))).  -32.2 kcal/mol
[[][][]]  ((((((...((((.......)))).(((((((...))))))).....(((((.......))))).)))))).  -31.7 kcal/mol
[Figures: secondary-structure drawings of the three shreps, one per slide.]
Shape Space Statistics
Is the shape space really smaller than the folding space?
See some statistics within 5% kcal/mol of MFE:
[Plot: number of structures and shapes vs. sequence length [nt] (0 to 300), linear scale.]
[Plot: number of structures and shapes vs. sequence length [nt] (0 to 300), logarithmic scale.]
[Plot: ratio of shapes to structures vs. sequence length [nt] (0 to 300), logarithmic scale.]
[Plot: ratio of shapes to structures vs. energy range above mfe [kcal/mol] (0 to 10), logarithmic scale.]
[Plot: number of structures and shapes vs. sequence length N [nt] (0 to 120), logarithmic scale, with fitted curves 0.0391 * 1.3968912^N and 0.2064 * 1.1067094^N.]
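A quick way to read the two fitted curves is to evaluate them. The short Python sketch below does so, assuming that the first fit (base 1.3968912) belongs to the structures and the second (base 1.1067094) to the shapes; the extracted legend does not say this explicitly, so the assignment is an inference, and the function names are illustrative.

```python
def fitted_structures(n):
    # Assumed to be the structure count fit from the plot.
    return 0.0391 * 1.3968912 ** n


def fitted_shapes(n):
    # Assumed to be the shape count fit from the plot.
    return 0.2064 * 1.1067094 ** n


if __name__ == "__main__":
    for n in (40, 80, 120):
        s, p = fitted_structures(n), fitted_shapes(n)
        print(f"N={n:3d}  structures ~ {s:.3g}  shapes ~ {p:.3g}  ratio ~ {p / s:.3g}")
```

Both counts grow exponentially, but under this reading the base for shapes is much smaller, so the ratio of shapes to structures shrinks exponentially with sequence length.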
Variation within shape
How homogenous are shape classes?
RNAshapes: Best k shreps
Björn Voß
[] [[][]] [[][][]]
RNAshapes
The tool RNAshapes
The tool RNAshapes
classifies structures by abstract shape
computes a small number of representative structures
no heuristics involved
as fast as traditional RNA folding
Available at http://bibiserv.techfak.uni-bielefeld.de/RNAshapes/
Complete probabilistic shape analysis
“How much would you trust a structure with a probability of $0.1 \cdot 10^{-4}$, even when it is optimal?”
Chip Lawrence, Benasque 2003
From energy to probability
According to Boltzmann statistics, sequence $s$ has structure $x$ with probability
$\mathrm{Prob}(x) = e^{-E_x/RT} / Q$
where $T$ is temperature, $R$ the universal gas constant, and $Q$ the “partition function”,
$Q = \sum_{x \in F(s)} e^{-E_x/RT}$
Accumulated shape probabilities
$\mathrm{Prob}(p) = \sum_{\pi(x) = p} \mathrm{Prob}(x)$ for all $p \in P(s)$
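The two formulas translate directly into code. The Python sketch below computes Boltzmann weights for a list of (shape, energy) pairs and accumulates them per shape. It assumes T = 310.15 K (37 °C) and energies in kcal/mol; the toy list stands in for the complete folding space F(s), and its energies are made up for illustration, not taken from any real sequence.

```python
import math
from collections import defaultdict

# Assumption: R in kcal/(mol*K), T = 310.15 K (37 degrees Celsius).
RT = 0.0019872 * 310.15


def shape_probabilities(structures, rt=RT):
    """Accumulate Boltzmann probabilities per shape.

    structures: iterable of (shape, energy) pairs covering the folding space.
    """
    weights = [(shp, math.exp(-energy / rt)) for shp, energy in structures]
    q = sum(w for _, w in weights)  # partition function Q
    probs = defaultdict(float)
    for shp, w in weights:
        probs[shp] += w / q  # Prob(p) = sum of Prob(x) over the shape class
    return dict(probs)


# Hypothetical folding space: the MFE structure has shape "[]", but the
# class "[][]" holds several near-optimal structures.
TOY_SPACE = [
    ("[]", -22.9),
    ("[][][]", -22.5),
    ("[][]", -22.3),
    ("[][]", -22.2),
    ("[][]", -22.1),
    ("[][]", -22.0),
]

if __name__ == "__main__":
    for shp, prob in sorted(shape_probabilities(TOY_SPACE).items(), key=lambda t: -t[1]):
        print(f"{shp:8s} {prob:.4f}")
```

In this toy space the shape of the MFE structure is not the most probable one, because a shape's probability depends on the number of near-optimal members as well as on the energy of its best member.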
RNAshapes package
Overtaking: Shape probabilities contradict energy ranking
Shape      E [kcal/mol]  P          Rank by probability
[ ]        -22.90        0.2370279  gets 2nd
[ ][ ][ ]  -22.50        0.0999191  gets 3rd
[ ][ ]     -22.30        0.5511424  gets 1st
Björn Voß
A propos “complete”
probabilities give full information about folding space
we cannot compute only the k most likely shapes
only feasible up to 300 nts
Algorithmics
Complete probabilistic shape analysis
requires a non-ambiguous grammar with correct dangles at all places
applies “classified” dynamic programming
takes time $O(1.1^n \cdot n^3)$ where $n = |s|$
Results from complete probabilistic analysis
Some observations:
Sequence         Shape 1        Prob.       Shape 2           Prob.
lin-4 precursor  []             0.99999994
tRNA-ala         []             0.989744    [[]]              0.008994
typical mRNA     [][[][]]       0.432154    [[[][]][]]        0.149831
HIV-1 Leader     [][[][[][]]]]  0.6164      [][[[][[][]]][]]  0.3492
Summary on abstract shape analysis (1)
Shape representatives
are cheap to compute
give small but representative sample of potential structures
Summary on abstract shape analysis (2)
Shape probabilities
provide the same representatives
give a measure of well-definedness of folding
independent of sequence composition and sequence length (in contrast to MFE values)
exclude further alternatives when probability is 90% covered
are more expensive to compute
require an exact solution of the dangling base problem
References on abstract shape analysis
Abstract Shapes of RNA by Giegerich, Voss, Rehmsmeier. Nucleic Acids Research 2004, Vol. 32, No 15, 1 - 9.
Complete Probabilistic Analysis of RNA Abstract Shapes by Voss, Giegerich, Rehmsmeier. BMC Biology 2006, Feb 15;4(1):5.
RNAshapes: an integrated RNA analysis package based on abstract shapes by Steffen, Voss, Rehmsmeier, Reeder, Giegerich. Bioinformatics 2006, Feb 15;22(4):500-3.
RNAsifter: Shape based indexing to speed up Rfam searches by Voss, Janssen, Reeder, Giegerich. BMC Bioinformatics, 2007.
The End
Thanks for your attention.