Date posted: 21-Dec-2015
Of Mice and Men
Learning from genome reversal findings
Genome Rearrangements in Mammalian Evolution: Lessons From Human and Mouse Genomes
and
Transforming Men into Mice: the Nadeau-Taylor Chromosomal Breakage Model Revisited
both papers written by Pavel Pevzner and Glenn Tesler
Basic, Basic Terms
Reversal Distance – the minimum number of reversals needed to transform one genome into another
Synteny Block – region in which the same gene order is observed between species
Ortholog – corresponding gene in two different species
Overview
● Theory of reversal distance calculation
● A new model for presenting reversal information (primary topic of Genome Rearrangements in Mammalian Evolution)
● Evidence of “fragile” genome regions (primary topic of Transforming Men into Mice)
What are we solving?
Find d(α), the reversal distance from permutation α to permutation β, where
α is L 4 5 2 1 3 6 R
β is L 1 2 3 4 5 6 R
Reversals
A reversal operation, ρ, is defined as follows:
ρ = [ i , j ]
(αρ)(k) = { α( i + j - k ) if i ≤ k ≤ j, α(k) otherwise }
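The definition above can be sketched in a few lines of Python. This is a minimal illustration for unsigned permutations, assuming 1-based inclusive positions; the function name apply_reversal is made up for this example and does not come from the papers. Applied to the example permutation L 4 5 2 1 3 6 R, two reversals are enough to reach the identity.

```python
def apply_reversal(perm, i, j):
    """Return a copy of perm with positions i..j (1-based, inclusive) reversed.

    Mirrors the rule: new[k] = old[i + j - k] for i <= k <= j, old[k] otherwise.
    """
    if not 1 <= i <= j <= len(perm):
        raise ValueError("need 1 <= i <= j <= len(perm)")
    return perm[:i - 1] + perm[i - 1:j][::-1] + perm[j:]


alpha = [4, 5, 2, 1, 3, 6]           # the slide's example, without L and R
step1 = apply_reversal(alpha, 1, 4)  # [1, 2, 5, 4, 3, 6]
step2 = apply_reversal(step1, 3, 5)  # [1, 2, 3, 4, 5, 6]
print(step1)
print(step2)
```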
Breakpoints
L 1 3 2 4 5 6 R
A breakpoint of α with respect to β is a pair x, y of elements of Lº such that xy appears in the extended version of α, but neither xy nor the reverse pair yx appears in the extended β.
Reversal Distance: Guess #1
d(α) ≥ b(α) / 2, where b(α) is the number of breakpoints (a single reversal removes at most two breakpoints)
...we can do better than that!
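As a rough illustration of this first bound, the sketch below counts breakpoints of an unsigned permutation against the identity target, with L and R represented as 0 and n + 1. The function names are assumptions made for this example only.

```python
def count_breakpoints(perm):
    """Count breakpoints of an unsigned permutation relative to 1..n.

    The permutation is extended with 0 on the left (L) and n + 1 on the
    right (R). A neighbouring pair is a breakpoint unless its two values
    are consecutive integers in either order.
    """
    n = len(perm)
    extended = [0] + list(perm) + [n + 1]
    return sum(1 for x, y in zip(extended, extended[1:]) if abs(x - y) != 1)


def breakpoint_lower_bound(perm):
    """Lower bound on reversal distance: ceil(b / 2)."""
    return (count_breakpoints(perm) + 1) // 2


print(count_breakpoints([1, 3, 2, 4, 5, 6]))       # 2 (pairs 1,3 and 2,4)
print(count_breakpoints([4, 5, 2, 1, 3, 6]))       # 4
print(breakpoint_lower_bound([4, 5, 2, 1, 3, 6]))  # 2
```

For the example L 4 5 2 1 3 6 R the bound happens to be tight, because two reversals do sort it; in general it is not.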
Reality and Desire Construction
Extended
Reality and Desire Edges
Terminals
Reality Edges
Reality and Desire Diagram
Reality and Desire Diagram - RD(α)
c(α) = # of cycles in RD(α)
Reversal Distance: Guess #2
d(α) ≥ n + 1 - c(α)
Try taking a closer look...
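The cycle count can be computed without drawing the diagram. The sketch below assumes a signed (oriented) permutation with the identity as target, and uses the common encoding in which gene x contributes the two terminals 2|x| - 1 and 2|x|, swapped when x is negative. The function name and the encoding details are choices made for this example, not taken from the slides.

```python
def count_cycles(signed_perm):
    """Count alternating reality/desire cycles for a signed permutation.

    Each gene x becomes two terminals (2|x|-1, 2|x|), in reversed order if
    x is negative; 0 and 2n+1 play the roles of L and R.
    Reality edges join neighbouring terminals of different genes in the
    current order; desire edges join terminal values 2k and 2k+1.
    """
    n = len(signed_perm)
    seq = [0]
    for x in signed_perm:
        a, b = 2 * abs(x) - 1, 2 * abs(x)
        seq.extend((a, b) if x > 0 else (b, a))
    seq.append(2 * n + 1)

    reality = {}
    for p in range(0, len(seq), 2):
        u, v = seq[p], seq[p + 1]
        reality[u], reality[v] = v, u

    def desire(v):
        return v + 1 if v % 2 == 0 else v - 1

    seen, cycles = set(), 0
    for start in seq:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = reality[v]      # follow a reality edge
            seen.add(w)
            v = desire(w)       # then a desire edge
    return cycles


def cycle_lower_bound(signed_perm):
    """Lower bound n + 1 - c on the signed reversal distance."""
    return len(signed_perm) + 1 - count_cycles(signed_perm)


print(count_cycles([1, 2, 3]))    # 4, i.e. n + 1 for the identity
print(cycle_lower_bound([-1]))    # 1
print(cycle_lower_bound([2, 1]))  # 2
```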
Components
Component – set of interleaving cycles (cycles which cross in a reality and desire diagram)
This reality and desire diagram has six components.
You Are Here
Converging and Diverging
● Edges A, C, and E converge
● Edges D and F diverge
● Edges B and D diverge
● Edges F and B converge
Converge? Diverge? So what?
Let ρ = [ e , f ] act on RD(α)...
If edges e and f belong to different cycles, then c(αρ) = c(α) – 1
If edges e and f belong to the same cycle and converge, then c(αρ) = c(α)
If edges e and f belong to the same cycle and diverge, then c(αρ) = c(α) + 1
The Good and the Bad
Good Components contain at least one Good Cycle.
Bad Components contain only Bad Cycles.
Good Cycles contain at least one pair of diverging edges.
Bad Cycles contain only converging edges.
Some “Bad” Examples
This reality and desire diagram has five bad components and only one good component (bottom).
The good component has one good cycle and one bad cycle.
Hurdles
Hurdle – a bad component that does not separate any other two bad components
Nonhurdle – a bad component that does separate at least two bad components
Example Hurdles
In this example...
● A, F, C, and D are hurdles.
● E and B are nonhurdles.
● h(α) = 4
Super/Simple Hurdles
In this example...
● Hurdle F protects nonhurdle E
● F is a super hurdle
● A, C, and D are simple hurdles
● h(α) = 4
The Fortress
Fortress – a permutation whose reality and desire diagram contains an odd number of hurdles, all of which are super hurdles.
f(α) = 1 if α is a fortress, 0 otherwise
Example Fortress
Smallest Possible fortress:
Reversal Distance: Guess #3
d(α) = n + 1 – c(α) + h(α) + f(α)
Finally!!!
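For very small signed permutations the formula can be checked against an exhaustive search. The sketch below is a plain breadth-first search over all signed reversals, not the polynomial algorithm from the cited papers; the values of c, h and f passed to the formula are worked out by hand, and both function names are assumptions for this example. For the signed permutation (+2, +1), the diagram has a single cycle that forms a hurdle, so the formula gives 2 + 1 - 1 + 1 + 0 = 3.

```python
from collections import deque


def signed_reversal_distance(perm):
    """Exact signed reversal distance to (+1, ..., +n) by brute-force BFS.

    Only practical for small n: the state space has n! * 2**n elements.
    """
    start = tuple(perm)
    n = len(start)
    target = tuple(range(1, n + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur]
        for i in range(n):
            for j in range(i, n):
                # A signed reversal reverses the segment and flips every sign.
                flipped = tuple(-x for x in reversed(cur[i:j + 1]))
                nxt = cur[:i] + flipped + cur[j + 1:]
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
    raise ValueError("input is not a signed permutation of 1..n")


def distance_formula(n, c, h, f):
    """d = n + 1 - c + h + f, with c, h, f supplied by the caller."""
    return n + 1 - c + h + f


print(signed_reversal_distance((2, 1)))      # 3
print(distance_formula(n=2, c=1, h=1, f=0))  # 3
```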
References
The preceding material was taken from Introduction to Computational Molecular Biology by Setubal and Meidanis, based on the following papers:
● V. Bafna and P. A. Pevzner – Genome rearrangements and sorting by reversals.
● S. Hannenhalli and P. A. Pevzner – Transforming cabbage into turnip (polynomial algorithm for sorting signed permutations by reversals). This paper is referenced in the text of Transforming Men into Mice for the definitions of hurdles and fortresses.
● J. D. Kececioglu and D. Sankoff – Exact and approximate algorithms for sorting by reversals with application to genome rearrangement.
First Paper
Genome Rearrangements in Mammalian Evolution: Lessons From Human and Mouse Genomes
Pavel Pevzner and Glenn Tesler
First Paper Overview
This paper presents a new kind of graph which achieves the usefulness of reality and desire diagrams on simple genome comparison graphs.
The Process
GRIMM-Synteny Algorithm
Useful features:
● Same cycle count as the reality and desire diagram!
● Cycles of more than four edges indicate reused breakpoints!
Whole Genome Results
Synteny Blocks: 281
Reversal Distance: 245
Second Paper
Transforming Men into Mice: the Nadeau-Taylor Chromosomal Breakage Model Revisited
Pavel Pevzner and Glenn Tesler
Second Paper Overview
Are breakpoints random or are some sections of the genome more “fragile” than others?
Conventional Wisdom
“Since the [random breakage] model was first introduced in [paper cited]..., it has been analyzed by Nadeau and others [more papers cited]... and has become widely accepted”
To test, simply plot the lengths of known conserved segments and compare to an exponential distribution...
Do we have a match?
Too many short segments!
Micro-rearrangement Evidence
● There is evidence of at least 3,170 micro-rearrangements (reversals) within the synteny blocks (though many may be artifacts of incorrect assemblies)
● 41 out of 281 synteny blocks do not show any evidence of micro-rearrangements, while 10 synteny blocks are extremely rearranged (40 or more rearrangements within a block)
Calculating Breakpoint Reuse
Theorem 1: “If all reversals are delimited by pairs of breakpoints, the number of breakpoint re-uses in any parsimonious reversal scenario is 2d - br. This is the lower bound for non-optimal reversal scenarios.”
2 x 245 (Distance) – 300 Breakpoints = 190 breakpoint reuses
281 Synteny Blocks – 23 Chromosomes +190 Breakpoint Reuses = 448 Breakages
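The arithmetic on this slide can be reproduced directly. The snippet below only restates the slide's numbers; the variable names are chosen for this example.

```python
reversal_distance = 245   # d, from the whole-genome results
breakpoints = 300         # b, as used on the slide
synteny_blocks = 281
chromosomes = 23

# Theorem 1: breakpoint reuses in a parsimonious scenario = 2d - b
breakpoint_reuses = 2 * reversal_distance - breakpoints

# Breakages = block boundaries (blocks - chromosomes) plus reuses
breakages = synteny_blocks - chromosomes + breakpoint_reuses

print(breakpoint_reuses)  # 190
print(breakages)          # 448
```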
Statistical Evidence
Expected number of “clumps” (pairs of points within a window w, expressed as a fraction of genome length) is (n – 1)(1 – (1 – w)^n), where n is the number of breakages.
For w = 0.668 Mb / 2,983 Mb, the expected number of “clumps” is about 43, far fewer than the 190 reused breakpoints!
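Plugging the slide's values into the formula reproduces the figure of about 43. This sketch assumes n = 448, the breakage count from the previous slide; the function name is made up for this example.

```python
def expected_clumps(n, w):
    """Expected number of clumps: (n - 1) * (1 - (1 - w) ** n)."""
    return (n - 1) * (1 - (1 - w) ** n)


n_breakages = 448           # assumed: the breakage count computed earlier
window = 0.668 / 2983       # 0.668 Mb window as a fraction of 2,983 Mb

clumps = expected_clumps(n_breakages, window)
print(round(clumps))        # about 43, compared with 190 observed reuses
```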
Questions?