Introduction Sorting permutations with reversals in order to reconstruct evolutionary history of...

  • date post

    19-Dec-2015

Introduction

Sorting permutations with reversals in order to reconstruct evolutionary history of genome

Reversal mutations occur often in chromosomes where each reverses the order of an interval of genes

A shortest reversal sequence sorting one genome to another corresponds to the most likely evolutionary path between them

Introduction

Sorting permutations and circular permutations using as few fixed-length reversals as possible

Limiting the transformations to reversals of length exactly k can be very restrictive

Can k-reversals sort?

Can the permutation {1,3,2,4,5} be sorted using k-reversals?
k=1? well ……
k=2? {1,3,2,4,5} Bubble sort
k=3?
k=4? later on

Sorting {1,3,2,4,5} with k=3

Since 1 and 2 are separated by an odd number of items, and any 3-reversal changes this distance by either 0 or 2, it cannot be done!

{2,3,1,4,5}: distance change 0
{1,4,2,3,5}: distance change 0
{1,3,5,4,2}: distance change 2

A 3-reversal exchanges two elements whose positions differ by 2, so it moves elements among the odd positions only or among the even positions only.

Sorting with 3-reversals is therefore a bubble sort applied separately to the odd-position and the even-position elements of the permutation.
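The parity argument above can be checked by brute force. The following is a minimal sketch, not part of the original slides: the helper names `k_reversal` and `reachable` are made up for illustration, and positions are 0-based in the code. It enumerates everything reachable from {1,3,2,4,5} with 3-reversals and confirms that the sorted order is not among the results.

```python
from collections import deque

def k_reversal(perm, i, k):
    """Reverse the k consecutive entries starting at 0-based index i."""
    return perm[:i] + perm[i:i + k][::-1] + perm[i + k:]

def reachable(start, k):
    """All permutations reachable from `start` using k-reversals (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        for i in range(len(p) - k + 1):
            q = k_reversal(p, i, k)
            if q not in seen:
                seen.add(q)
                queue.append(q)
    return seen

start = (1, 3, 2, 4, 5)
component = reachable(start, 3)
print((1, 2, 3, 4, 5) in component)  # False: 3-reversals cannot sort it
# The distance between the values 1 and 2 keeps the same parity throughout.
print({abs(p.index(1) - p.index(2)) % 2 for p in component})  # {0}
```

With k=2 the same search does reach the sorted order, which matches the bubble-sort remark on the earlier slide.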

Notation

PG(k,n) – permutation group of size n using k-reversals

The k-reversal operation on a permutation, starting at the i-th element, Rev(i):

{1,…,i-1, i+k-1, i+k-2, …, i+1, i, i+k, …, n}

d – the diameter: max{ shortest path in the Cayley graph }, i.e. the largest value, over all connected p and q, of the minimum number of reversals needed to get from p to q
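As an illustration of Rev(i) and of the diameter d, here is a small sketch under stated assumptions: `rev` and `eccentricity` are hypothetical names, positions are 1-based to match the slide, and the search is only practical for small n. Because relabelling the values maps the Cayley graph onto itself, the largest BFS distance from any one vertex equals the diameter of its component.

```python
from collections import deque

def rev(perm, i, k):
    """Rev(i): reverse the k entries at 1-based positions i .. i+k-1."""
    if not 1 <= i <= len(perm) - k + 1:
        raise ValueError("reversal window does not fit inside the permutation")
    s = i - 1
    return perm[:s] + perm[s:s + k][::-1] + perm[s + k:]

def eccentricity(start, k):
    """Largest number of k-reversals needed to reach any vertex from `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        for i in range(1, len(p) - k + 2):
            q = rev(p, i, k)
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return max(dist.values())

print(rev((1, 2, 3, 4, 5, 6), 2, 3))    # (1, 4, 3, 2, 5, 6)
print(eccentricity((1, 2, 3, 4), 2))    # 6, the bubble-sort worst case
print(eccentricity((1, 2, 3, 4), 3))    # 2, each component is a 4-cycle
```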

Notation

The Cayley graph is the graph whose vertices are the elements of G, with an edge between vertices p and q iff $q = g_i\,p$ for some generator (k-reversal) $g_i$.

Cayley graph of PG(3,4):

[Figure: Cayley graph of PG(3,4). Vertex groups recoverable from the slide, one connected component each: {1324, 2314, 2413, 1423}, {1342, 4312, 4213, 1243}, {1234, 3214, 3412, 1432}, {2134, 3124, 3421, 2431}, {3241, 4231, 4132, 3142}.]

Equivalent Transformations in PG(k,n)

4l-reversal ↔ 4-reversal ↔ ζ1,2, ζ2,1
(2+4l)-reversal ↔ 2-reversal ↔ ζ1,1
(3+4l)-reversal ↔ 3-reversal
(5+8l)-reversal ↔ 5-reversal ↔ ζ2,2
(9+8l)-reversal ↔ 9-reversal ↔ ζ2,4, ζ4,2

ζ1,2, ζ2,1 → 4-reversal:

Judging from the examples, ζa,b(i) exchanges the block of a elements starting at position i with the block of b elements that follows it.

ζ1,2(1): {1,2,3,4,…} → {2,3,1,4,…}
ζ2,1(2): {2,3,1,4,…} → {2,4,3,1,…}
ζ1,2(1): {2,4,3,1,…} → {4,3,2,1,…}

4-reversal → ζ1,2, ζ2,1

Lemma: 4-reversal → ζ2,3, ζ3,2
{1,2,3,4,5,6,7} → {1,5,4,3,2,6,7} → {3,4,5,1,2,6,7}
For ζ3,2 we simply reverse these operations.

Lemma: ζ2,3, ζ3,2 → ζ1,4, ζ4,1
{1,2,3,4,5,6,7} → {3,4,5,1,2,6,7} → {5,1,2,3,4,6,7}
For ζ4,1 we simply use ζ3,2 with the same operations.

Lemma: ζ1,4, ζ4,1 → ζ1,2, ζ2,1
{1,2,3,4,5,6,7} → {1,3,4,5,6,2,7} → {1,4,5,6,2,3,7} → {2,1,4,5,6,3,7} → {2,3,1,4,5,6,7}
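The chains in these lemmas can be replayed mechanically. The sketch below assumes, based only on the worked examples on the slides, that ζa,b(i) exchanges the block of a elements at position i with the following block of b elements; the function names `rev` and `zeta` are illustrative and positions are 1-based.

```python
def rev(perm, i, k):
    """Reverse the k entries at 1-based positions i .. i+k-1."""
    s = i - 1
    return perm[:s] + perm[s:s + k][::-1] + perm[s + k:]

def zeta(perm, a, b, i):
    """Assumed meaning of zeta_{a,b}(i): swap the block of length a at
    1-based position i with the block of length b right after it."""
    s = i - 1
    return perm[:s] + perm[s + a:s + a + b] + perm[s:s + a] + perm[s + a + b:]

ident4 = (1, 2, 3, 4)
ident7 = (1, 2, 3, 4, 5, 6, 7)

# zeta_{1,2}(1), zeta_{2,1}(2), zeta_{1,2}(1) together act as one 4-reversal.
p = zeta(zeta(zeta(ident4, 1, 2, 1), 2, 1, 2), 1, 2, 1)
print(p == rev(ident4, 1, 4))

# Two 4-reversals (at positions 2, then 1) give zeta_{2,3}(1).
print(rev(rev(ident7, 2, 4), 1, 4) == zeta(ident7, 2, 3, 1))

# The last lemma's chain ends at zeta_{1,2}(1) applied to the identity.
q = zeta(zeta(ident7, 1, 4, 2), 1, 4, 2)
q = zeta(zeta(q, 4, 1, 1), 4, 1, 2)
print(q == zeta(ident7, 1, 2, 1))
```

All three comparisons hold for the permutations shown on the slides; the check says nothing about other conventions for ζ that the original talk may have used.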

The problem: Given a graph PG(k,n):

How many connected components are there? Equivalently: what is the size of any connected component?
What is the diameter of each component?

Assume n≥k+2; the remaining cases are simple:
If k=n there are n!/2 components.
If k=n-1 each component has n or 2n elements, depending upon the parity of n:
n=3: {(1,2,3) (1,3,2) (2,1,3) (2,3,1) (3,1,2) (3,2,1)}
n=4: {(1,2,3,4) (1,4,3,2) (3,2,1,4) (3,4,1,2)}

The simple cases

How many connected components are in PG(2,n)?
1 component (the graph is connected)

How many connected components are in PG(3,n)?
There is only a choice of n/2 elements for the odd/even places, and therefore $\binom{n}{n/2}$ components.

The number of connected components in PG(k,n)

k ≡ 0 mod 4: ?
k ≡ 5 mod 8: ?
k ≡ 1 mod 8: ?
k ≡ 2 mod 4: 1
k ≡ 3 mod 4: $\binom{n}{n/2}$

Connected components -Sign of permutation (4-rev)

The sign of a permutation $\{a_1, a_2, \ldots, a_n\}$ is $\operatorname{sign}\prod_{i<j}(a_j - a_i)$.

A pair $a_i, a_j$ is disordered if i<j and $a_i > a_j$, so the sign is + if #(disorders) = 2m, and − otherwise.

Lemma: ζ1,2, ζ2,1 do not change the sign of a permutation

ζ2,1(i) = ζ1,2(i) ζ1,2(i)

x,y,z → y,z,x: sign (z-y)(z-x)(y-x) = sign (z-y)(x-z)(x-y)
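The sign can be computed directly by counting disordered pairs. The following sketch (helper names `sign` and `zeta12` are illustrative, positions 1-based) checks the lemma exhaustively for n=5: moving x,y,z to y,z,x never changes the sign.

```python
from itertools import combinations, permutations

def sign(perm):
    """+1 if the number of disordered pairs is even, -1 otherwise."""
    disorders = sum(1 for x, y in combinations(perm, 2) if x > y)
    return 1 if disorders % 2 == 0 else -1

def zeta12(perm, i):
    """x,y,z -> y,z,x on the three entries starting at 1-based position i."""
    s = i - 1
    x, y, z = perm[s:s + 3]
    return perm[:s] + (y, z, x) + perm[s + 3:]

preserved = all(
    sign(zeta12(p, i)) == sign(p)
    for p in permutations(range(1, 6))
    for i in range(1, 4)
)
print(preserved)                 # True
print(sign((1, 2, 3, 5, 4)))     # -1, one disordered pair
```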

Connected components (4-rev)

Lemma: ζ1,2, ζ2,1 cannot change the sign of a permutation. The identity permutation has + sign, so permutations with − sign cannot be sorted.

Lemma: ζ2,1 can sort only half of all permutations

ζ2,1 → ζ2m,1

for i = 1 to n-2
    find j such that a_j = i
    if (j - i) is even, apply ζj-i,1(i)
    else apply ζj-i-1,1(i+1), then ζ1,2(i)
end for
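The loop above can be turned into a short runnable sketch. It is an illustration, not the authors' code: `zeta` and `sort_with_zeta` are made-up names, ζa,b(i) is assumed to swap the block of length a at position i with the following block of length b, and ζ2m,1 is realised as m successive ζ2,1 moves.

```python
def zeta(perm, a, b, i):
    """Assumed zeta_{a,b}(i): swap the block of length a at 1-based
    position i with the block of length b that follows it."""
    s = i - 1
    return perm[:s] + perm[s + a:s + a + b] + perm[s:s + a] + perm[s + a + b:]

def sort_with_zeta(perm):
    """Place the values 1 .. n-2 using only zeta_{2,1} and zeta_{1,2}.
    Returns the final permutation and the list of operations applied."""
    p = tuple(perm)
    n = len(p)
    ops = []
    for i in range(1, n - 1):                 # i = 1 .. n-2
        j = p.index(i) + 1                    # 1-based position of value i
        target = i if (j - i) % 2 == 0 else i + 1
        while j > target:                     # zeta_{2m,1} as m zeta_{2,1} steps
            p = zeta(p, 2, 1, j - 2)
            ops.append(("zeta_2_1", j - 2))
            j -= 2
        if target == i + 1:                   # odd offset: finish with zeta_{1,2}(i)
            p = zeta(p, 1, 2, i)
            ops.append(("zeta_1_2", i))
    return p, ops

result, ops = sort_with_zeta((3, 4, 5, 1, 2))
print(result)   # (1, 2, 3, 4, 5)
print(ops)
```

For an input with − sign the loop stops with the last two entries exchanged, for example `sort_with_zeta((1, 2, 3, 5, 4))` leaves the permutation unchanged.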

Example

Applying the algorithm to 3 4 5 1 2 (positions 1 2 3 4 5):

i = 1, j = 4, j - i = 3 (odd):
    ζ2,1(2): 3 4 5 1 2 → 3 1 4 5 2
    ζ1,2(1): 3 1 4 5 2 → 1 4 3 5 2
i = 2, j = 5, j - i = 3 (odd):
    ζ2,1(3): 1 4 3 5 2 → 1 4 2 3 5
    ζ1,2(2): 1 4 2 3 5 → 1 2 3 4 5

Connected components (4-rev)

The i-th iteration places a_j into the i-th position, where a_j = i.

At termination the permutation is either {1,2,…,n-2,n-1,n} or {1,2,…,n-2,n,n-1}.

Because they have different signs, the previous lemma tells us that {1,2,…,n-2,n,n-1} cannot be transformed into {1,2,…,n-2,n-1,n} using ζ1,2.

Thus ζ1,2 divides the permutations into 2 equal-size classes, and ζ1,2 sorts just half of all permutations.

The number of connected components in PG(k,n)

k ≡ 0 mod 4: 2
k ≡ 5 mod 8: $2\binom{n}{n/2}$
k ≡ 1 mod 8: $4\binom{n}{n/2}$
k ≡ 2 mod 4: 1
k ≡ 3 mod 4: $\binom{n}{n/2}$
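For small parameters the component counts can be confirmed by exhaustive search. The sketch below is an illustration only (`count_components` is a hypothetical helper, feasible up to about n=8); for odd n it compares against the binomial coefficient with n/2 rounded down, which is an assumption about how the slide's n/2 should be read.

```python
from itertools import permutations
from math import comb

def count_components(k, n):
    """Number of connected components of PG(k,n), by flood fill."""
    unseen = set(permutations(range(1, n + 1)))
    components = 0
    while unseen:
        components += 1
        stack = [unseen.pop()]
        while stack:
            p = stack.pop()
            for i in range(n - k + 1):        # every k-reversal window
                q = p[:i] + p[i:i + k][::-1] + p[i + k:]
                if q in unseen:
                    unseen.remove(q)
                    stack.append(q)
    return components

print(count_components(2, 4))                     # 1
print(count_components(3, 5), comb(5, 5 // 2))    # 10 10
print(count_components(4, 6))                     # 2
print(count_components(5, 7), 2 * comb(7, 7 // 2))  # 70 70
```

The case k ≡ 1 mod 8 starts at k=9, n=11, which is already too large for this naive enumeration.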

Circular permutations -Notation

CPG(k,n) – circular permutation group of size n using k-reversals

Each permutation in CPG(n) represents a set of n permutations in PG(n) that are equivalent under the shift operation

{1,2,3,4} = { (1,2,3,4) , (2,3,4,1) , (3,4,1,2) , (4,1,2,3) }

Any permutation can be rearranged into exactly n arrangements by shifting

PG(n) has n! permutations
CPG(n) has n!/n = (n-1)! permutations
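One convenient way to work with circular permutations in code is to pick a single representative per shift class. The sketch below uses the rotation that puts the value 1 first; this choice and the name `canonical` are assumptions made for illustration.

```python
from itertools import permutations

def canonical(perm):
    """Representative of the shift class: rotate until the value 1 is first."""
    i = perm.index(1)
    return perm[i:] + perm[:i]

# All four shifts of (1,2,3,4) map to the same representative.
print({canonical(p) for p in [(1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3)]})

# 4! = 24 permutations collapse to (4-1)! = 6 circular permutations.
classes = {canonical(p) for p in permutations(range(1, 5))}
print(len(classes))   # 6
```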

Notation

The Cayley graph is the graph whose vertices are the elements of CPG, with an edge between vertices p and q iff $q = g_i\,p$ for some generator (k-reversal) $g_i$.

Cayley graph of CPG(3,4):

[Figure: Cayley graph of CPG(3,4). Under shifts, the 24 vertices of PG(3,4) collapse to the 6 circular permutations 1324, 1423, 1342, 1243, 1234, 1432.]

Equivalent Transformations in CPG(k,n)

All PG(k,n) transformations hold for n > k+2:

4l-reversal ↔ 4-reversal ↔ ζ1,2, ζ2,1
(2+4l)-reversal ↔ 2-reversal ↔ ζ1,1
(3+4l)-reversal ↔ 3-reversal
(5+8l)-reversal ↔ 5-reversal ↔ ζ2,2
(9+8l)-reversal ↔ 9-reversal ↔ ζ2,4, ζ4,2

The problem:

Given a graph CPG(k,n):

How many connected components are there? Equivalently: what is the size of any connected component?
What is the diameter of each component?

Assume n≥k+2
If k=n or k=n-1 there are (n-1)!/2 components

Since all PG(k,n) transformations hold:
# Components in CPG(n) ≤ # Components in PG(n)

Connected comp. of CPG(k,n) for even k

Recall: How many connected components are in PG(2,n)?
1 component (the graph is connected)

The same holds in CPG(2,n), for all n

Connected comp. of CPG(k,n) for even k & even n

CPG(4l,2m) is connected (a single component)

Proof: Recall that 4-reversals → ζ1,2, ζ2,1

ζ1,2, ζ2,1 sort all permutations to {1,…,2m-1,2m} or {1,…,2m,2m-1}

ζ1,2 can sort the circular permutation {1,…,2m,2m-1} to {1,…,2m-1,2m}:
1,2,3,4,6,5 → 5,1,2,3,4,6 (shift)
5,1,2,3,4,6 → 1,2,5,3,4,6 → 1,2,3,4,5,6

Connected comp. of CPG(k,n) for even k & odd n

Recall: 4-reversals do not change the sign of permutations.

If n is odd, a shift operation doesn't change the sign either:
x1,…,x2m,x2m+1 → x2m+1,x1,…,x2m
The shift moves x2m+1 past 2m elements, so #(disorders) changes by an even amount.

We can use the same algorithm: 4l-reversals sort half of CPG(k,n)
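The two claims for k=4 (one component for even n, two for odd n) can be checked on the smallest admissible sizes. This is a sketch under assumptions: circular permutations are represented by the rotation with 1 first, a reversal window may wrap around the end, and the names `canonical`, `circular_reversal` and `count_circular_components` are illustrative.

```python
from itertools import permutations

def canonical(perm):
    """Representative of the shift class: rotate until the value 1 is first."""
    i = perm.index(1)
    return perm[i:] + perm[:i]

def circular_reversal(perm, i, k):
    """Reverse the k cyclically consecutive entries starting at 0-based index i."""
    n = len(perm)
    idx = [(i + t) % n for t in range(k)]
    out = list(perm)
    for src, dst in zip(idx, reversed(idx)):
        out[dst] = perm[src]
    return tuple(out)

def count_circular_components(k, n):
    """Number of connected components of CPG(k,n), by flood fill."""
    unseen = {(1,) + rest for rest in permutations(range(2, n + 1))}
    components = 0
    while unseen:
        components += 1
        stack = [unseen.pop()]
        while stack:
            p = stack.pop()
            for i in range(n):
                q = canonical(circular_reversal(p, i, k))
                if q in unseen:
                    unseen.remove(q)
                    stack.append(q)
    return components

print(count_circular_components(4, 6))   # 1: even n, connected
print(count_circular_components(4, 7))   # 2: odd n, the sign splits the graph
```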

Connected comp. of CPG(k,n) for even k

So far:

k \ n          n = 2m+1    n = 2m    n ≡ 2 mod 4
k ≡ 0 mod 4    2           1         1
k ≡ 2 mod 4    1           1         1
k ≡ 5 mod 8    2           …         …
k ≡ 1 mod 8    2           …         …
k ≡ 3 mod 4    1           …         …

[The six entries marked … are expressions in n and n/2 whose layout is not recoverable from the transcript.]

Diameter of CPG(k,n) bounds

Upper bound = $O(n^2/k + nk)$
Lower bound = $\Omega(n^2/k^2 + n)$

Conclusions & Open problems

A complete answer to the connectedness question of the Cayley graphs for permutations and circular permutations
Bounds on the diameter of CPG(k,n)

Can we tighten these bounds?
What is the diameter of PG(k,n)?
What happens with signed permutations, where each element has 2 possible orientations?
What happens if we allow numerous reversals?