Accurate Scaffolding of Large Genomes using Integer Programming and Non-Serial Dynamic Programming

James Lindsay, Ion Mandoiu (University of Connecticut)
Hamed Salooti, Alex Zelikovsky (Georgia State University)


test case  bundle  method         orien good  orien bad  sensitivity  PPV     MCC
1%         2       mip            3037        1          54.47%       99.41%  73.58%
1%         2       opera          3038        0          78.96%       99.88%  88.80%
1%         2       uconn nsdp     3036        1          79.00%       99.88%  88.83%
1%         2       uconn nsdp-h   3036        1          77.33%       99.88%  87.88%
1%         1       mip            3027        11         56.38%       97.58%  74.17%
1%         1       opera          3026        12         80.59%       97.96%  88.85%
1%         1       uconn nsdp     3000        37         79.03%       94.58%  87.27%
1%         1       uconn nsdp-h   3033        4          77.61%       99.64%  87.93%
10%        2       mip            30811       28         58.98%       98.88%  76.37%
10%        2       opera          30799       40         81.06%       99.17%  89.66%
10%        2       uconn nsdp     30774       64         81.29%       98.68%  89.56%
10%        2       uconn nsdp-h   30806       32         79.68%       99.25%  88.93%
10%        1       mip            30468       371        57.94%       90.70%  72.49%
10%        1       opera          x           x          x            x       x
10%        1       uconn nsdp
10%        1       uconn nsdp-h   30701       137        80.03%       97.92%  88.52%
50%        2       mip            -           -          -            -       -
50%        2       opera          148972      906        80.25%       96.87%  88.17%
50%        2       uconn nsdp     x           x          x            x       x
50%        2       uconn nsdp-h
50%        1       mip            144332      5546       51.29%       72.60%  61.02%
50%        1       opera          x           x          x            x       x
50%        1       uconn nsdp     x           x          x            x       x
50%        1       uconn nsdp-h
100%       2       mip            x           x          x            x       x
100%       2       opera          x           x          x            x       x
100%       2       uconn nsdp     x           x          x            x       x
100%       2       uconn nsdp-h   301268      2081       68.09%       96.20%  80.93%

RESULTS: 4x assembly on Venter genome
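The sensitivity, PPV and MCC columns of the results table can be read against the standard confusion-matrix definitions. The sketch below is a minimal illustration under that assumption. The poster does not say what it counts as a true or false positive, and the function name and the counts in the usage line are hypothetical, not taken from the poster.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Return (sensitivity, ppv, mcc) from confusion-matrix counts.

    Standard textbook definitions; how the poster derives its counts
    is not stated, so this is an illustration only.
    """
    # Sensitivity (recall): fraction of true items that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # PPV (precision): fraction of reported items that are correct.
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, ppv, mcc


# Hypothetical counts chosen only to exercise the function.
sens, ppv, mcc = classification_metrics(tp=80, fp=2, fn=20, tn=10**9)
print(f"sensitivity={sens:.2%} ppv={ppv:.2%} mcc={mcc:.2%}")
```

When true negatives vastly outnumber the other three counts, as the hypothetical `tn` above does, MCC approaches the geometric mean of sensitivity and PPV. Several rows of the table appear consistent with that relationship, for example sqrt(0.5447 x 0.9941) is about 0.7358.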

test set  bundle  connected  biconnected  triconnected
1%        3       20         6            0
1%        2       20         6            0
1%        1       42         6            0
10%       3       50         14           10
10%       2       63         15           10
10%       1       25821      15996        6986
25%       3       43         14           11
25%       2       110        22           16
25%       1       72124      61925        45856
50%       3       48         16           10
50%       2       519        24           13
50%       1       147445     140796       127892
100%      3       42         16           11
100%      2       575        20           12
100%      1       278143     273117       262622

test set  bundle  connected  biconnected  triconnected
1%        3       20         6            0
1%        2       5          0            0
1%        1       6          0            0
10%       3       50         14           10
10%       2       6          0            0
10%       1       12         4            0
25%       3       43         14           11
25%       2       6          0            0
25%       1       40         4            0
50%       3       48         16           10
50%       2       8          4            0
50%       1       28038      2935         252
100%      3       42         16           11
100%      2       15         6            0
100%      1       192663     95152        35344

Component Sizes: 4x assembly on Venter genome

hierarchy

no hierarchy
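The component tables above report sizes for connected, biconnected and triconnected components of the scaffolding graph. As a rough illustration of how the first two could be measured, here is a minimal sketch on a toy graph. It assumes that the columns give the vertex count of the largest component of each kind and that single-edge (bridge) blocks are not counted as biconnected; neither is stated in the source. All function names are hypothetical, and triconnected components (which need an SPQR-style decomposition) are left out.

```python
from collections import defaultdict


def build_adj(edges):
    """Adjacency sets for a simple undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def largest_connected(adj):
    """Vertex count of the largest connected component (iterative DFS)."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def largest_biconnected(adj):
    """Vertex count of the largest biconnected component (Hopcroft-Tarjan).

    Assumption: blocks with fewer than 3 vertices (bridges) are ignored.
    Recursive, so only suitable for small graphs as written.
    """
    disc, low, edge_stack = {}, {}, []
    best, counter = 0, 0

    def dfs(u, parent):
        nonlocal best, counter
        disc[u] = low[u] = counter
        counter += 1
        for v in adj[u]:
            if v not in disc:
                edge_stack.append((u, v))
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= disc[u]:
                    # u separates the subtree at v: pop one block.
                    block = set()
                    while True:
                        a, b = edge_stack.pop()
                        block.update((a, b))
                        if (a, b) == (u, v):
                            break
                    if len(block) >= 3:
                        best = max(best, len(block))
            elif v != parent and disc[v] < disc[u]:
                # Back edge to an ancestor.
                edge_stack.append((u, v))
                low[u] = min(low[u], disc[v])

    for u in adj:
        if u not in disc:
            dfs(u, None)
    return best


# Toy graph: a triangle, a bridge, a 4-cycle, and a separate edge.
toy = build_adj([(1, 2), (2, 3), (3, 1), (3, 4),
                 (4, 5), (5, 6), (6, 7), (7, 4), (8, 9)])
print(largest_connected(toy), largest_biconnected(toy))
```

For graphs at the scale shown in the tables, an iterative traversal or an established graph library would be a more practical choice than this recursive version.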