WGP Tomato EU-SOL meeting July 15, 2009 Antoine Janssen.


WGP Tomato
EU-SOL meeting, July 15, 2009

Antoine Janssen

Whole Genome Profiling: Overview

- Whole Genome Profiling: the concept
- PoP in Arabidopsis
- WGP melon
- Combining WGP and WGS
- WGP Tomato

Whole Genome Profiling:

Sequence-based physical mapping of BAC clones using the Illumina Genome Analyzer (Solexa)

Next-generation sequencing technologies have accelerated whole genome re-sequencing approaches and reduced their costs dramatically

but,

de novo construction of genomes in complex organisms is still costly

therefore,

An improved de novo draft genome sequencing strategy is needed taking full advantage of the power of next-generation sequencing

Whole Genome Profiling: The challenge

BAC libraries

- BACs with 125 kb average insert size, covering 5-20 genome equivalents (GE)

[Figure: BAC1 to BAC5 positioned along a chromosome]

Whole Genome Profiling

TTAA……ACTTAGTTAGCTTGGACTAACGAATTCGTAGGCATAGTGACTAGCATTG…..……TTAA

Restriction sites shown: MseI (TTAA), EcoRI (GAATTC), MseI (TTAA)

Whole Genome Profiling: Restriction fragments

Arabidopsis genome: 125 Mbp
6144 BACs (5 GE) in 384-well plates
Each Illumina GA lane:
• 768 BACs, ~ 3 M reads
Total: 8 lanes

Individual BAC target preparation is too time-consuming and costly.
Therefore: BAC 2D pooling.
Each pool is identified by a unique sample identification tag.

Pooling BAC clones

384 wells plate = 384 BACs

Row pools (first six shown):
R1 - CTACT
R2 - CAGGT
R3 - GCATC
R4 - TGCAG
R5 - TACTA
R6 - CCTAG

Column pools (last six shown):
C19 - TCTGT
C20 - AGACT
C21 - GAGTC
C22 - GTGCA
C23 - ATCAC
C24 - GTATC
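The 2D pooling idea can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used for WGP: it assumes the standard 384-well geometry (16 rows by 24 columns), and the helper names `pools_for_well` and `deconvolute` as well as the "exactly one row pool and one column pool" rule are assumptions made for the example.

```python
from typing import Optional, Set, Tuple

# Assumed plate geometry: a 384-well plate has 16 rows and 24 columns.
ROWS = [f"R{i}" for i in range(1, 17)]
COLS = [f"C{j}" for j in range(1, 25)]


def pools_for_well(row: int, col: int) -> Tuple[str, str]:
    """Return the (row pool, column pool) that the BAC in this well goes into."""
    return ROWS[row - 1], COLS[col - 1]


def deconvolute(tag_pools: Set[str]) -> Optional[Tuple[str, str]]:
    """Assign a tag to one well if it was seen in exactly one row pool
    and exactly one column pool; otherwise leave it unassigned."""
    rows = sorted(p for p in tag_pools if p.startswith("R"))
    cols = sorted(p for p in tag_pools if p.startswith("C"))
    if len(rows) == 1 and len(cols) == 1:
        return rows[0], cols[0]
    return None  # ambiguous (e.g. overlapping BACs, repeats) or incomplete


# 40 pooled samples per plate instead of 384 individual preparations
n_pools_per_plate = len(ROWS) + len(COLS)
example_well = deconvolute({"R3", "C19"})
ambiguous = deconvolute({"R3", "R4", "C19"})
```

A tag seen in pools R3 and C19 is assigned to the well at row 3, column 19; a tag seen in two row pools stays unassigned in this sketch.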

Whole Genome Profiling

Illumina Genome Analyzer

[Figure: wall of raw Illumina sequence reads; the motif CAATTC recurs throughout]

Whole Genome Profiling

Illumina sequence reads:
TCTGT CAATTC TAGTACCAAGCTTGCCATGA
TAAGG CAATTC GTTCCCGGGCCTTGTACACA
GTCGC CAATTC CATCCAATAAATAGCTCTAT
GCATC CAATTC TAGTACCAAGCTTGCCATGA
TATTA CAATTC AATTAGAAGAAATGATATTC

Whole Genome Profiling Sequence Tags

Each read consists of:
- sample identification tag (“barcode”)
- restriction site, part of the primer
- 20 base genome sequence tag flanking the RE site

Barcode GCATC = pool R3
Barcode TCTGT = pool C19
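The read structure above can be made concrete with a short Python sketch. The layout (5 nt barcode, CAATTC, 20 nt tag) is read off the example reads on the slide; the function name `parse_read`, the `POOLS` lookup and the pairing of TCTGT with pool C19 are assumptions for illustration, not the actual WGP software.

```python
BARCODE_LEN = 5    # e.g. GCATC, listed on the pooling slide as R3
SITE = "CAATTC"    # restriction-site remnant as it appears in the example reads
TAG_LEN = 20       # genome sequence tag flanking the restriction site

# R3 is taken from the slide; the C19 pairing is an assumption.
POOLS = {"GCATC": "R3", "TCTGT": "C19"}


def parse_read(read):
    """Split a read into (barcode, tag); return None if the layout does not fit."""
    read = read.replace(" ", "").upper()
    barcode = read[:BARCODE_LEN]
    site_end = BARCODE_LEN + len(SITE)
    site = read[BARCODE_LEN:site_end]
    tag = read[site_end:site_end + TAG_LEN]
    if site != SITE or len(tag) < TAG_LEN:
        return None
    return barcode, tag


reads = [
    "TCTGT CAATTC TAGTACCAAGCTTGCCATGA",
    "GCATC CAATTC TAGTACCAAGCTTGCCATGA",
    "TAAGG CAATTC GTTCCCGGGCCTTGTACACA",
]

# Collect, per tag, the pools in which it was observed.
tag_to_pools = {}
for r in reads:
    parsed = parse_read(r)
    if parsed is None:
        continue
    barcode, tag = parsed
    tag_to_pools.setdefault(tag, set()).add(POOLS.get(barcode, barcode))
```

In this toy input the tag TAGTACCAAGCTTGCCATGA is seen in one row pool and one column pool, which is the situation that lets a tag be assigned to a single BAC.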

70% of sequence 20-mer tags are unique in rice; > 85% in Arabidopsis

[Chart: fraction of unique tags (0% to 100%) versus tag length including RE site (0 to 30 bp); series EcoMse-At, PstMse-At, EcoMse-Os, PstMse-Os (At = Arabidopsis, Os = rice)]

Whole Genome Profiling: Impact of sequence tag length
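How tag length affects uniqueness can be explored on any reference sequence with a sketch like the one below. It is a simplified illustration: the function names are hypothetical, tags are read outward from each site on both strands, and "fraction unique" is taken here to mean the share of distinct tags that occur exactly once, which may differ from the definition behind the chart.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking_tags(genome, site="GAATTC", tag_len=26):
    """Collect tags of tag_len bases (site included) starting at every
    occurrence of the site, on both strands."""
    tags = []
    for strand in (genome, revcomp(genome)):
        start = strand.find(site)
        while start != -1:
            tag = strand[start:start + tag_len]
            if len(tag) == tag_len:      # skip tags truncated by the sequence end
                tags.append(tag)
            start = strand.find(site, start + 1)
    return tags


def fraction_unique(tags):
    """Share of distinct tags that occur exactly once (assumed definition)."""
    counts = Counter(tags)
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


toy_tags = flanking_tags("GAATTCAAGAATTCAA", tag_len=8)
toy_fraction = fraction_unique(toy_tags)
```

Running `fraction_unique(flanking_tags(genome, tag_len=n))` for a range of `n` would give one curve of the kind shown in the chart.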

FingerPrinted Contigs (FPC) assembly

[Figure labels: BAC1, BAC2]

Whole Genome Profiling: Assembly of physical BAC map using adapted FPC
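The principle behind tag-based contig building can be sketched as follows. This is not FPC: FPC uses a probability-based overlap score, whereas this toy version simply counts shared tags against a fixed threshold (`min_shared`, an illustrative parameter) and groups BACs by connected overlaps. All names are hypothetical.

```python
from itertools import combinations


def shared_tag_overlaps(bac_tags, min_shared=3):
    """Return {(bac_a, bac_b): n_shared} for BAC pairs sharing >= min_shared tags."""
    overlaps = {}
    for a, b in combinations(sorted(bac_tags), 2):
        n = len(bac_tags[a] & bac_tags[b])
        if n >= min_shared:
            overlaps[(a, b)] = n
    return overlaps


def build_contigs(bac_tags, min_shared=3):
    """Group BACs into contigs: connected components of the overlap graph."""
    parent = {b: b for b in bac_tags}

    def find(x):
        # Union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in shared_tag_overlaps(bac_tags, min_shared):
        parent[find(a)] = find(b)

    groups = {}
    for b in bac_tags:
        groups.setdefault(find(b), set()).add(b)
    return sorted(groups.values(), key=lambda g: sorted(g))


bac_tags = {
    "BAC1": {"t1", "t2", "t3", "t4"},
    "BAC2": {"t2", "t3", "t4", "t5"},
    "BAC3": {"t9"},
}
contigs = build_contigs(bac_tags)
```

Here BAC1 and BAC2 share three tags and end up in one contig, while BAC3 remains a singleton.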

Whole Genome Profiling: Whole Genome Arabidopsis

Arabidopsis genome: 125 Mbp
6144 BACs (6 GE) in 384-well plates
Each Illumina GA lane:
• 768 BACs, ~ 3 M reads
Total: 8 lanes

Whole Genome Profiling: Results 6 GE Arabidopsis

4599 BACs, 65,000 tags
234 contigs (2 - 125 BACs), 541 singletons
85% coverage

[Figure labels: FPC, BAC1, BAC2]

WGP Arabidopsis thaliana ecotype Columbia

- 6144 BACs (5 GE); WGP using one Illumina GA classic run
- 65,000 sequence tags
- Assembly of 4599 BACs (75%): 234 contigs (2 - 125 BACs/contig)

Validation on genome sequence by BLAST analysis of WGP sequence tags:

- 52,000 tags with 100% hits, covering 99% of the genome; max. gap 125 kbp
- 50,000 unique hits; average 2,355 bp between tags
- 86% of all EcoRI sites represented

Whole Genome Profiling: PoP Arabidopsis thaliana

[Matrix: an X marks each BAC in which a sequence tag was detected; one column per BAC, one row per tag]

BACs (columns): BAC852, BAC4124, BAC1373, BAC285, BAC2544, BAC704, BAC3536, BAC2070, BAC4237, BAC5328, BAC3912, BAC1461

Sequence                    Chrom  bp
GAATTCTCAGTCACCGTCGGCGTTTG  3      17059264
GAATTCCACCAGCTACGATACCAACT  3      17059284
GAATTCGAGAAACTCGGAGGAATCGA  3      17064605
GAATTCAGACTCTGGTAAGCTTTCTT  3      17064625
GAATTCTCTGTCTTCTATTCTTGCTG  3      17070747
GAATTCAAGAGTACCTTTCAAGGGAG  3      17070767
GAATTCCAGTGTATCCATTAGGCCCT  3      17075442
GAATTCCAAGTTCTTGTTGCAGCCAT  3      17075462
GAATTCAATCGAGTAAACTCTTCGCA  3      17078725
GAATTCTCCCTGAGGAACTATAATTG  3      17078745
GAATTCAGAAGAACCCTAGACTAAAT  3      17100591
GAATTCAATGCATTTTTGATTTTCCA  3      17100611
GAATTCTATCCCTAAGTGCTACAACA  3      17103098
GAATTCCATAAAGTTCTCGGATCACA  3      17103118
GAATTCGCTAGTTTTAAGATCATTAT  3      17107409
GAATTCGGATTTAAACGCGTTCTCGA  3      17107429
GAATTCAACACGGTATCAATGAACAA  3      17108386
GAATTCACGGTAATGTTGAGCTTGCA  3      17109561
GAATTCGGAGATGAATCTTTGGTTTC  3      17110126
GAATTCAGCATGGAAAAAGTGGTGCT  3      17117547
GAATTCACTAAATTAATCAAACCTCA  3      17117567
GAATTCATGGTTAATTTGTATAGATT  3      17121474
GAATTCTATGATACACTTATGTAGTT  3      17124404
GAATTCCTCTTGTCAAAAAATTTATC  3      17124424
GAATTCAGGTATTCGATGGTTAATTT  3      17124841
GAATTCTACACTACACTAATGAGGTC  3      17124861
GAATTCGCCACCAGAACTACTCAGGT  3      17125227
GAATTCAACACCAATAGTGGATTTAG  3      17125247
GAATTCGGTTTATTAATTATGGCAGC  3      17127568
GAATTCAGAATATACATTCCTTACTT  3      17127588
GAATTCCGTCAGTTGTGCACCCATCG  3      17129227
GAATTCCGCAGGAAACAGTGGTCCAG  3      17129392
GAATTCTACTATGGGTCCAACGTATG  3      17132377
GAATTCGTTTTCTACCTTACACATTC  3      17132397
GAATTCTTGATCGATATATAGACATG  3      17133709
GAATTCATAGAACCTCTAACAAATGT  3      17133729
GAATTCCATCAGATGTGCACCTTATG  3      17134538
GAATTCTAGCCGCATTTGATGATGCC  3      17134558
GAATTCCCCATAAACTAAGCATATAT  3      17145004
GAATTCCCAAAAGAGTAAGGAAAAAG  3      17145024
GAATTCGAATCCTTTTGTGCGGTTTC  3      17148314
GAATTCAACATGTGATCTTCATCTAA  3      17148334

BACs in order of their FPC result

Whole Genome Profiling: PoP Arabidopsis thaliana

450 Mbp estimated genome size

47,000 BACs (EcoRI and HindIII libraries) ~ 13 GE in total

Available for contig building:

- 5 GA runs

- 300 M reads

- 196,000 unique sequence tags

- 40,000 BACs (85%) uniquely tagged, average 33 tags/BAC

Whole Genome Profiling: WGP melon

WGP melon: results

- 549 contigs, 6416 singleton BACs
- Median 21 BACs / contig
- 78% genome coverage

Whole Genome Profiling: Contig size distribution, melon

[Histogram: # contigs versus contig size (# BACs/contig)]

Whole Genome Profiling

Combining WGP and WGS

Roche GS FLX Titanium and Illumina Genome Analyzer II

Whole Genome Profiling and Whole Genome Sequencing

GS FLX Titanium sequencing (15 X):
- 10 GS FLX Titanium random shotgun runs
- three 3-kb and four long-jump paired-end (p.e.) GS FLX Titanium runs

Illumina GA II paired-end sequencing (30 X):
- 500 bp, 2 kb and 10 kb

Status:
- GS and GA sequencing completed
- GS assembly completed
- GA assembly in progress

Whole Genome Profiling and Whole Genome Sequencing: WGS melon genome

Whole Genome Profiling and Whole Genome Sequencing: Combining WGP and WGS

[Figure: WGP BAC contigs carry WGP sequence tags at EcoRI sites, 2 - 3 kb apart; (paired-end) WGS contigs are built from 400 nt Titanium and 36 nt GA II reads]

Combining WGP and WGS

Advantages:

- WGP provides sequence-based anchor points for WGS
- Use WGP to create a high-resolution sequence-based physical BAC map, e.g. 10 X BAC library coverage
- Use WGS to generate (deep) coverage whole genome sequence
- Superior assembly: the WGP map contains far fewer contigs (549) than genomes sequenced by conventional random shotgun WGS strategies (tens of thousands) and produces more accurate maps than fingerprint-based physical mapping (PM)
- Cost reduction: no Sanger sequencing required
- Direct access to BAC clones in regions of interest
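The "anchor point" idea can be illustrated with a small sketch: each WGP tag is known to belong to a BAC contig, so a WGS contig containing that tag can be placed on the physical map. The code below is a simplified assumption of how such a lookup might work, using exact string matching on both strands; the function names, contig names and the toy data are hypothetical, and a real pipeline would use an aligner and handle conflicting anchors.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def anchor_contigs(wgs_contigs, tag_to_bac_contig):
    """Map each WGS contig name to the set of WGP BAC contigs whose tags
    occur in it (exact match, either strand)."""
    anchors = {}
    for name, seq in wgs_contigs.items():
        hits = set()
        for tag, bac_contig in tag_to_bac_contig.items():
            if tag in seq or revcomp(tag) in seq:
                hits.add(bac_contig)
        anchors[name] = hits
    return anchors


# Toy data: one 26-base tag assigned to a hypothetical BAC contig "ctg12".
tag = "GAATTCTCAGTCACCGTCGGCGTTTG"
tag_to_bac_contig = {tag: "ctg12"}
wgs_contigs = {
    "wgs1": "AAAA" + tag + "CCCC",           # tag on the forward strand
    "wgs2": "TT" + revcomp(tag) + "GG",      # tag on the reverse strand
    "wgs3": "ACGTACGTACGT",                  # no tag: stays unanchored
}
anchors = anchor_contigs(wgs_contigs, tag_to_bac_contig)
```

WGS contigs that receive an anchor inherit a position on the BAC map, which also gives direct access to the BAC clones covering that region.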

Whole Genome Profiling and Whole Genome Sequencing

WGP Tomato: Status

4 types of BAC libraries:
- HindIII: 15360 clones, 120 kbp insert
- EcoRI: 15360 clones, 120 kbp insert
- MboI: 15360 clones, 120 kbp insert
  20 pools; total 5.5 Gbp / 950 Mbp = 5.7 x
- Random sheared (Lucigen): 50688 clones, 90 kb insert
  16 pools; total 4.6 Gbp / 950 Mbp = 4.8 x

Total nr of clones: 96768, of which 92160 are analyzed (95%)
Approximately 85% of RE BACs deconvolutable
Approximately 60% of sheared BACs deconvolutable
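The library coverage figures follow from clones times insert size divided by genome size. A minimal sketch of that arithmetic, using the clone counts, insert sizes and 950 Mbp genome size from the slide (the variable and function names are illustrative):

```python
GENOME_MBP = 950  # tomato genome size used on the slide

# library name -> (number of clones, average insert size in kbp)
libraries = {
    "HindIII": (15360, 120),
    "EcoRI": (15360, 120),
    "MboI": (15360, 120),
    "Random sheared": (50688, 90),
}


def coverage(names):
    """Genome equivalents covered by the named libraries."""
    total_kbp = sum(libraries[n][0] * libraries[n][1] for n in names)
    return total_kbp / (GENOME_MBP * 1000)


re_cov = coverage(["HindIII", "EcoRI", "MboI"])      # about 5.8
sheared_cov = coverage(["Random sheared"])           # about 4.8
total_clones = sum(n for n, _ in libraries.values())
analyzed_fraction = 92160 / total_clones             # about 0.95
```

With these unrounded inputs the three restriction-enzyme libraries come out at about 5.8 x, slightly above the 5.7 x on the slide; the sheared library matches the quoted 4.8 x, and the four libraries sum to 96,768 clones.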

WGP Tomato: Comparison WGP results

                                      Tomato    Melon     Arabidopsis
genome size (Mbp)                     950       450       130
tag length (incl. restriction site)   26        26        26
nr BACs tested                        92,160    47,616    6,144
genome equivalents BACs tested        12.1      13.2      5.9
enzyme combination                    E/M       E/M       E/M
nr OK reads generated (M)             326.9     191       28.2
nr deconvolutable reads (M)           136.7     96.0      12.1
% deconvolutable reads                42%       50%       43%
nr unique tags                        336,258   181,254   65,734
nr tagged BACs (FPC ready)            67,742    39,935    4,599
% tagged BACs (FPC ready)             74%       84%       75%
average nr tags/BAC                   40        31        40
average nr reads/tag                  51        57        71
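The two percentage rows of the comparison can be recomputed from the count rows, which is a useful sanity check on the table. The sketch below assumes the three-column reading Tomato / Melon / Arabidopsis of the values above; the dictionary keys and the `pct` helper are illustrative names.

```python
# Values as read from the comparison table (reads in millions).
table = {
    "Tomato":      {"ok_reads": 326.9, "deconv_reads": 136.7,
                    "bacs_tested": 92160, "tagged_bacs": 67742},
    "Melon":       {"ok_reads": 191.0, "deconv_reads": 96.0,
                    "bacs_tested": 47616, "tagged_bacs": 39935},
    "Arabidopsis": {"ok_reads": 28.2, "deconv_reads": 12.1,
                    "bacs_tested": 6144, "tagged_bacs": 4599},
}


def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# species -> (% deconvolutable reads, % tagged BACs)
derived = {
    species: (
        pct(row["deconv_reads"], row["ok_reads"]),
        pct(row["tagged_bacs"], row["bacs_tested"]),
    )
    for species, row in table.items()
}
```

The recomputed values agree with the percentages listed in the table: 42% / 74% for tomato, 50% / 84% for melon and 43% / 75% for Arabidopsis.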

WGP Tomato: What next

- Finish last 5% (planned for next run)
- Contig building with FPC
- Deliver data

EU-SOL: Integrate with WGS data

Whole Genome Profiling and Whole Genome Sequencing: Thanks to

Amplicon Express: Robert Bogden, Keith Stormo, Quanzhou Tao

454 Life Sciences / Roche Applied Science: Jason Affourtit, Brian Desany, Hans Lunstroo

University of Udine: Michele Morgante

CBSG / EU SOL: Willem Stiekema, Roeland van Ham, René Klein Lankhorst

BioSeeds companies: Rijk Zwaan, Enza Zaden, Vilmorin & Cie, Takii & Co

Keygene N.V.:

Upstream Research / Applied Research: Marcel Prins, René Hofstede, Marjo de Ruiter, Anker Sørensen, Hein van der Poel, Richard Feron, Marjolijn Kelder, Martin Zevenbergen, Anita Bonné, Linda de Leeuw, Nathalie van Orsouw, Alberto Maurer, Esther Verstege, Marco van Schriek, Taco Jesse, Jeroen Rombout

Bio-informatics / ICT: Jan van Oeveren, Kornelis Stol, Antoine Janssen, Harold Verstegen

Contact: Hanne Volpin [email protected], Jifeng Tang [email protected]

Business Development: Jon Wittendorp, Herco van Liere, Mark van Haaren

Keygene N.V. owns patents and patent applications covering its Whole Genome technologies