Solanum lycopersicum Chromosome 4 Mapping and Finishing Update SRC-UK and Wellcome Trust Sanger Institute SOL Korea – September 2007 Wellcome Trust Medical Photographic Library

Page 1:

Solanum lycopersicum Chromosome 4

Mapping and Finishing Update

SRC-UK and Wellcome Trust Sanger Institute

SOL Korea – September 2007

Wellcome Trust Medical Photographic Library

Page 2:

Tomato Physical Map

Library   No. of clones   Average insert   Genome equivalents   Fingerprints
LE_HBa    129,024         117 kb           15 X                 88,000 (AGI)
SL_MboI   52,992          135 kb           7 X                  43,000 (WTSI)
SL_EcoI   72,264          95-100 kb        7 X

BACs are selected for sequencing on chromosome 4 using the physical map assembled in FPC.

The map has been assembled using fingerprinted clones from two BAC libraries. Extending and gap-filling clones are identified using end sequences. Candidate clones are fingerprinted and entered into FPC, and their overlaps are checked before they are selected for sequencing.

Tomato BAC libraries
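The genome-equivalents column can be approximated from the clone counts and average insert sizes. The sketch below is illustrative only: the genome size of 950 Mb is an assumption that is not stated on the slides, and the SL_EcoI insert is taken as the midpoint of the quoted 95-100 kb range.

```python
# Library statistics taken from the BAC library table.
LIBRARIES = {
    "LE_HBa": {"clones": 129_024, "insert_kb": 117.0},
    "SL_MboI": {"clones": 52_992, "insert_kb": 135.0},
    "SL_EcoI": {"clones": 72_264, "insert_kb": 97.5},  # midpoint of 95-100 kb
}

# ASSUMPTION: haploid genome size in Mb; this value is not given in the slides.
ASSUMED_GENOME_MB = 950.0


def genome_equivalents(clones, insert_kb, genome_mb=ASSUMED_GENOME_MB):
    """Return library coverage as multiples of the genome size."""
    total_insert_mb = clones * insert_kb / 1000.0  # kb -> Mb
    return total_insert_mb / genome_mb


for name, lib in LIBRARIES.items():
    coverage = genome_equivalents(lib["clones"], lib["insert_kb"])
    print(f"{name}: approx. {coverage:.1f} X")
```

With a different assumed genome size the figures shift proportionally, so the output should be read as a rough cross-check of the table rather than a reproduction of it.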

Page 3:

Map Coverage – Chromosome 4

Chromosome 4 is represented by 45 FPC contigs that cover approximately 22.2 Mb, estimated from fingerprints (5 bands/kb). Forty clones have been selected to extend the original contigs on the basis of clone end sequence matches.

All contigs are anchored to the chromosome by SGN chromosome 4 markers.

FISH (H. de Jong, Wageningen) has confirmed the placement of some contigs on chromosome 4, but may refute the placement of >= 7 contigs. Confirmation of chromosome 4 contigs is a high priority.

Of the 907 SGN chromosome 4 markers, 142 are missing from the current FPC build. Overgo probes are being used to screen the BAC libraries and may identify ~47 additional clones.

The Syngenta marker data will also be used to identify additional BACs.

Page 4:

FISH Data

Confirmation of chromosome location

Verification of contig and marker placement

Assessment of heterochromatin & euchromatin distribution

This image demonstrates:

– LE_HBa114C15 on short arm

– LE_HBa308B7 on heterochromatin/centromere border

– LE_HBa20F17 on long arm

FISH performed by S. B. Chang at Prof S. Stack’s Laboratory, University of Colorado, USA.

Page 5:

Chromosome 4 – Distribution of contigs

Figure: chromosome 4 with mapped markers and contigs ctg503, ctg15, ctg5716, ctg5014, ctg5252, ctg5711, ctg916, ctg1406, ctg1189 and ctg1795; FISH-confirmed contigs are marked.

This shows that clones for sequencing have been selected from seed contigs along the length of the chromosome, including those selected from putative heterochromatic regions to try to assess the boundary domains.

Page 6:

Distribution of Chromosome 4 Contigs

This shows that clones for sequencing have been selected from seed contigs along the length of the chromosome. The ten contigs shown are from the current 45 FPC contigs on chr4, including those selected from putative heterochromatic regions to try to assess the boundary domains.

Figure: Chr4 mapped markers and contigs, with the centromere and FISH-confirmed contigs marked and a key distinguishing euchromatin from heterochromatin.

Contigs: ctg503, ctg15, ctg5716, ctg5014, ctg5252, ctg5711, ctg916, ctg1406, ctg1189, ctg1795

Markers: TG485, T0635, T0954, T1322, CT_At5g37360, T1068, TG287, TG163, P41, P74

Analysed BAC and Number of gene models

BAC         Gene models
bTH8H22     4
bTH36C23    2
bTH50I18    3
bTH114C15   2
bTH308B7    0
bTH198L24   0
bTH31H5     1
bTH132O11   3
bTH53M2     5
bTH59M16    7

The number of gene models obtained from the gene prediction training set

Page 7:

Sequence Plot of ctg916 euchromatin

Page 8:

Sequence Plot of ctg5711 euchromatin

Page 9:

Sequence Plot of ctg15 (heterochromatic-euchromatic boundary region)

Same plot as before with greyscale adjusted to view repeat features

Page 10:

Sequence Plot of ctg5014 near centromere

Same plot as before with greyscale adjusted to view repeat features

Page 11:

TPF File

Tile Path Format file – tab delimited flat file

GAP        type-3            ?
?          LE_HBa-24G5       ctg145
CT990489   LE_HBa-20F17      ctg145
GAP        type-3            ?
CT990488   LE_HBa-114C15     ctg5716
?          SL_MboI-143K21    ctg5716
GAP        type-3            ?
?          LE_HBa-147F16     ctg5014
CT990558   LE_HBa-308B7      ctg5014
GAP        type-3            ?
CT990624   LE_HBa-27G19      ctg15
CT476825   LE_HBa-198L24     ctg15
CT573298   LE_HBa-119A16     ctg15
CT485992   LE_HBa-31H5       ctg15
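A tile path in this shape is straightforward to read programmatically. The following is a minimal sketch, assuming three tab-delimited columns that are read here as accession, clone name and FPC contig, with "?" for a missing value and GAP rows separating contigs. The column interpretation and the function name `parse_tpf` are assumptions made for illustration, not part of any official tool.

```python
from collections import Counter

# Rows as they appear to read on the slide, joined with tabs.
TPF_ROWS = [
    ("GAP", "type-3", "?"),
    ("?", "LE_HBa-24G5", "ctg145"),
    ("CT990489", "LE_HBa-20F17", "ctg145"),
    ("GAP", "type-3", "?"),
    ("CT990488", "LE_HBa-114C15", "ctg5716"),
    ("?", "SL_MboI-143K21", "ctg5716"),
    ("GAP", "type-3", "?"),
    ("?", "LE_HBa-147F16", "ctg5014"),
    ("CT990558", "LE_HBa-308B7", "ctg5014"),
    ("GAP", "type-3", "?"),
    ("CT990624", "LE_HBa-27G19", "ctg15"),
    ("CT476825", "LE_HBa-198L24", "ctg15"),
    ("CT573298", "LE_HBa-119A16", "ctg15"),
    ("CT485992", "LE_HBa-31H5", "ctg15"),
]
TPF_TEXT = "\n".join("\t".join(row) for row in TPF_ROWS)


def parse_tpf(text):
    """Parse tab-delimited tile path rows into dictionaries."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        first, second, third = line.split("\t")
        if first == "GAP":
            records.append({"kind": "gap", "gap_type": second})
        else:
            records.append({
                "kind": "clone",
                "accession": None if first == "?" else first,
                "clone": second,
                "contig": third,
            })
    return records


records = parse_tpf(TPF_TEXT)
clones = [r for r in records if r["kind"] == "clone"]
gaps = [r for r in records if r["kind"] == "gap"]
without_accession = [r["clone"] for r in clones if r["accession"] is None]
clones_per_contig = Counter(r["contig"] for r in clones)

print(f"{len(clones)} clones, {len(gaps)} gaps")
print("Clones without an accession:", without_accession)
print("Clones per contig:", dict(clones_per_contig))
```

Keeping gaps and clones in one ordered list preserves the tile order, which is what a downstream AGP builder would need.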

Page 12:

AGP File

Accessioned Golden Path – tab delimited flat file

chr4   1        50000    1    N   50000        clone    no
chr4   50001    100000   2    N   50000        clone    no
chr4   100001   150000   3    N   50000        contig   no
chr4   150001   200000   4    N   50000        clone    no
chr4   200001   360432   5    F   CT476825.1   1        160432   +
chr4   360433   370113   6    F   CT573298.1   2001     11681    +
chr4   370114   532277   7    F   CT485992.1   2001     164164   +
chr4   532278   582277   8    N   50000        contig   no
chr4   582278   632277   9    N   50000        clone    no
chr4   632278   682277   10   N   50000        contig   no

Gaps and unfinished clones are entered as 50,000 bp sections to more accurately represent the chromosome in each build

Order and alignment of Phase 3 finished accessions
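The AGP rows can be checked for internal consistency: each line's span on chr4 should equal either the gap length (N lines) or the component span (F lines), and each line should start where the previous one ended. The sketch below uses the ten rows from the slide; the helper names `parse_agp` and `check_agp` are hypothetical and only the columns visible on the slide are handled.

```python
# Rows as they appear to read on the slide (tab-delimited in the real file).
AGP_ROWS = [
    ("chr4", 1, 50000, 1, "N", 50000, "clone", "no"),
    ("chr4", 50001, 100000, 2, "N", 50000, "clone", "no"),
    ("chr4", 100001, 150000, 3, "N", 50000, "contig", "no"),
    ("chr4", 150001, 200000, 4, "N", 50000, "clone", "no"),
    ("chr4", 200001, 360432, 5, "F", "CT476825.1", 1, 160432, "+"),
    ("chr4", 360433, 370113, 6, "F", "CT573298.1", 2001, 11681, "+"),
    ("chr4", 370114, 532277, 7, "F", "CT485992.1", 2001, 164164, "+"),
    ("chr4", 532278, 582277, 8, "N", 50000, "contig", "no"),
    ("chr4", 582278, 632277, 9, "N", 50000, "clone", "no"),
    ("chr4", 632278, 682277, 10, "N", 50000, "contig", "no"),
]
AGP_TEXT = "\n".join("\t".join(str(v) for v in row) for row in AGP_ROWS)


def parse_agp(text):
    """Parse the AGP lines into dictionaries (gap lines vs component lines)."""
    records = []
    for line in text.splitlines():
        f = line.split("\t")
        rec = {"object": f[0], "start": int(f[1]), "end": int(f[2]),
               "part": int(f[3]), "type": f[4]}
        if rec["type"] == "N":
            rec["length"] = int(f[5])  # declared gap length
            rec["gap_type"] = f[6]
        else:
            rec["accession"] = f[5]
            rec["length"] = int(f[7]) - int(f[6]) + 1  # component span
            rec["orientation"] = f[8]
        records.append(rec)
    return records


def check_agp(records):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    expected_start = 1
    for rec in records:
        if rec["start"] != expected_start:
            problems.append(f"part {rec['part']}: starts at {rec['start']}, "
                            f"expected {expected_start}")
        span = rec["end"] - rec["start"] + 1
        if span != rec["length"]:
            problems.append(f"part {rec['part']}: object span {span} != "
                            f"component or gap length {rec['length']}")
        expected_start = rec["end"] + 1
    return problems


agp_records = parse_agp(AGP_TEXT)
agp_problems = check_agp(agp_records)
finished_bp = sum(r["length"] for r in agp_records if r["type"] == "F")
placeholder_bp = sum(r["length"] for r in agp_records if r["type"] == "N")
print("Problems:", agp_problems or "none")
print(f"Finished: {finished_bp} bp; 50 kb placeholders: {placeholder_bp} bp")
```

Separating placeholder length from finished length makes it clear how much of the build coordinate space is real sequence.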

Page 13:

AGP View on SGN

Page 14:

PseudoGoldenPath analysis for Contig Extension and Gap Closure

A PGP viewer is being developed to visualise sequence alignments and contig positioning

Contains finished and unfinished sequence

Unfinished clones are represented as sequence contigs

Unmasked BES are aligned to the PGP sequence using ssaha2

Parameters: e.g. minimum percentage identity of 95%, with at least 60% of the end sequence found

Map gaps are assigned an arbitrary 5 kb size

Clone candidates for contig extension are checked with BLAST and fingerprinted

Aim to incorporate other data such as markers
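The example thresholds quoted above (95% identity, 60% of the end sequence found) amount to a simple filter over alignment records. This is a minimal sketch assuming each hit has already been reduced to a read length, an aligned length and a percentage identity. The `BesHit` record, its field names and the example reads are hypothetical and do not reflect ssaha2's actual output format.

```python
from dataclasses import dataclass


@dataclass
class BesHit:
    """One BAC end sequence alignment (hypothetical, simplified record)."""
    read: str
    read_length: int         # length of the end sequence in bp
    aligned_length: int      # bases of the end sequence in the alignment
    percent_identity: float  # identity over the aligned region


def passes_thresholds(hit, min_identity=95.0, min_fraction=0.60):
    """Keep hits with enough identity and enough of the read aligned."""
    if hit.read_length <= 0:
        return False
    fraction_found = hit.aligned_length / hit.read_length
    return hit.percent_identity >= min_identity and fraction_found >= min_fraction


# Illustrative values only; these are not real alignments.
example_hits = [
    BesHit("readA", 700, 650, 99.1),  # passes both thresholds
    BesHit("readB", 700, 300, 99.5),  # too little of the read aligned
    BesHit("readC", 700, 680, 91.0),  # identity too low
]

kept = [hit.read for hit in example_hits if passes_thresholds(hit)]
print("Hits kept:", kept)
```

Applying both thresholds together is what removes short repeat-driven matches, which may have high identity but cover little of the read.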

Page 15:

Closing the Map using PGP

MAP GAP

Bridging clones identified from BES alignments to sequence

Sequenced clones

53 clone extensions have been identified, including 5 merges with previously unplaced contigs. Two merges of chromosome 4 contigs have also been made.
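One way to picture how bridging clones fall out of the BES alignments: a clone whose two end sequences land in different sequenced contigs is a candidate to span the gap between them. The sketch below assumes one accepted hit per clone end, labelled "F" and "R". The labels, clone names and contig names are all hypothetical, and this is not presented as the method used by the PGP viewer.

```python
from collections import defaultdict


def find_bridging_clones(end_hits):
    """Return {clone: (contig_of_F_end, contig_of_R_end)} for bridging candidates.

    end_hits is an iterable of (clone, end_label, contig) tuples, where
    end_label is "F" or "R". A later hit for the same clone end replaces
    an earlier one, so one accepted hit per end is assumed.
    """
    placements = defaultdict(dict)
    for clone, end_label, contig in end_hits:
        placements[clone][end_label] = contig

    bridges = {}
    for clone, ends in placements.items():
        both_ends_placed = set(ends) == {"F", "R"}
        if both_ends_placed and ends["F"] != ends["R"]:
            bridges[clone] = (ends["F"], ends["R"])
    return bridges


# Illustrative input only.
example_end_hits = [
    ("cloneA", "F", "contig1"),
    ("cloneA", "R", "contig2"),  # ends in different contigs: bridges a gap
    ("cloneB", "F", "contig1"),
    ("cloneB", "R", "contig1"),  # both ends in the same contig
    ("cloneC", "F", "contig2"),  # only one end placed: possible extender
]

bridging = find_bridging_clones(example_end_hits)
print(bridging)
```

Clones with a single placed end, such as cloneC here, would be handled separately as potential extenders and, as the slides note, confirmed by BLAST and fingerprinting.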

Page 16:

Extender from Fosmid Library

Fosmid end sequences deposited by Cornell have been aligned to chromosome 4 sequence

A copy of the fosmid library has been received at WTSI; ~50,000 clones will be end sequenced by December and the sequences deposited in the Ensembl / NCBI Trace repositories

Potential Extender

Page 17:

WTSI Tomato Clone Pipeline

Pipeline Stage       Number of BACs
Subcloning           34
Shotgun              21
Assembly Start       7
Auto-prefinishing    3
Finishing            11
QC Checking          4
Finished             63
Total                143

HTGS: Phase 1, Phase 2, Phase 3
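The pipeline counts can be cross-checked against the stated total with a few lines of arithmetic. The sketch below only restates the figures from the table; it does not assign stages to HTGS phases, since that mapping is not recoverable from the slide text.

```python
# Stage counts as listed in the pipeline table.
PIPELINE = [
    ("Subcloning", 34),
    ("Shotgun", 21),
    ("Assembly Start", 7),
    ("Auto-prefinishing", 3),
    ("Finishing", 11),
    ("QC Checking", 4),
    ("Finished", 63),
]
STATED_TOTAL = 143

computed_total = sum(count for _, count in PIPELINE)
finished_count = dict(PIPELINE)["Finished"]
finished_share = finished_count / computed_total

print(f"Computed total: {computed_total} (stated: {STATED_TOTAL})")
print(f"Finished BACs: {finished_share:.1%} of clones in the pipeline")
```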

Page 18:

Chromosome 4 Sequence Generated

Total Sequence Available 10,666,227 bp

Total Unique Sequence 10,633,995 bp

Total Finished Sequence 7,543,322 bp
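These totals imply a small amount of redundant (overlapping) sequence and a finished fraction of roughly 70%. The sketch below works through that arithmetic and also compares the unique sequence with the ~22.2 Mb map estimate quoted earlier in the talk. That last ratio is only indicative, because the map size is itself an estimate from fingerprints.

```python
TOTAL_AVAILABLE_BP = 10_666_227
TOTAL_UNIQUE_BP = 10_633_995
TOTAL_FINISHED_BP = 7_543_322
MAP_ESTIMATE_BP = 22_200_000  # ~22.2 Mb FPC contig coverage (approximate)

redundant_bp = TOTAL_AVAILABLE_BP - TOTAL_UNIQUE_BP
finished_fraction = TOTAL_FINISHED_BP / TOTAL_UNIQUE_BP
map_fraction = TOTAL_UNIQUE_BP / MAP_ESTIMATE_BP

print(f"Redundant sequence: {redundant_bp:,} bp")
print(f"Finished share of unique sequence: {finished_fraction:.1%}")
print(f"Unique sequence relative to map estimate: {map_fraction:.1%}")
```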

Page 19:

Summary of Progress on Chromosome 4

45 map contigs have been built on chromosome 4

Clone end sequence alignments visualised with the PGP viewer are being used to extend contigs and close gaps

~100,000 fosmid end sequences will be generated by end 2007

10.6 Mb of sequence has been generated, of which 7.5 Mb is finished

All sequence assemblies >2 kb are deposited in HTGS divisions of EMBL/GenBank/DDBJ

Page 20:

Acknowledgements

Wellcome Trust Sanger Institute: Jane Rogers, Sean Humphray, Clare Riddle and Mapping Core Group, Karen McLaren and Finishing Team 46, Stuart McLaren and Pre-finishing Team 58, Christine Lloyd and QC Team 57, Karen Oliver, Matt Jones, Carol Scott

Imperial College London: Gerard Bishop, Daniel Buchan, James Abbott, Sarah Butcher

University of Nottingham: Graham Seymour

Scottish Crop Research Institute: Glenn Bryan

Cornell University: Lukas Mueller, Jim Giovannoni

MIPS/IBI Institute for Bioinformatics: Klaus Mayer, Remy Bruggmann

FISH Resources: Stephen Stack Group (Colorado), Hans de Jong (Wageningen)

FUNDING