Maize Production Sequencing
Maize Production Goals
BAC End Sequencing of 220,000 Clones
Fosmid End Sequencing of 500,000 Clones
Shotgun of 16,000 BAC Clones
Maize BAC End Sequences
580,000 reads processed
567 average read length
60% success
Maize Fosmid End Sequences
850,000 processed
79% success
543 average read length
Completed today
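As a rough illustration of what the reported rates imply, the sketch below converts reads processed and success rate into an approximate count of successful reads. It assumes "success" is a per-read pass rate, which the slides do not state explicitly; the function name is hypothetical.

```python
# Reads processed and success rates as reported on the slides.
reported = {
    "BAC end": {"processed": 580_000, "success_rate": 0.60},
    "Fosmid end": {"processed": 850_000, "success_rate": 0.79},
}


def successful_reads(processed: int, success_rate: float) -> int:
    """Approximate successful reads (assumes 'success' is a per-read rate)."""
    return round(processed * success_rate)


for library, stats in reported.items():
    print(f"{library}: about {successful_reads(**stats):,} successful reads")
```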
Library Construction Pipeline
Receipt of sheared DNA from AGI
Size selection of insert DNA
Ligation into pSMART vector
Constructed 17,034 Libraries as of August 31st
[Figure: Maize clones shipped and libraries constructed. X-axis: date (month-year), Oct-05 to Aug-07. Y-axis: number of clones shipped and libraries constructed, 0 to 2800. Series: Clones Shipped; Libraries Constructed.]
[Figure: Library Construction Pass Rate. X-axis: month-year, Nov-05 to Aug-07. Y-axis: libraries constructed (pass-fail), 0 to 1800. Series: Libraries Constructed; Libraries Failed.]
Average Fail Rate for Library Construction was less than 5%
Shotgun Criteria
3.5X coverage
Clone size verification
50% paired ends
BES agreement
25% of clones failed
22% need more data
3% BES disagreement
Shotgun Complete for 12,211 Clones as of August 31st
[Figure: Maize clones shotgun completed. X-axis: month, Nov-05 to Aug-07. Y-axis: number of clones, 0 to 1800. Bar labels as extracted: 95, 61, 113, 119, 226, 484, 459, 436, 360, 418, 357, 279, 681, 774, 577, 1052, 830, 1082, 856, 778, 882, 1197.]
Final Production Work
660 Clones Need Library Construction
2100 Clones In Production Pipeline
Expected Completion Date December 2007
Sequence Improvement
Bob Fulton
Dick McCombie
Rod Wing
Sequence Improvement Pipeline
• Shotgun_done triggers the prefinishing pipeline
• Initial identification of “do finish” regions
• Manual sorting and use of autoedit (Gordon) to break apart misassemblies
• Autofinish (Gordon) used to choose directed reactions for all gaps and regions of low quality in “do finish” regions
• Reassembly and 2nd iteration of the prefinishing pipeline
• Final identification of “do finish” regions and handoff to the finishing pipeline
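The order of the steps above can be sketched as a small status-driven loop. This is only an illustrative model: the step names, the `Clone` class and `run_prefinishing` are hypothetical, and the sketch assumes that reassembly sits between the two iterations. The status names `shotgun_done` and `prefin_done` are the ones used in the pipeline statistics.

```python
from dataclasses import dataclass, field

# Steps of one prefinishing pass, paraphrased from the bullets (hypothetical names).
PASS_STEPS = [
    "identify_do_finish_regions",
    "sort_and_break_misassemblies",  # manual sorting plus autoedit
    "choose_directed_reactions",     # autofinish: gaps and low-quality regions
]


@dataclass
class Clone:
    name: str
    status: str = "shotgun_done"
    log: list = field(default_factory=list)


def run_prefinishing(clone: Clone, iterations: int = 2) -> Clone:
    """Record the prefinishing steps for a clone, then mark it prefin_done."""
    if clone.status != "shotgun_done":
        raise ValueError(f"{clone.name}: prefinishing is triggered by shotgun_done")
    for i in range(1, iterations + 1):
        for step in PASS_STEPS:
            clone.log.append((i, step))
        if i < iterations:
            # Assumed placement: reassemble before starting the next pass.
            clone.log.append((i, "reassemble"))
    clone.log.append((iterations, "final_do_finish_identification"))
    clone.status = "prefin_done"  # handoff to the finishing pipeline
    return clone


clone = run_prefinishing(Clone("CH201-132J17"))
for iteration, step in clone.log:
    print(iteration, step)
```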
[Figure: Clone Improvement through the Prefinishing Pipeline. X-axis: contigs per clone, binned (1-5, 6-10, 11-15, 16-20, 21-25, 26-30, 31-35, 35-40, 40+ ctg). Y-axis scale: 0 to 800. Series: before prefinish; after prefinish.]
[Figures: Assembly View-Entire Clone and Assembly View-Do Finish Region. Labels shown: End; Spanning Plasmids; Coverage (green); Repeat Tags; Do Finish; GSS sequence; EST sequence; Alignment with cDNA read pairs; Alignment with End Sequences.]
[Figure: Pipeline stats across time. X-axis: weeks, 12 to 97. Y-axis: number of clones, 0 to 16000. Series: library_done; shotgun_done; prefin_done; finished. Legend also distinguishes Actual and Projected.]
Month   Group  GSC  GSC Cumulative  AGI  AGI Cumulative  CSHL  CSHL Cumulative  Total  Total Cumulative
Sep-07  17     220  2029            160  1720            300   1757             680    5506
Oct-07  18     250  2279            160  1880            325   2082             735    6241
Nov-07  19     300  2579            160  2040            350   2432             810    7051
Dec-07  20     350  2929            160  2200            375   2807             885    7936
Jan-08  21     400  3329            160  2360            400   3207             960    8896
Feb-08  22     400  3729            160  2520            400   3607             960    9856
Mar-08  23     400  4129            160  2680            400   4007             960    10816
Apr-08  24     400  4529            160  2840            400   4407             960    11776
May-08  25     400  4929            160  3000            400   4807             960    12736
Jun-08  26     400  5329            0    3000            400   5207             800    13536
Jul-08  27     400  5729            0    3000            400   5607             800    14336
Aug-08  28     400  6129            0    3000            400   6007             800    15136
Sep-08  29     300  6429            0    3000            250   6257             550    15686
Oct-08  30     71   6500            0    3000            243   6500             314    16000
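The cumulative rows in the schedule above follow from the monthly figures by a running sum. The sketch below recomputes them as a consistency check. The monthly values and the Sep-07 cumulative values are copied from the schedule; the variable and function names are illustrative only.

```python
from itertools import accumulate

# Monthly figures per center, Sep-07 through Oct-08, copied from the schedule.
monthly = {
    "GSC": [220, 250, 300, 350, 400, 400, 400, 400, 400, 400, 400, 400, 300, 71],
    "AGI": [160] * 9 + [0] * 5,
    "CSHL": [300, 325, 350, 375, 400, 400, 400, 400, 400, 400, 400, 400, 250, 243],
}
# Cumulative value reported for Sep-07, used to recover the starting balance.
first_cumulative = {"GSC": 2029, "AGI": 1720, "CSHL": 1757}


def cumulative(center: str) -> list[int]:
    """Running total per month, starting from the balance before Sep-07."""
    start = first_cumulative[center] - monthly[center][0]
    return [start + running for running in accumulate(monthly[center])]


# Sum the three centers month by month to get the "Total Cumulative" row.
total_cumulative = [sum(vals) for vals in zip(*(cumulative(c) for c in monthly))]

for center in monthly:
    print(center, cumulative(center)[-1])
print("Total", total_cumulative[-1])
```

Under these inputs the final month reproduces the per-center targets and the overall 16,000-clone total listed in the schedule.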
Maize GenBank Submissions
Joanne Nelson
Submission Landmarks
HTGS_FULLTOP
HTGS_PREFIN
HTGS_ACTIVEFIN
HTGS_IMPROVED
Improved Sequence
“Non-repetitive portions of the sequence have had sequence improvement (directed attempts) and have been labeled as ‘improved.’ Improved regions are double stranded, sequenced with an alternate chemistry or covered by high quality data (i.e. phred quality greater than or equal to 30 or approval by an experienced finisher), unless otherwise noted. Regions of low sequence complexity (such as dinucleotide repeats and small unit tandem repeats) in the improved regions have not been resolved to previously established finishing standards. BAC end sequence, cot and methyl filtered genome survey sequence and data from overlapping projects of strain B73 may have been included in this project. Where possible, contigs have been ordered and oriented based on read pairing. These regions are designated as scaffolds. Additional order and orientation will be provided upon completion of detailed analysis of the complete finished tiling path.”
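For context on the "phred quality greater than or equal to 30" threshold in the statement above: a phred score Q corresponds to a base-call error probability of 10^(-Q/10), so Q30 means roughly one expected error per 1,000 bases. A minimal sketch of that conversion follows; the function names are illustrative and not part of any pipeline described here.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Error probability implied by a phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Phred quality score implied by an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


for q in (20, 30, 40):
    print(f"Q{q}: error probability {phred_to_error_probability(q):.4g}")
```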
Improved Sequence
FEATURES             Location/Qualifiers
     source          1..184604
                     /organism="Zea mays"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:4577"
                     /chromosome="1"
                     /clone="CH201-132J17; ZMMBBc0132J17"
     misc_feature    1..69252
                     /note="scaffold_name:Scaffold1"
     misc_feature    1..34245
                     /note="assembly_name:Contig28 vector_side:SP6"
     misc_feature    32401..34245
                     /note="Improved sequence."
     unsure          34230..34245
                     /note="Non-repetitive but unresolved region"
     gap             34246..34345
                     /estimated_length=unknown
     misc_feature    34346..68071
                     /note="assembly_name:Contig27"
     misc_feature    34346..36695
                     /note="Improved sequence."
     unsure          34346..34356
                     /note="Non-repetitive but unresolved region"
     misc_feature    38146..46795
                     /note="Improved sequence."
     gap             68072..68171
                     /estimated_length=unknown
     misc_feature    68172..69252
                     /note="assembly_name:Contig14"
     gap             69253..69352
                     /estimated_length=unknown
     misc_feature    69353..132243
                     /note="scaffold_name:Scaffold2"
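Feature locations in a record like the one above are 1-based and inclusive, so a span `start..end` covers `end - start + 1` bases. The sketch below applies that to the improved regions and gaps listed in the excerpt. The tuples are hand-copied from the record, and `span_length` is a hypothetical helper rather than part of any GenBank tooling.

```python
import re

# (feature key, location, note) hand-copied from the excerpt.
features = [
    ("misc_feature", "32401..34245", "Improved sequence."),
    ("gap", "34246..34345", None),
    ("misc_feature", "34346..36695", "Improved sequence."),
    ("misc_feature", "38146..46795", "Improved sequence."),
    ("gap", "68072..68171", None),
    ("gap", "69253..69352", None),
]


def span_length(location: str) -> int:
    """Length in bases of a simple 'start..end' location (1-based, inclusive)."""
    match = re.fullmatch(r"(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    start, end = map(int, match.groups())
    return end - start + 1


improved_bases = sum(
    span_length(loc) for _, loc, note in features if note == "Improved sequence."
)
gap_lengths = [span_length(loc) for key, loc, _ in features if key == "gap"]

print("improved bases in excerpt:", improved_bases)
print("gap placeholder lengths:", gap_lengths)
```

Each gap spans 100 positions even though its `/estimated_length` is `unknown`, which suggests the span is a fixed-size placeholder rather than a measured size.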
Submission Totals
HTGS_FULLTOP 3342
HTGS_PREFIN 2014
HTGS_ACTIVEFIN 4151
HTGS_IMPROVED 2660
TOTAL 12167
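As a quick check, the four landmark counts can be summed and expressed as shares of the total. The counts are taken from the slide; the rest of the sketch is illustrative.

```python
# Submission counts per landmark, as listed on the slide.
submissions = {
    "HTGS_FULLTOP": 3342,
    "HTGS_PREFIN": 2014,
    "HTGS_ACTIVEFIN": 4151,
    "HTGS_IMPROVED": 2660,
}

total = sum(submissions.values())
# Share of all submissions at each landmark, in percent.
shares = {name: 100 * count / total for name, count in submissions.items()}

print("total:", total)
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

The sum matches the reported total of 12167.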