Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Page 1: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Double-Ended Shotgun Sequencing of PA14

Daniel G. Lee

10/30/02

Page 2: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Determination of PA14 Genomic Sequence and Whole-Genome Alignment with PAO1

• The complete genome of a related P. aeruginosa strain, PAO1, has been determined. The genome size is 6.2 Mb.

• PAO1 is less virulent than PA14 in almost all of our model hosts.

• PA14 contains additional DNA (sometimes large islands of DNA) not found in PAO1. Some of these additional genes may be responsible for the enhanced virulence of PA14.

• A complete PA14 genomic sequence will allow us to:

– Identify all DNA differences between PA14 and PAO1 (and later evaluate their contribution to virulence).

– Simplify the bioinformatics component of the PA14 Unigene library.

– Design a microarray (whole-genome or PA14-specific).

Page 3: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

PA14 Sequencing - Outline

1. Sequencing workflow.

2. Finishing.

3. Annotation and whole-genome alignment.

4. Integration with PA14 insertion library.

5. Requirements for publication.

Page 4: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

PA14 Genomic DNA Prep

1. Original PA14 RifR isolate from LGR.

2. 500 ml culture.

3. Alkaline lysis, with:

a) CTAB ppt.

b) 2 x Chloroform/Isoamyl Alcohol extraction.

c) 3 x Phenol extraction.

d) 1 x Phenol/Chloroform/Isoamyl Alcohol extraction.

e) 1 x Chloroform/Isoamyl Alcohol extraction.

4. Isoamyl alcohol ppt.

5. Resuspended in 5 ml TE at 1.174 mg/ml (5.87 mg total).

Page 5: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Workflow for Double-Ended Shotgun Sequencing of PA14

1. PA14 genomic DNA prep.

2. Shear PA14 DNA and size fractionate.

3. Ligate PA14 fragments into vector and transform E. coli.

4. Plasmid preps of PA14 library.

5. Linear amplification of inserts using dideoxy terminators.

6. Sequencing of amplification products.

7. Contig assembly (PHRED and PHRAP).

8. Genome-wide alignment with PAO1:

• order contigs.

• identify gaps for sequence finishing.

• identify differences between PA14 and PAO1.

9. Finishing and annotation.

Page 6: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Plasmid Library Construction

1. Shear DNA using nitrogen (cleavage more random than sonication).

2. Fill-in to produce blunt ends.

3. Size fractionate on low-melt agarose gel.

• 1-3 kb fragments (700 bp).

• 3-7 kb fragments

4. Ligate.

5. Transform.

6. Pick colonies.

Page 7: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Plasmid Preps

1. O/N cultures in 96-well plates.

2. Freeze cell pellets.

3. Alkaline-lysis mini-preps in 96-well plates.

• 604/650 plates done.

4. Dry DNA pellets O/N.

5. Resuspend DNA in H2O.

6. Transfer to 384-well plate.

• QC by agarose gel.

Page 8: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Sequencing Reactions

1. Set up reaction mix:

• Labelled ddNTPs.

• dNTPs

• Buffer

• Taq

• Forward or reverse sequencing primer.

2. Aliquot rxn mix to 384-well PCR plate; freeze.

3. Add 3 µl DNA to each well (or 3 µl vector for “PCR control”).

4. “PCR”

Page 9: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

DNA Sequencing

1. EtOH ppt. PCR reactions.

2. Dry.

3. Resuspend in H2O.

4. Add previously characterized PCR reactions as “sequencing controls”.

5. ABI Prism sequencers (liquid polymer capillary sequencer, 96 reactions at a time).

Page 10: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

• ABI sequencer outputs electropherograms.

• PHRED determines identity of base as well as quality score.

Page 11: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Contig Assembly

1. Electropherograms.

2. PHRED - determines base identity and quality score for each position.

3. PHRAP - aligns sequences to assemble contigs, determines consensus sequence and quality score for each position.
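The quality score attached to each base can be made concrete with a short sketch. A Phred-style score Q relates to the estimated error probability P by Q = -10 log10(P), so Q20 corresponds to a 1-in-100 chance of a wrong call and Q30 to 1 in 1,000. The `consensus_column` function below is a toy illustration only (a hypothetical helper, not PHRAP's actual algorithm): it picks the base with the largest summed quality in one aligned column.

```python
import math

def phred_to_error_prob(q):
    """Error probability implied by a Phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred quality score for an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)

def consensus_column(calls):
    """Toy consensus for one aligned column (NOT PHRAP's real method).

    `calls` is a list of (base, quality) pairs, one per read covering the
    column. The base with the largest summed quality wins; on a tie the
    base seen first wins. Gaps are not treated specially.
    """
    totals = {}
    for base, quality in calls:
        totals[base] = totals.get(base, 0) + quality
    return max(totals, key=totals.get)

if __name__ == "__main__":
    print(phred_to_error_prob(20))   # error probability for Q20
    print(error_prob_to_phred(0.001))  # quality score for a 1-in-1,000 error
    # Two moderate-quality 'A' calls outweigh one higher-quality 'C' call.
    print(consensus_column([("A", 30), ("A", 20), ("C", 40)]))
```

Summing qualities is only one simple way to weigh conflicting reads; it is shown here to illustrate why per-base quality scores matter during assembly.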

Page 12: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Contig Assembly

ccg-aatt-cggctttacg

aatgccggcattacg

tt--ggc-ttacgaccctttg-ggt

t--ggc-ttacg--gactaggggtacca

CCG-AATTCCGGCTTTACGACGACTTGGGGTACCA

Page 13: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

PA14 Sequencing: Current Status (as of 10/17/02)

1. Total amount of sequence: ~6 Mb (6.5X coverage).

• 72,000 sequences (36,000 clones).

• 390 (out of ~650) 96-well plates sequenced.

• 604 plates mini-prepped.

2. Total number of “contigs”: < 2000?

• 1 contig ~44 kb.

• 1 contig ~35 kb.

• ~10 contigs > 25 kb.

• ~12 contigs 20-25 kb.

• Most contigs are 5-10 kb.

• Library consists of ~1 kb inserts (current plans to introduce a library of 3-6 kb inserts).

As of 10/21/02, one contig 73 kb, many > 50 kb.
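The coverage figure can be sanity-checked with simple arithmetic. The sketch below assumes that coverage means total read bases divided by genome size, and that the PA14 genome is about the same size as PAO1 (6.2 Mb); the mean read length is not given on the slides, so it is back-calculated here rather than quoted. The gap estimates use the idealized Lander-Waterman model (uniformly random reads, no minimum overlap), which ignores cloning bias and repeats, so real assemblies will look worse than these numbers suggest.

```python
import math

# Figures taken from the slides; the PA14 genome size is an ASSUMPTION
# (set equal to the published PAO1 size of 6.2 Mb).
GENOME_SIZE_BP = 6_200_000
NUM_READS = 72_000
COVERAGE = 6.5

def implied_mean_read_length(coverage, num_reads, genome_size):
    """Mean read length implied by coverage = reads * length / genome size."""
    return coverage * genome_size / num_reads

def expected_fraction_uncovered(coverage):
    """Lander-Waterman: probability that a given base is covered by no read."""
    return math.exp(-coverage)

def expected_contig_count(num_reads, coverage):
    """Lander-Waterman: expected number of contigs, ignoring minimum overlap."""
    return num_reads * math.exp(-coverage)

if __name__ == "__main__":
    read_len = implied_mean_read_length(COVERAGE, NUM_READS, GENOME_SIZE_BP)
    print(f"implied mean read length: {read_len:.0f} bp")
    print(f"expected uncovered fraction: {expected_fraction_uncovered(COVERAGE):.4%}")
    print(f"expected contigs (idealized): {expected_contig_count(NUM_READS, COVERAGE):.0f}")
```

Under these assumptions the implied mean read length is roughly 560 bp.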

Page 14: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Workflow for Double-Ended Shotgun Sequencing of PA14

(Workflow diagram repeated from Page 5.)

Page 15: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Comparisons of PAO1 and PA14

A: gaps in regions corresponding to PAO1 sequence.

B: gaps in PA14-specific regions.

Page 16: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Tools for Genome-Wide Alignments of PAO1 and PA14

1. Software packages available from TIGR for alignments.

• MUMmer 2.1 - aligns MUMs (maximal unique matches) for two input sequences (two 3-4 Mb genomes aligned in under 30 seconds, using less than 100 MB of memory, on a typical desktop computer running Unix/Linux).

• NUCmer - alignments of highly similar sequences that may have large rearrangements (e.g., a group of assembly contigs vs. a complete genome).

• PROmer - amino acid translation in all 6 frames for protein/peptide alignments. Useful for comparative genome annotation.

2. DisplayMUMs for graphical analysis of MUMmer output.
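To make the idea of a maximal unique match concrete, here is a brute-force sketch for short sequences. A MUM is a substring that occurs exactly once in each sequence and cannot be extended in either direction without a mismatch. MUMmer finds these efficiently with a suffix tree; the quadratic scan below is only an illustration of the definition, works on the forward strand only, and uses hypothetical function names.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count

def find_mums(a, b, min_len=1):
    """Return (pos_in_a, pos_in_b, length) for every maximal unique match.

    Brute force for illustration only; not how MUMmer is implemented.
    """
    mums = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip starts that could be extended to the left.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right as far as the sequences agree.
            length = 0
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            if length < min_len:
                continue
            match = a[i:i + length]
            # Keep the match only if it is unique in both sequences.
            if count_overlapping(a, match) == 1 and count_overlapping(b, match) == 1:
                mums.append((i, j, length))
    return mums

if __name__ == "__main__":
    print(find_mums("GATTACAGGG", "CCCGATTACATTT", min_len=5))
```

The `min_len` cutoff plays the same role as a minimum match length in a real aligner: it discards short matches that are likely to occur by chance.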

Page 17: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Tools for Annotation of PA14

1. PROmer - amino acid translation in all 6 frames for protein/peptide alignments. Useful for comparative genome annotation.

2. Jonathan’s automated suite of annotation tools (Hrp project)
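Six-frame translation can be sketched in a few lines. The example below translates a DNA sequence in three forward frames and three frames of the reverse complement, using the standard genetic code. It illustrates the concept only and is not PROmer's implementation; the function names are hypothetical, codons containing anything other than A, C, G, or T are translated as "X", and stop codons appear as "*".

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence; unknown bases become N."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))

def translate(seq):
    """Translate complete codons from the start of seq; unknown codons give X."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[pos:pos + 3], "X")
        for pos in range(0, len(seq) - 2, 3)
    )

def six_frame_translation(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to peptides."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames

if __name__ == "__main__":
    for frame, peptide in six_frame_translation("ATGGCC").items():
        print(frame, peptide)
```

Comparing sequences at the protein level in all six frames is useful when nucleotide sequences have diverged but the encoded proteins are still recognizably similar.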

Page 18: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Approaches for Finishing

1. PCR amplification and directed sequencing of gapped regions.

2. Isolation of cosmid clones spanning gaps, subcloning, sequencing of subclones (using universal primers).

3. (Direct genomic sequencing).

4. (Altering sequencing reaction conditions for regions that are difficult to sequence through).

Page 19: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Finishing

Figure: gaps of type A and type B (defined on Page 15).

Methods:

• PCR.

• Cosmids.

• (Directed sequencing).

• (Altered rxn. conditions).

Considerations:

• Type of gap.

• Anticipated size of gap.

• Quality/nature of sequence at junction.
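For type A gaps, the anticipated gap size can be estimated from where the flanking contigs align on PAO1. The sketch below assumes each contig has already been placed on the PAO1 reference (for example from an alignment tool's output); the `Placement` record and its field names are hypothetical. It orders contigs by reference position and reports the uncovered stretch between neighbors. It treats the reference as linear, so a gap spanning the origin of the circular chromosome is not reported, and it cannot size type B gaps, since PA14-specific DNA has no PAO1 coordinates.

```python
from typing import NamedTuple

class Placement(NamedTuple):
    """Hypothetical record: where a PA14 contig aligns on the PAO1 reference."""
    contig: str
    ref_start: int  # 1-based, inclusive
    ref_end: int    # 1-based, inclusive

def order_and_find_gaps(placements):
    """Order contigs along the reference and list gaps between them.

    Returns (ordered_placements, gaps), where each gap is a tuple
    (left_contig, right_contig, gap_size_in_bp).
    """
    ordered = sorted(placements, key=lambda p: p.ref_start)
    gaps = []
    covered_to = None   # right-most reference position covered so far
    left_contig = None  # contig that reaches covered_to
    for p in ordered:
        if covered_to is not None and p.ref_start > covered_to + 1:
            gaps.append((left_contig, p.contig, p.ref_start - covered_to - 1))
        if covered_to is None or p.ref_end > covered_to:
            covered_to = p.ref_end
            left_contig = p.contig
    return ordered, gaps

if __name__ == "__main__":
    example = [
        Placement("c2", 5001, 9000),
        Placement("c1", 1, 4000),
        Placement("c3", 8500, 12000),
    ]
    ordered, gaps = order_and_find_gaps(example)
    print([p.contig for p in ordered])
    print(gaps)
```

An estimate like this helps choose a finishing method: short gaps are candidates for PCR and directed sequencing, while large ones may need a spanning cosmid.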

Page 20: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Integration with PA14 Unigene Library

Subject for BLAST Searches

Verify PA14 Sequences (close gaps, improve sequence quality)

Assign Insert Coordinates

Assign Identity of Disrupted ORF

PAO1

PA14 Contigs

Finished PA14 Sequence (annotated)

Page 21: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

Requirements for Publication

1. “Finished” PA14 sequence.

a) Sufficient quality.

b) No gaps?

2. Annotation.

3. Comparison to PAO1.

4. What else?

a) Virulence data?

b) Proteomics?

c) Others?

Page 22: Double-Ended Shotgun Sequencing of PA14 Daniel G. Lee 10/30/02.

ACKNOWLEDGEMENTS

MGH:

N. Liberati, S. Miyata, J. Urbach, F. Ausubel

X. He, M. Saucier, L. Rahme

Harvard Partners Genome Center:

K. Montgomery, G. Grills, L. Li, W. Brown, J. Decker, R. Elliot, L. Gendal, K. Osborn, A. Parerra, C. Xi, P. Juels, R. Kucherlapati