BioSci 203 blumberg lecture 6 page 1 ©copyright Bruce Blumberg 2001-2005. All rights reserved

Bio Sci 203 bb-lecture 6 – DNA sequence analysis

• Bruce Blumberg ([email protected])
  – office: 2113E McGaugh Hall
  – 824-8573
  – office hours MWF 11-12
• Today
  – Characterization of Selected DNA Sequences
    • DNA sequence analysis

How to identify your gene of interest (contd)

• You have one protein and want to identify proteins that interact with it
  – some sort of interaction screen is indicated
    • straight biochemistry
    • phage display
    • two hybrid
    • in vitro expression cloning

How to identify your gene of interest (contd)

• biochemical approach
  – purify cellular proteins that interact with your protein
    • co-immunoprecipitation
    • affinity chromatography
    • biochemical fractionation
  – pure protein(s) are microsequenced
    • if not in database then make oligonucleotides and screen a cDNA library from appropriate tissues
  – advantages
    • functional approach
    • stringency can be manipulated
    • can identify multimeric proteins or complexes
    • will work if you can purify proteins
  – disadvantages
    • much skill required
    • low throughput
    • considerable optimization required
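
The "make oligonucleotides" step starts from peptide sequence, so the probe has to cover every codon that could encode each residue. The sketch below counts that degeneracy with the standard genetic code and picks the least degenerate stretch of a peptide. The function names, the 6-residue window, and the example peptide are illustrative assumptions, not part of the lecture.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_AA = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2,
    "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}

def degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())

def best_window(peptide, width=6):
    """Return the stretch of `width` residues with the lowest degeneracy."""
    windows = [peptide[i:i + width] for i in range(len(peptide) - width + 1)]
    return min(windows, key=degeneracy)

# Hypothetical microsequenced peptide, for illustration only.
peptide = "LSRMWKDEFL"
window = best_window(peptide)
print(window, degeneracy(window))  # MWKDEF 16
```

Stretches rich in Met and Trp give the least degenerate probes, while Leu, Ser and Arg (six codons each) are best avoided.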

How to identify your gene of interest (contd)

• Phage display screening (a.k.a. panning)
  – requires a library that expresses inserts as fusion proteins with a phage capsid protein
    • most are M13 based
    • some lambda phages used
  – prepare target protein
    • as affinity matrix
    • or as radiolabeled probe
  – test for interaction with library members
    • if using affinity matrix you purify phages from a mixture
    • if labeling protein one plates fusion protein library and probes with the protein
  – called receptor panning based on similarity with panning for gold

How to identify your gene of interest (contd)

• Phage display screening (a.k.a. panning) (contd)
  – advantages
    • stringency can be manipulated
    • if the affinity matrix approach works the cloning could go rapidly
  – disadvantages
    • Fusion proteins bias the screen against full-length cDNAs
    • Multiple attempts required to optimize binding
    • Limited targets possible
    • may not work for heterodimers
    • unlikely to work for complexes
    • panning can take many months for each screen

How to identify your gene of interest (contd)

• Two hybrid screening
  – originally used in yeast, now other systems possible
  – prepare bait - target protein fused to a DBD (GAL4 is usual)
    • stable cell line is commonly used
  – prepare fusion protein library with an activation domain
  – What is key factor required for success?
    • No activation domain in bait!
  – approach
    • transfect library into cells and either select for survival or activation of reporter gene
    • purify and characterize positive clones

How to identify your gene of interest (contd)

• Two hybrid screening (contd)
  – advantages
    • seems simple and inexpensive on its face
      – in materials
    • functional assay
  – disadvantages
    • fusion proteins bias the screen against full-length cDNAs
    • Binding parameters not manipulable
    • Difficult or impossible to detect interactions between proteins and complexes
    • Doesn’t work for secreted proteins
    • Many months to screen
      – savings in materials are eaten up by salaries
      – avg grad student costs $30k/year
      – avg postdoc or tech costs $40k/year
    • MANY false positives

How to identify your gene of interest (contd)

• In vitro interaction screening
  – based on in vitro expression cloning (IVEC)
    • transcribe and translate cDNA libraries in vitro into small pools of proteins (~100)
    • test these proteins for their ability to interact with your protein of interest
      – EMSA, co-ip, FRET, SPA
  – advantages
    • functional approach
    • smaller pools increase sensitivity
    • automated variant allows diversity of targets
      – proteins, protein complexes, nucleic acids, protein/nucleic acid complexes, small molecule drugs
    • very fast
  – disadvantages
    • can’t detect heterodimers unless 1 partner known
    • expensive consumables (but cheap salaries)
      – typical screen will cost $10-15K
    • expense of automation
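
The trade-off between consumables and salaries can be put in numbers. The sketch below uses the salary figures quoted for two hybrid screening ($30k and $40k per year) and the $10-15K consumables figure for an in vitro screen. The six-month duration is an assumption standing in for "many months", and the function name is illustrative.

```python
def salary_cost(months, annual_salary):
    """Salary spent on a screen that occupies one person for `months`."""
    return annual_salary * months / 12

# Assumed duration: "many months" is taken as 6 here, for illustration only.
months = 6
grad_student = salary_cost(months, 30_000)
postdoc = salary_cost(months, 40_000)
ivec_consumables = (10_000, 15_000)  # quoted range for a typical in vitro screen

print(f"two hybrid, grad student: ${grad_student:,.0f}")
print(f"two hybrid, postdoc/tech: ${postdoc:,.0f}")
print(f"in vitro screen consumables: ${ivec_consumables[0]:,}-${ivec_consumables[1]:,}")
```

Under that assumption the salary spent on a "cheap" two hybrid screen already matches or exceeds the consumables bill for the in vitro screen, before counting any materials.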

Analysis of genes and cDNAs

• Characterization of cloned DNA (what do we want to know about a new gene?)
  – Complete DNA sequence
    • cDNA sequence
    • genomic sequence? (promoters, introns and exons)
    • Restriction enzyme maps?
  – where is the promoter(s)?
    • Alternative promoter use?
    • Mapping transcription start(s)
  – where and when is mRNA expressed?
    • How abundantly is it expressed in each place?
    • association between expression levels and putative function?
  – What is the function of this gene?
    • Loss-of-function analysis decisive
      – Knockout or mutation
      – Knockdown (morpholino antisense, siRNA)
      – mutant mRNA e.g. dominant negative
    • gain of function may be helpful
      – transgenic
      – mutant mRNA - constitutively active transcription factor

DNA Sequence analysis

• Complete DNA sequence (all nts both strands, no gaps)
  – complete sequence is desirable but takes time
    • how long depends on size and strategy employed
  – which strategy to use depends on various factors
    • how large is the clone?
      – cDNA
      – genomic
    • How fast is sequence required?
• sequencing strategies
  – primer walking
  – cloning and sequencing of restriction fragments
  – progressive deletions
    • Bidirectional, unidirectional
  – Shotgun sequencing
    • whole genome
    • with mapping
      – map first (C. elegans)
      – map as you go (many)

DNA Sequence analysis (contd)

• Primer walking - walk from the ends with oligonucleotides
  – sequence, back up ~50 nt from end, make a primer and continue
  – Why back up?
    • To get adequate overlap
    • May not get within 50 nt of primer with current sequencing
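
The walking step can be sketched in a few lines: take the next primer from known sequence about 50 nt back from the end of the last read, then count how many reads it takes to cross an insert. The 50 nt back-up comes from the slide; the 20 nt primer length, the 600 nt read length, the function names, and the simplification that each read starts at the primer's 3' end are assumptions for illustration.

```python
def next_walking_primer(known, backup=50, primer_len=20):
    """Pick the next primer from known sequence, ending `backup` nt before its end."""
    if len(known) < backup + primer_len:
        raise ValueError("not enough known sequence to back up and place a primer")
    end = len(known) - backup
    return known[end - primer_len:end]

def walking_rounds(insert_len, read_len=600, backup=50):
    """Reads needed to cover one strand when each new read starts `backup` nt back."""
    if read_len <= backup:
        raise ValueError("read length must exceed the back-up distance")
    rounds, covered = 1, read_len          # first read comes from a vector primer
    while covered < insert_len:
        covered += read_len - backup       # each walk re-reads `backup` nt as overlap
        rounds += 1
    return rounds

print(walking_rounds(2000))  # 4
```

Each round gains only the read length minus the overlap, and every round needs a new oligonucleotide, which is why the number of rounds matters for both time and cost.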

DNA Sequence analysis (contd)

• Primer walking (contd)
  – advantages
    • very simple
    • no possibility to lose bits of DNA
      – restriction mapping
      – deletion methods
    • no restriction map needed
    • best choice for short DNA
  – disadvantages
    • slowest method
      – about a week between sequencing runs
    • oligos are not free (and not reusable)
    • not feasible for large sequences
  – applications
    • cDNA sequencing when time is not critical
    • targeted sequencing
      – verification
      – closing gaps in sequences

DNA Sequence analysis (contd)

• Cloning and sequencing of restriction fragments
  – once the most popular method
    • make a restriction map, subclone fragments
    • sequence
  – advantages
    • straightforward
    • directed approach
    • can go quickly
    • cloned fragments often useful otherwise
      – RNase protection, nuclease mapping, in situ hybridization
  – disadvantages
    • possible to lose small fragments
      – must run high quality analytical gels
    • depends on quality of restriction map
      – mistaken mapping -> wrong sequence
    • restriction site availability
  – applications
    • sequencing small cDNAs
    • isolating regions to close gaps
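
Once some sequence is known, the restriction map can be checked in silico, which also shows how easily a small fragment could be missed on a gel. The sketch below digests a linear top strand with one enzyme. The recognition sites and cut offsets for EcoRI, BamHI and HindIII are the standard ones; the function names, the example sequence, and the 10 bp "easy to lose" threshold are assumptions for illustration.

```python
# Recognition site and cut offset on the top strand (e.g. EcoRI cuts G^AATTC).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
}

def cut_positions(seq, enzyme):
    """Top-strand cut positions for one enzyme on a linear molecule."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions

def fragment_lengths(seq, enzyme):
    """Fragment sizes, in order along the molecule, after a complete digest."""
    cuts = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:])]

# Hypothetical sequence, for illustration only.
seq = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
sizes = fragment_lengths(seq, "EcoRI")
print(sizes)                            # [5, 14, 7]
print([s for s in sizes if s < 10])     # fragments small enough to overlook
```

A fragment missing from the map produces a sequence with a silent internal gap, which is the "mistaken mapping -> wrong sequence" problem listed above.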

DNA Sequence analysis (contd)

• nested deletion strategies - sequential deletions from one end of the clone
  – cut, close and sequence
    • Approach
      – make restriction map
      – use enzymes that cut in polylinker and insert
      – Religate, sequence from end with restriction site
      – repeat until finished, filling in gaps with oligos
    • advantages
      – Fast, simple, efficient
    • disadvantages
      – limited by restriction site availability in vector and insert
      – need to make a restriction map

DNA Sequence analysis (contd)

• nested deletion strategies (contd)
  – Exonuclease III-mediated deletion
    • cut with polylinker enzyme
      – protect ends -
        » 3’ overhang
        » phosphorothioate
    • cut with enzyme between first cut and the insert
      – can’t leave 3’ overhang
    • timed digestions with Exonuclease III
    • stop reactions, blunt ends
    • ligate and size select recombinants
    • sequence
    • advantages
      – unidirectional
      – processivity of enzyme gives nested deletions
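
The timed digestions with Exonuclease III can be planned with simple arithmetic: choose a spacing between deletion endpoints that one sequencing read can bridge, then divide by the digestion rate. This is a minimal sketch that assumes a constant rate; the function name and all numbers in the example are illustrative assumptions, and the real rate depends on temperature and buffer, so it should be calibrated for each experiment.

```python
def exo3_timepoints(insert_len, step, rate_per_min):
    """Minutes at which to pull aliquots so deletions end every `step` bp."""
    if min(insert_len, step, rate_per_min) <= 0:
        raise ValueError("all arguments must be positive")
    times = []
    deleted = step
    while deleted < insert_len:
        times.append(deleted / rate_per_min)
        deleted += step
    return times

# Assumed values: 2 kb insert, deletions every 400 bp, 400 bp/min digestion.
print(exo3_timepoints(2000, 400, 400))  # [1.0, 2.0, 3.0, 4.0]
```

Choosing a step somewhat shorter than the read length leaves overlap between neighbouring deletion clones.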

DNA Sequence analysis (contd)

• Nested deletion strategies
  – Exonuclease III-mediated deletion (contd)
    • disadvantages
      – need two unique restriction sites flanking insert on each side
      – best used successively to get > 10kb total deletions
      – may not get complete overlaps of sequences
        » fill in with restriction fragments or oligos
    • applications
      – method of choice for moderate size sequencing projects
        » cDNAs
        » genomic clones
      – good for closing larger gaps

Large-Scale DNA Sequence analysis

• Shotgun sequencing NOT invented by Craig Venter
  – Messing 1981 first description of shotgun
  – Sanger lab developed current methods in 1983
  – approach
    • blast genome into small chunks
    • clone these chunks
      – 3-5 kb, 8 kb plasmid
      – 40 kb fosmid jump repetitive sequences
    • sequence + assemble by computer
  – A priori difficulties
    • how to get nice uniform distribution
    • how to assemble fragments
    • what to do about repeats?
    • How to minimize sequence redundancy?
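
The "sequence + assemble by computer" step can be illustrated with a toy greedy assembler that repeatedly merges the two reads with the longest suffix/prefix overlap. This is a teaching sketch only, not the algorithm used by any production assembler; the function names, the minimum overlap, and the reads are assumptions, and it ignores sequencing errors and reverse-complement reads.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by best overlap; returns the remaining contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left: the remaining contigs are separated by gaps
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)] + [merged]
    return reads

# Hypothetical reads from the sequence ATGCGTACGTTAGC.
print(greedy_assemble(["GTACGTTA", "ATGCGTAC", "GTTAGC"]))  # ['ATGCGTACGTTAGC']
```

The repeat problem shows up directly in this scheme: if the same stretch occurs twice in the genome, reads ending in it overlap equally well with either copy, and a greedy merge can join the wrong neighbours.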

Large-Scale DNA Sequence analysis (contd)

• Shotgun sequencing (contd)
  – How to minimize sequence redundancy?
    • Best way to minimize redundancy is map before you start
      – C. elegans was done this way - when the sequence was finished, it was FINISHED
        » mapping took almost 10 years
      – mapping much too tedious and nonprofitable for Celera
        » who cares about redundancy, let’s sequence and make $$
    • why does redundancy matter?
      – Finished sequence today costs about $0.50/base
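
The cost figure scales quickly with genome size. As a worked example, the sketch below applies the quoted $0.50 per finished base to the 1.7 Gb Xenopus tropicalis genome discussed later in this lecture; the function name is illustrative and the calculation ignores everything except the per-base figure.

```python
def finished_cost(genome_bases, cost_per_base=0.50):
    """Cost in dollars of finished sequence at a flat per-base price."""
    return genome_bases * cost_per_base

xenopus_bases = 1_700_000_000  # 1.7 Gb, as quoted in the lecture
print(f"${finished_cost(xenopus_bases):,.0f}")  # $850,000,000
```

At this price every redundant or re-sequenced region is paid for again, which is why mapping first was attractive despite the time it took.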

Large-Scale DNA Sequence analysis (contd)

– Mapping by fingerprinting

– Mapping by hybridization

Large-Scale DNA Sequence analysis (contd)

– Map as you go

Large-Scale DNA Sequence analysis (contd)

• Whole genome shotgun sequencing (Celera)
  – premise is that rapid generation of draft sequence is valuable
  – why bother trying to clone and sequence difficult regions?
    • Basically just forget regions of repetitive DNA - not cost effective
  – using this approach, it is easy to get to 90% finished
    • rule of thumb is that it takes at least as long to finish the last 5% as it took to get the first 95%
  – problems
    • sequences done this way may never be as complete as C. elegans is
    • much redundant sequence with many sparse regions and lots of gaps
    • Fragment assembly for regions of highly repetitive DNA is dubious at best
    • “Finished” fly and human genomes lack more than a few already characterized genes

Large-Scale DNA Sequence analysis (contd)

• How to approach a large new genome, knowing what we know now?
  – Xenopus tropicalis 1.7 Gb (about ½ human)
    • Whole genome shotgun
    • BAC end sequencing
    • EST sequencing
  – 8 x coverage currently
  – How to finish?
    • Gaps could be closed with BACs
    • Finishing dependent on additional funding
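
Why 8 x coverage still leaves gaps to close can be estimated with the idealized Lander-Waterman model, in which reads land uniformly at random: at c-fold coverage the fraction of bases never sampled is about e^(-c), and the expected number of gaps is roughly the number of reads times e^(-c). The model is not from the lecture, the 600 nt read length is an assumption, and real libraries are less uniform (cloning bias, repeats), so actual gap counts would be higher.

```python
import math

def uncovered_fraction(coverage):
    """Fraction of bases hit by no read at the given fold coverage (idealized)."""
    return math.exp(-coverage)

def expected_gaps(genome_bases, read_len, coverage):
    """Approximate number of gaps between contigs, ignoring minimum overlap."""
    n_reads = coverage * genome_bases / read_len
    return n_reads * math.exp(-coverage)

genome = 1.7e9   # Xenopus tropicalis, as quoted above
read_len = 600   # assumed read length
for c in (2, 5, 8):
    missing = uncovered_fraction(c) * genome
    gaps = expected_gaps(genome, read_len, c)
    print(f"{c}x: ~{missing:,.0f} bases unsampled, ~{gaps:,.0f} gaps")
```

Even under these optimistic assumptions several thousand gaps remain at 8 x, which is consistent with closing them by a directed approach such as BACs rather than by more random reads.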