
BioSci 203 blumberg lecture 6 page 1 ©copyright Bruce Blumberg 2001-2005. All rights reserved

Bio Sci 203 bb-lecture 6 – DNA sequence analysis

• Bruce Blumberg (blumberg@uci.edu)
  – office: 2113E McGaugh Hall
  – 824-8573
  – office hours MWF 11-12
• Today
  – Characterization of Selected DNA Sequences
    • DNA sequence analysis


How to identify your gene of interest (contd)

• You have one protein and want to identify proteins that interact with it
  – some sort of interaction screen is indicated
    • straight biochemistry
    • phage display
    • two hybrid
    • in vitro expression cloning


How to identify your gene of interest (contd)
• biochemical approach
  – purify cellular proteins that interact with your protein
    • co-immunoprecipitation
    • affinity chromatography
    • biochemical fractionation
  – pure protein(s) are microsequenced
    • if not in database then make oligonucleotides and screen cDNA library from appropriate tissues
  – advantages
    • functional approach
    • stringency can be manipulated
    • can identify multimeric proteins or complexes
    • will work if you can purify proteins
  – disadvantages
    • much skill required
    • low throughput
    • considerable optimization required
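
The "make oligonucleotides" step depends on how degenerate the back-translated peptide is. The following is a minimal Python sketch, assuming the standard genetic code; the function names `degeneracy` and `best_window` and the example peptide are hypothetical and are not part of the lecture.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)


def degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def best_window(peptide, length):
    """Stretch of `length` residues with the lowest degeneracy (hypothetical helper)."""
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    return min(windows, key=degeneracy)


if __name__ == "__main__":
    peptide = "LSRMWKDE"  # invented microsequence, for illustration only
    window = best_window(peptide, 3)
    print(window, degeneracy(window))
```

Windows rich in Met and Trp (one codon each) give the least degenerate probes, while Leu, Ser and Arg (six codons each) are the worst.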


How to identify your gene of interest (contd)

• Phage display screening (a.k.a. panning)
  – requires a library that expresses inserts as fusion proteins with a phage capsid protein
    • most are M13 based
    • some lambda phages used
  – prepare target protein
    • as affinity matrix
    • or as radiolabeled probe
  – test for interaction with library members
    • if using affinity matrix you purify phages from a mixture
    • if labeling protein one plates fusion protein library and probes with the protein
  – called receptor panning based on similarity with panning for gold


How to identify your gene of interest (contd)

• Phage display screening (a.k.a. panning) (contd)
  – advantages
    • stringency can be manipulated
    • if the affinity matrix approach works the cloning could go rapidly
  – disadvantages
    • Fusion proteins bias the screen against full-length cDNAs
    • Multiple attempts required to optimize binding
    • Limited targets possible
    • may not work for heterodimers
    • unlikely to work for complexes
    • panning can take many months for each screen


How to identify your gene of interest (contd)

• Two hybrid screening
  – originally used in yeast, now other systems possible
  – prepare bait - target protein fused to DBD (GAL4 is usual)
    • stable cell line is commonly used
  – prepare fusion protein library with an activation domain
  – What is key factor required for success?
    • No activation domain in bait!
  – approach
    • transfect library into cells and either select for survival or activation of reporter gene
    • purify and characterize positive clones


How to identify your gene of interest (contd)

• Two hybrid screening (contd)
  – advantages
    • seems simple and inexpensive on its face
      – in materials
    • functional assay
  – disadvantages
    • fusion proteins bias the screen against full-length cDNAs
    • Binding parameters not manipulable
    • Difficult or impossible to detect interactions between proteins and complexes
    • Doesn’t work for secreted proteins
    • Many months to screen
      – savings in materials are eaten up by salaries
      – avg grad student costs $30k/year
      – avg postdoc or tech costs $40k/year
    • MANY false positives
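
The point about salaries can be made concrete with a rough arithmetic sketch. The annual figures come from the slide; the screen durations are assumptions, since the slide only says "many months", and `labor_cost` is a hypothetical helper name.

```python
def labor_cost(annual_cost, months):
    """Salary cost of one person working on a screen for `months` months."""
    return annual_cost * months / 12


if __name__ == "__main__":
    # Durations of 6 and 9 months are assumed for illustration only.
    print(labor_cost(30_000, 6))  # grad student, assumed 6-month screen
    print(labor_cost(40_000, 9))  # postdoc or tech, assumed 9-month screen
```

Under these assumed durations the labor alone comes to $15,000 and $30,000 respectively, before any materials are counted.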


How to identify your gene of interest (contd)
• In vitro interaction screening
  – based on in vitro expression cloning (IVEC)
    • transcribe and translate cDNA libraries in vitro into small pools of proteins (~100)
    • test these proteins for their ability to interact with your protein of interest
      – EMSA, co-ip, FRET, SPA
  – advantages
    • functional approach
    • smaller pools increase sensitivity
    • automated variant allows diversity of targets
      – proteins, protein complexes, nucleic acids, protein/nucleic acid complexes, small molecule drugs
      – very fast
  – disadvantages
    • can’t detect heterodimers unless 1 partner known
    • expensive consumables (but cheap salaries)
      – typical screen will cost $10-15K
    • expense of automation


Analysis of genes and cDNAs
• Characterization of cloned DNA (what do we want to know about a new gene?)
  – Complete DNA sequence
    • cDNA sequence
    • genomic sequence? (promoters, introns and exons)
    • Restriction enzyme maps?
  – where is the promoter(s)?
    • Alternative promoter use?
    • Mapping transcription start(s)
  – where and when is mRNA expressed?
    • How abundantly is it expressed in each place?
    • association between expression levels and putative function?
  – What is the function of this gene?
    • Loss-of-function analysis decisive
      – Knockout or mutation
      – Knockdown (morpholino antisense, siRNA)
      – mutant mRNA e.g. dominant negative
    • gain of function may be helpful
      – transgenic
      – mutant mRNA - constitutively active transcription factor


DNA Sequence analysis
• Complete DNA sequence (all nts both strands, no gaps)
  – complete sequence is desirable but takes time
    • how long depends on size and strategy employed
  – which strategy to use depends on various factors
    • how large is the clone?
      – cDNA
      – genomic
    • How fast is sequence required?
• sequencing strategies
  – primer walking
  – cloning and sequencing of restriction fragments
  – progressive deletions
    • Bidirectional, unidirectional
  – Shotgun sequencing
    • whole genome
    • with mapping
      – map first (C. elegans)
      – map as you go (many)


DNA Sequence analysis (contd)

• Primer walking - walk from the ends with oligonucleotides
  – sequence, back up ~50 nt from end, make a primer and continue
  – Why back up?
    • To get adequate overlap
    • Sequence within ~50 nt of the primer may not be readable with current sequencing
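
The walking scheme can be sketched as a short loop: each new primer is placed ~50 nt back from the end of the previous read. This is a minimal illustration; the 600 nt read length and the function name `primer_positions` are assumptions, not values from the lecture.

```python
def primer_positions(insert_len, read_len, backup=50):
    """Start positions of successive primers when walking along one strand.

    Each read covers read_len nt; the next primer is placed `backup` nt
    before the end of the previous read so that consecutive reads overlap.
    """
    if backup >= read_len:
        raise ValueError("backup must be smaller than the read length")
    positions = [0]
    while positions[-1] + read_len < insert_len:
        positions.append(positions[-1] + read_len - backup)
    return positions


if __name__ == "__main__":
    # Assumed example: 2 kb insert, 600 nt of usable sequence per read.
    print(primer_positions(2000, 600))
```

Each entry after the first requires a newly synthesized oligo, so the number of walking steps grows linearly with insert length.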


DNA Sequence analysis (contd)
• Primer walking (contd)
  – advantages
    • very simple
    • no possibility to lose bits of DNA, as can happen with
      – restriction mapping
      – deletion methods
    • no restriction map needed
    • best choice for short DNA
  – disadvantages
    • slowest method
      – about a week between sequencing runs
    • oligos are not free (and not reusable)
    • not feasible for large sequences
  – applications
    • cDNA sequencing when time is not critical
    • targeted sequencing
      – verification
      – closing gaps in sequences


DNA Sequence analysis (contd)
• Cloning and sequencing of restriction fragments
  – once the most popular method
    • make a restriction map, subclone fragments
    • sequence
  – advantages
    • straightforward
    • directed approach
    • can go quickly
    • cloned fragments often useful otherwise
      – RNase protection, nuclease mapping, in situ hybridization
  – disadvantages
    • possible to lose small fragments
      – must run high quality analytical gels
    • depends on quality of restriction map
      – mistaken mapping -> wrong sequence
    • restriction site availability
  – applications
    • sequencing small cDNAs
    • isolating regions to close gaps
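
A restriction digest of a known sequence can be simulated in a few lines, which also shows how small fragments arise that are easy to miss on a gel. This is a minimal sketch for a linear molecule; the function names, the toy sequence and the size threshold are hypothetical. EcoRI (G^AATTC, cutting after the first G) is used as the example enzyme.

```python
import re


def digest(seq, site, cut_offset):
    """Fragment lengths from cutting a linear sequence at every `site`.

    cut_offset is the position of the cut within the recognition site,
    e.g. 1 for EcoRI (G^AATTC).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites would also be found.
    cuts = [m.start() + cut_offset
            for m in re.finditer(f"(?={re.escape(site)})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def easily_lost(fragments, min_size):
    """Fragments shorter than min_size, which may be missed on a gel."""
    return [f for f in fragments if f < min_size]


if __name__ == "__main__":
    toy = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"  # invented sequence
    frags = digest(toy, "GAATTC", 1)
    print(frags, easily_lost(frags, 10))
```

The fragment lengths always sum to the length of the input, which is a quick sanity check when comparing a predicted map against gel data.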


DNA Sequence analysis (contd)

• nested deletion strategies - sequential deletions from one end of the clone
  – cut, close and sequence
    • Approach
      – make restriction map
      – use enzymes that cut in polylinker and insert
      – Religate, sequence from end with restriction site
      – repeat until finished, filling in gaps with oligos
    • advantages
      – Fast, simple, efficient
    • disadvantages
      – limited by restriction site availability in vector and insert
      – need to make a restriction map


DNA Sequence analysis (contd)
• nested deletion strategies (contd)
  – Exonuclease III-mediated deletion
    • cut with polylinker enzyme
      – protect ends
        » 3’ overhang
        » phosphorothioate
    • cut with enzyme between first cut and the insert
      – can’t leave 3’ overhang
    • timed digestions with Exonuclease III
    • stop reactions, blunt ends
    • ligate and size select recombinants
    • sequence
    • advantages
      – unidirectional
      – processivity of enzyme gives nested deletions


DNA Sequence analysis (contd)

• Nested deletion strategies
  – Exonuclease III-mediated deletion (contd)
    • disadvantages
      – need two unique restriction sites flanking insert on each side
      – best used successively to get > 10kb total deletions
      – may not get complete overlaps of sequences
        » fill in with restriction fragments or oligos
    • applications
      – method of choice for moderate size sequencing projects
        » cDNAs
        » genomic clones
      – good for closing larger gaps
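
Planning the timed Exonuclease III digestions amounts to dividing the desired deletion sizes by the digestion rate. The sketch below assumes a constant rate, which is an idealization; the rate depends on temperature and reaction conditions and must be determined for the actual experiment. The function name and all numbers in the example are hypothetical.

```python
def exo3_timepoints(insert_len, step, rate_nt_per_min):
    """Digestion times (minutes) that each remove `step` more nucleotides.

    Stops before the whole insert would be deleted. Assumes a constant
    digestion rate, which is an idealization.
    """
    times = []
    removed = step
    while removed < insert_len:
        times.append(removed / rate_nt_per_min)
        removed += step
    return times


if __name__ == "__main__":
    # Hypothetical values: 2 kb insert, deletions every 500 nt,
    # and an assumed rate of 250 nt/min.
    print(exo3_timepoints(2000, 500, 250))
```

Choosing `step` somewhat shorter than the usable read length leaves overlap between the sequences obtained from successive deletion clones.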


Large-Scale DNA Sequence analysis
• Shotgun sequencing NOT invented by Craig Venter
  – Messing 1981 first description of shotgun
  – Sanger lab developed current methods in 1983
  – approach
    • blast genome into small chunks
    • clone these chunks
      – 3-5 kb, 8 kb plasmid
      – 40 kb fosmid, to jump repetitive sequences
    • sequence + assemble by computer
  – A priori difficulties
    • how to get nice uniform distribution
    • how to assemble fragments
    • what to do about repeats?
    • How to minimize sequence redundancy?
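
The "assemble by computer" step can be illustrated with a toy greedy overlap merger. This is a teaching sketch only, not the method used by any real assembler: it ignores sequencing errors, reverse complements and reads contained inside other reads, and the function names and reads are invented.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return reads  # no more overlaps: what remains are the contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"]))
```

The repeat problem listed above shows up directly here: if a repeat is longer than the reads, several reads share the same overlap and the merge order becomes ambiguous.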


Large-Scale DNA Sequence analysis (contd)
• Shotgun sequencing (contd)
  – How to minimize sequence redundancy?
    • Best way to minimize redundancy is map before you start
      – C. elegans was done this way - when the sequence was finished, it was FINISHED
        » mapping took almost 10 years
      – mapping much too tedious and nonprofitable for Celera
        » who cares about redundancy, let’s sequence and make $$
    • why does redundancy matter?
      – Finished sequence today costs about $0.50/base


Large-Scale DNA Sequence analysis (contd)

– Mapping by fingerprinting

– Mapping by hybridization
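
Fingerprint mapping infers clone overlap from shared restriction fragment sizes. The sketch below counts bands that two clones have in common within a sizing tolerance; the function name, the tolerance and the band sizes are hypothetical, and real fingerprinting software uses statistical scoring rather than a raw count.

```python
def shared_bands(fp_a, fp_b, tol=2):
    """Count bands in fp_a that match a not-yet-used band in fp_b within tol bp."""
    remaining = sorted(fp_b)
    shared = 0
    for size in sorted(fp_a):
        for k, other in enumerate(remaining):
            if abs(size - other) <= tol:
                shared += 1
                del remaining[k]  # each band in fp_b may be matched only once
                break
    return shared


if __name__ == "__main__":
    clone_a = [500, 1200, 3000, 4100]  # invented fragment sizes (bp)
    clone_b = [501, 1199, 2500, 4100]
    print(shared_bands(clone_a, clone_b))
```

Clones sharing many bands are likely to overlap and can be ordered into a contig before sequencing begins.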


Large-Scale DNA Sequence analysis (contd)
– Map as you go


Large-Scale DNA Sequence analysis (contd)

• Whole genome shotgun sequencing (Celera)
  – premise is that rapid generation of draft sequence is valuable
  – why bother trying to clone and sequence difficult regions?
    • Basically just forget regions of repetitive DNA - not cost effective
  – using this approach, it is easy to get to 90% finished
    • rule of thumb is that it takes at least as long to finish the last 5% as it took to get the first 95%
  – problems
    • sequences done this way may never be complete as is C. elegans
    • much redundant sequence with many sparse regions and lots of gaps
    • Fragment assembly for regions of highly repetitive DNA is dubious at best
    • “Finished” fly and human genomes lack more than a few already characterized genes


Large-Scale DNA Sequence analysis (contd)

• How to approach a large new genome, knowing what we know now?
  – Xenopus tropicalis 1.7 Gb (about ½ human)
    • Whole genome shotgun
    • BAC end sequencing
    • EST sequencing
  – 8x coverage currently
  – How to finish?
    • Gaps could be closed with BACs
    • Finishing dependent on additional funding
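
Why gaps remain even at 8x coverage can be estimated with the Lander-Waterman (Poisson) approximation, under which the chance that a given base is missed by every read is e^(-c) at coverage c. This is an idealized sketch that assumes reads land uniformly at random; cloning bias and repeats make real projects worse. The function names are hypothetical.

```python
import math


def uncovered_fraction(coverage):
    """Expected fraction of bases hit by no read (Poisson approximation)."""
    return math.exp(-coverage)


def expected_uncovered_bases(genome_size, coverage):
    """Expected number of bases left unsequenced at the given coverage."""
    return genome_size * uncovered_fraction(coverage)


if __name__ == "__main__":
    genome = 1.7e9  # Xenopus tropicalis genome size from the slide
    for c in (1, 4, 8):
        print(c, round(expected_uncovered_bases(genome, c)))
```

Even under these ideal assumptions several hundred thousand bases would be expected to remain unsequenced at 8x, spread over many small gaps, which is why a separate finishing phase is needed.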