Post on 14-Jan-2016
Genomics II: The Proteome
Using high-throughput methods to identify proteins and to
understand their function
Subcellular localization of the yeast proteome
• Complete genome sequences allow each ORF to be precisely tagged with a reporter molecule
• Tagged ORF proteins indicate subcellular localization
  – Useful for the following:
    • Correlating to regulatory modules
    • Verifying data on protein–protein interactions
    • Annotating genome sequence
Attaching a GFP tag to an ORF
[Figure labels: PCR product (GFP, HIS3MX6); homologous recombination; chromosome (ORF1, ORF2); fusion protein (NH2, protein, GFP, COOH)]
FlyTrap Screen for Protein Localization
http://flytrap.med.yale.edu/
Patterns of protein localization
Distribution of subcellular localization
Identification of unpredicted ORFs
Protein–protein interactions: “The Interactome”
• Yeast two-hybrid analysis
• Protein chips
• Biochemical purification/Mass spectrometry
• Protein complementation
Yeast two-hybrid method
• Goal: Determine how proteins interact with each other
• Method
  – Use yeast transcription factors
  – Gene expression requires the following:
    • A DNA-binding domain
    • An activation domain
    • A basic transcription apparatus
  – Attach protein1 to DNA-binding domain (bait)
  – Attach protein2 to activation domain (prey)
  – Reporter gene expressed only if protein1 and protein2 interact with each other
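The bait/prey logic can be sketched as a toy model. This is only an illustration: the protein names and the interaction set are hypothetical, and real screens are noisy in ways this ignores.

```python
# Toy model of the yeast two-hybrid readout (illustrative only).
# Protein names and the interaction set below are hypothetical.

KNOWN_INTERACTIONS = {frozenset({"protA", "protB"})}  # assumed ground truth


def interacts(protein1: str, protein2: str) -> bool:
    """Return True if the two proteins bind each other in this toy model."""
    return frozenset({protein1, protein2}) in KNOWN_INTERACTIONS


def reporter_expressed(bait: str, prey: str) -> bool:
    """The reporter is on only when the bait (DNA-binding domain fusion) and
    the prey (activation domain fusion) interact, which brings the two
    transcription factor domains together at the reporter gene."""
    return interacts(bait, prey)


def screen(baits, preys):
    """Test every bait against every prey; return pairs that turn on the reporter."""
    return [(b, p) for b in baits for p in preys if reporter_expressed(b, p)]
```

Usage: `screen(["protA", "protC"], ["protB", "protC"])` tests four bait/prey combinations and reports only the pair listed in the assumed interaction set.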
A schematic of the yeast two-hybrid method
Results from a yeast two-hybrid experiment
• Goal: To characterize protein–protein interactions among 6,144 yeast ORFs
  – 5,345 were successfully cloned into yeast as both bait and prey
  – Identity of ORFs determined by DNA sequencing in hybrid yeast
  – 692 protein–protein interaction pairs
  – Interactions involved 817 ORFs
Yeast two-hybrid results for flies & worms
• Worms:
  – Created >3000 bait constructs
  – Tested against two AD libraries
  – Mapped 4000 interactions
• Flies:
  – Screened 10,000 predicted transcripts
  – Found 20,000 interactions
  – Statistically assigned 4800 as “high quality” interactions
Caveats associated with the yeast two-hybrid method
• There is evidence that other methods may be more sensitive
• Some inaccuracy reported when compared against known protein–protein interactions
  – False positives
  – False negatives
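One way to quantify false positives and false negatives is to compare screen results with a reference set of known interactions. The sketch below assumes such a reference set exists; the function name and the example pairs are hypothetical, not data from any real screen.

```python
def compare_to_reference(observed_pairs, reference_pairs):
    """Count agreement between a two-hybrid screen and a reference set.

    Pairs are treated as unordered, so (A, B) and (B, A) are the same
    interaction. Caveat: a pair missing from the reference is counted as a
    false positive here, although it could be a genuine new interaction.
    """
    observed = {frozenset(pair) for pair in observed_pairs}
    reference = {frozenset(pair) for pair in reference_pairs}
    return {
        "true_positives": len(observed & reference),
        "false_positives": len(observed - reference),
        "false_negatives": len(reference - observed),
    }


# Hypothetical example data.
observed = [("A", "B"), ("A", "C"), ("D", "E")]
reference = [("B", "A"), ("D", "F")]
counts = compare_to_reference(observed, reference)
```

In this made-up example, one observed pair matches the reference, two do not, and one known interaction is missed.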
Purification of interacting proteins
• Immunoprecipitation
  – Impractical on large scale (identification of unknowns)
• Affinity purification
  – Biochemically practical, but too dirty
• Tandem affinity purification
  – Sufficient yield & purity for identification of unknown proteins
TAP Purification Strategy
Identification of Interacting Proteins
Proteolytic Digestion (Trypsin)
Mass Spectrometric Analysis
Identifying proteins with mass spectrometry
• Preparation of protein sample
  – Extraction from a gel
  – Digestion by proteases, e.g., trypsin
• Mass spectrometer measures mass-to-charge ratio of peptide fragments
• Measured peptide masses are compared with database
  – Software used to generate theoretical peptide mass fingerprint (PMF) for all proteins in database
  – Match of experimental readout to database PMF allows researchers to identify the protein
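The PMF matching step can be sketched as follows. This is a simplified illustration, not how any particular search engine works: it assumes full tryptic cleavage (after K or R, not before P), neutral monoisotopic masses, no modifications or missed cleavages, and a hypothetical two-protein database.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # added once per peptide for the free termini


def trypsin_digest(sequence):
    """Cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def match_count(observed_masses, protein_sequence, tolerance=0.01):
    """Number of observed masses that fall within tolerance of the
    theoretical fingerprint of one protein."""
    fingerprint = [peptide_mass(p) for p in trypsin_digest(protein_sequence)]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in fingerprint)
        for obs in observed_masses
    )


def identify(observed_masses, database, tolerance=0.01):
    """Return the name of the database protein with the most matching masses."""
    return max(
        database,
        key=lambda name: match_count(observed_masses, database[name], tolerance),
    )


# Hypothetical database and "experimental" masses for illustration.
database = {"P1": "GAKSPR", "P2": "VTKLLR"}
observed = [peptide_mass("GAK"), peptide_mass("SPR")]
best_hit = identify(observed, database)
```

Real search tools use statistical scoring rather than a raw match count; the count here only shows the idea of comparing an experimental readout with a theoretical fingerprint.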
Mass spectrometry
• Measures mass-to-charge ratio
• Components of mass spectrometer
  – Ion source
  – Mass analyzer
  – Ion detector
  – Data acquisition unit
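Because the instrument reports mass-to-charge ratio rather than mass, the same peptide appears at different m/z values depending on its charge. A minimal sketch, assuming positive-mode ions that carry charge as added protons (the function name is illustrative):

```python
PROTON_MASS = 1.007276  # daltons


def mass_to_charge(neutral_mass, charge):
    """m/z of a peptide of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# The same 1000 Da peptide observed as a 1+ and as a 2+ ion.
singly_charged = mass_to_charge(1000.0, 1)
doubly_charged = mass_to_charge(1000.0, 2)
```

The doubly charged ion appears at roughly half the m/z of the singly charged one, which is why charge state must be known before a mass can be inferred.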
A mass spectrometer
Principle of mass spectrometry
Ion sources used for proteomics
• Proteomics requires specialized ion sources
• Electrospray Ionization (ESI)
  – With capillary electrophoresis and liquid chromatography
• Matrix-assisted laser desorption/ionization (MALDI)
  – Extracts ions from sample surface
ESI
MALDI
Mass analyzers used for proteomics
• Ion trap
  – Captures ions on the basis of mass-to-charge ratio
  – Often used with ESI
• Time of flight (TOF)
  – Time for accelerated ion to reach detector indicates mass-to-charge ratio
  – Frequently used with MALDI
• Also other possibilities
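Why flight time indicates mass-to-charge ratio can be shown with an idealized linear TOF model: an ion of mass m and charge q accelerated through a voltage U gains kinetic energy qU, so its speed is sqrt(2qU/m) and its time over a drift length L is L divided by that speed. The sketch below assumes a field-free drift tube and ignores initial velocity spread; the voltage and tube length are arbitrary example values.

```python
import math

DALTON_KG = 1.66053906660e-27          # kilograms per dalton
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs


def flight_time(ion_mass_da, charge, accel_voltage_v, drift_length_m):
    """Idealized time (seconds) for an ion to cross a field-free drift tube."""
    mass_kg = ion_mass_da * DALTON_KG
    charge_c = charge * ELEMENTARY_CHARGE_C
    # Kinetic energy gained in the source: q * U = 0.5 * m * v**2
    velocity = math.sqrt(2.0 * charge_c * accel_voltage_v / mass_kg)
    return drift_length_m / velocity


# Example settings (assumed): 20 kV acceleration, 1 m drift tube.
t_light = flight_time(1000.0, 1, 20_000.0, 1.0)
t_heavy = flight_time(4000.0, 1, 20_000.0, 1.0)
```

In this model, flight time scales with the square root of m/z, so an ion with four times the mass-to-charge ratio takes twice as long to reach the detector.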
Ion Trap
Time of Flight
Detector
A mass spectrum
Limitations of mass spectrometry
• Not very good at identifying minute quantities of protein
• Trouble dealing with phosphorylated proteins
• Doesn’t provide concentrations of proteins
• Improved software eliminating human analysis is necessary for high-throughput projects