Genome Biology and Biotechnology 10. The proteome Prof. M. Zabeau Department of Plant Systems...
Genome Biology and Biotechnology
10. The proteome
Prof. M. Zabeau
Department of Plant Systems Biology
Flanders Interuniversity Institute for Biotechnology (VIB)
University of Gent
International course 2005
Summary
¤ Protein interactome
  – Yeast two-hybrid protein interaction mapping
¤ Proteome
  – Isolation of protein complexes
¤ Multilevel functional genomics
  – Combination of
    • phenome analysis
    • protein interaction mapping
Functional Maps or “-omes”
[Figure: functional maps for genes or proteins 1, 2, 3, 4, 5 ... n across “conditions”. ORFeome: genes; Phenome: mutational phenotypes; Transcriptome: expression profiles; Interactome: protein interactions; DNA interactome: protein-DNA interactions; Localizome: cellular, tissue location; Proteome: proteins]
After: Vidal M., Cell, 104, 333 (2001)
Basic Concept of the Yeast Two-hybrid System
¤ Eukaryotic transcription factors
  – activate RNA polymerase II at promoters by binding to upstream activating DNA sequences (UAS)
¤ Basic structure of eukaryotic transcription factors
  – The DNA-binding and the activating functions are located in physically separable domains
    • The DNA-binding domain (DB)
    • The activation domain (AD)
  – The connection between DB and AD is structurally flexible
¤ Protein-protein interactions can reconstitute a functional transcription factor
  – by bringing the DB domain and the AD domain into close physical proximity
Reprinted from: Vidal M. and Legrain P., Nucleic Acids Res. 27: 919 (1999)
Yeast two-hybrid system
¤ ‘Architectural blueprint’ for a functional transcription factor
  – DB-X/AD-Y, where X and Y could be essentially any proteins from any organism
[Figure labels: UAS (Upstream Activating Sequence); selectable marker gene; Gal4 DNA-binding domain (DB) fused to X (bait); Gal4 transcription-activation domain (AD) fused to Y (prey)]
Yeast two-hybrid system
¤ The yeast two-hybrid system allows
  – Genetic selection of genes encoding potential interacting proteins without the need for protein purification
    • The system is used to isolate genes encoding proteins that potentially interact with DB-X (referred to as the ‘bait’) from complex AD-Y libraries (referred to as the ‘prey’)
  – Limitations of the system include
    • False positives: clones with no biological relevance
    • False negatives: failure to identify known interactions
  – Stringent criteria must be used to evaluate both the specificity and the sensitivity of the assay
Reprinted from: Vidal M. and Legrain P., Nucleic Acids Res. 27: 919 (1999)
Protein Interaction Mapping in C. elegans Using Proteins Involved in Vulval Development
¤ Landmark paper presents
  – First demonstration of large-scale two-hybrid analysis for protein interaction mapping in C. elegans
    • starting with 27 proteins involved in vulval development in C. elegans
Walhout et al., Science 287: 116 (2000)
Experimental Approach
¤ Start from known genes in vulval development
  – Used recombinational cloning to introduce ORFs of 29 known genes involved in vulval development into two-hybrid vectors
¤ Matrix two-hybrid experiment with 29 ORFs
  – Each DB-vORF/AD-vORF pairwise combination was tested for protein-protein interactions by scoring two-hybrid phenotypes
¤ Exhaustive two-hybrid screen
  – using 27 vORF-DB fusion proteins as baits to select interactors from an AD-Y cDNA library
    • sequenced the selected clones: interaction sequence tag (IST)
Reprinted from: Walhout et al., Science 287: 116 (2000)
Construction of DB and AD Fusions by Recombinational Cloning
[Figure labels: Phage lambda excision: Integrase, IHF & Excisionase; DNA-binding domain; Activation domain; DB-ORF fusions; AD-ORF fusions]
Reprinted from: Walhout et al., Science 287: 116 (2000)
Matrix of Two-hybrid Interactions Between the vORFs
Reprinted from: Walhout et al., Science 287: 116 (2000)
Interaction Sequence Tag (IST) screening
Reprinted from: Walhout et al., Science 287: 116 (2000)
Results
¤ Matrix two-hybrid experiment with 29 ORFs
  – ~50% (6 of 11) of the interactions reported were detected
    • Two novel potential interactions were identified
  – Typically the yeast two-hybrid system will detect ~50% of the naturally occurring interactions
¤ Two-hybrid screen
  – Identified 992 AD-Y encoding sequences
  – ISTs corresponded to a total of 124 different interacting proteins
    • 15 previously known
  – Provides a functional annotation for 109 predicted genes
Reprinted from: Walhout et al., Science 287: 116 (2000)
Validation of Potential Interactions
¤ Conservation of interactions in other organisms
  – If X' and Y' are orthologs of X and Y, respectively
    • an X/Y interaction that is conserved as an X'/Y' interaction is referred to as an "interolog"
Reprinted from: Walhout et al., Science 287: 116 (2000)
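The interolog idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the protein names, the ortholog table and the function name `predict_interologs` are hypothetical and do not come from Walhout et al. The sketch transfers each known X/Y interaction to X'/Y' when both partners have an ortholog.

```python
def predict_interologs(interactions, orthologs):
    """Transfer interactions from one species to another via orthologs.

    interactions: iterable of (X, Y) pairs observed in the source species.
    orthologs: dict mapping a source protein to its ortholog (X -> X').
    Returns a set of predicted pairs, each stored as a frozenset so that
    (X', Y') and (Y', X') count as the same interaction.
    """
    predicted = set()
    for x, y in interactions:
        # Both partners need an ortholog, otherwise nothing can be inferred.
        if x in orthologs and y in orthologs:
            predicted.add(frozenset((orthologs[x], orthologs[y])))
    return predicted


# Hypothetical toy data (placeholder names, not real genes).
source_interactions = [("A", "B"), ("B", "C"), ("C", "D")]
ortholog_table = {"A": "A'", "B": "B'", "C": "C'"}  # D has no ortholog

interologs = predict_interologs(source_interactions, ortholog_table)
for pair in sorted(tuple(sorted(p)) for p in interologs):
    print(pair)
```

A predicted interolog is still only a potential interaction; in the spirit of the slides it would need independent confirmation.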
Validation of Potential Interactions
¤ Systematic clustering analysis
  – closed loop connections between vORF-encoded proteins
    • X interacts with Y, Y interacts with Z, Z interacts with W, and so on (X/Y/Z/W/...)
Reprinted from: Walhout et al., Science 287: 116 (2000)
[Figure label: Mutations with similar phenotypes]
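As a rough illustration of the clustering idea, the sketch below looks for the simplest kind of closed loop, three proteins that all interact with each other. It is an assumption-laden toy: the function name `find_triangles` and the protein names are hypothetical, and the actual analysis in the paper is not limited to loops of three.

```python
from itertools import combinations


def find_triangles(interactions):
    """Return all closed loops of three proteins (X/Y, Y/Z, Z/X)."""
    neighbours = {}
    for a, b in interactions:
        if a == b:
            continue  # self-interactions do not contribute to a loop
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    triangles = set()
    for protein, partners in neighbours.items():
        # Two partners of the same protein close a loop if they also interact.
        for p, q in combinations(sorted(partners), 2):
            if q in neighbours[p]:
                triangles.add(tuple(sorted((protein, p, q))))
    return sorted(triangles)


# Hypothetical interaction list (placeholder names).
toy_interactions = [("X", "Y"), ("Y", "Z"), ("Z", "X"), ("Z", "W")]
print(find_triangles(toy_interactions))
```

Interactions that sit inside such a loop support each other, which is why closed loops are used here as a validation criterion.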
Conclusions
¤ Demonstrated the feasibility of generating genome-wide protein interaction maps
  – Two-hybrid screens are
    • simple
    • sensitive
    • amenable to high-throughput
  – Feasible using the C. elegans ORFeome
¤ Y2H detects approximately 50% of the interactions
  – provides a useful coverage of biologically important interactions
Reprinted from: Walhout et al., Science 287: 116 (2000)
A Comprehensive Analysis of Protein–protein Interactions in Saccharomyces cerevisiae
¤ Landmark paper presents
  – The first large-scale, high-throughput mapping of protein-protein interactions between ORFs predicted in S. cerevisiae using
  – Two complementary yeast two-hybrid screening strategies
    • Two-hybrid array of 6,000 hybrid proteins
    • High-throughput library screen
Uetz et al., Nature 403: 623 (2000)
The two-hybrid array screening
¤ Two-hybrid array of 6,000 hybrid proteins comprises
  – Haploid yeast colonies derived from ~6,000 yeast ORFs fused to the Gal4 activation domain (AD)
  – The two-hybrid array is contained on 16 plates of 384 colonies
¤ Matrix screen for interactions
  – 192 different Gal4 DB-ORF hybrids were mated to the two-hybrid array
  – 192 two-hybrid array screens were performed in duplicate
    • Each yielded 1–30 positives
    • But only ~20% were reproduced in the duplicate screen
¤ Putative interacting partners identified
  – 87/192 DB hybrids yielded putative protein–protein interactions
  – Identified 281 interacting protein pairs
Reprinted from: Uetz et al., Nature 403: 623 (2000)
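The numbers on this slide can be checked with a short calculation. The sketch below uses only figures quoted above; the variable names are illustrative, and 6,000 is the approximate ORF count given on the slide.

```python
PLATES = 16
COLONIES_PER_PLATE = 384
N_ORFS = 6000            # approximate number of yeast ORFs (slide: ~6,000)
BAITS_SCREENED = 192     # Gal4 DB-ORF hybrids mated to the array
BAITS_WITH_HITS = 87     # DB hybrids that yielded putative interactions

array_capacity = PLATES * COLONIES_PER_PLATE
hit_fraction = BAITS_WITH_HITS / BAITS_SCREENED

print(f"array capacity: {array_capacity} colonies")
print(f"room for ~{N_ORFS} ORFs: {array_capacity >= N_ORFS}")
print(f"baits with interactions: {hit_fraction:.0%}")
```

Sixteen 384-colony plates give 6,144 positions, enough for the ~6,000 AD fusions, and 87 of 192 baits corresponds to the 45% quoted later in the deck.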
The two-hybrid array screening
Reprinted from: Uetz et al., Nature 403: 623 (2000)
[Figure labels: 16 microassay plates; Positive control: 6,000 haploid yeast Gal4 activation domain-ORF fusions; Two-hybrid positives from a mating with a Gal4 DNA-binding domain-ORF fusion]
High-Throughput Library Screen
¤ Used a library made by pooling ORF-AD fusions
  – Each ORF was fused separately to a Gal4 activation domain
  – ORF-AD fusions were pooled to form an activation-domain library
    • Advantage over traditional cDNA libraries is the uniform presentation of each ORF
¤ Protein interactions were screened by
  – mating the 6,000 DNA-binding domain hybrids in duplicate to the activation domain library
  – 817 yeast ORFs (15%) yielded protein–protein interactions
  – Identified 692 interacting protein pairs
    • 68% of the interactions were identified multiple times
Reprinted from: Uetz et al., Nature 403: 623 (2000)
Results of the Systematic Two-Hybrid Screens
¤ The matrix array screens
  – gave more interactors
    • 45% of the 192 proteins in the array screens yielded interactions
  – are much more labour- and material-intensive
    • limits the number of screens that can be performed
    • Full matrix would require testing 6,000 × 6,000 = 36,000,000 interactions!
¤ The library screens gave
  – fewer interactors
    • 8% of the proteins tested in the library screens yielded interactions
  – a much higher throughput
Reprinted from: Uetz et al., Nature 403: 623 (2000)
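To make the scale argument concrete, the following sketch compares the full matrix with the 192 array screens that were actually performed. It uses the rounded ORF count from the slides; the variable names are illustrative.

```python
N_ORFS = 6000        # approximate number of yeast ORFs
ARRAY_BAITS = 192    # DB hybrids actually screened against the array

full_matrix_tests = N_ORFS * N_ORFS      # every DB-X against every AD-Y
array_tests = ARRAY_BAITS * N_ORFS       # one replicate of the array screens
fraction_covered = array_tests / full_matrix_tests

print(f"full matrix: {full_matrix_tests:,} pairwise tests")
print(f"192 array screens: {array_tests:,} pairwise tests per replicate")
print(f"fraction of the full matrix: {fraction_covered:.1%}")
```

Under these assumptions the array screens sample only a few percent of all possible pairs, which is why the pooled library screen is the higher-throughput route despite giving fewer interactors per bait.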
Analysis of the protein-protein interactions
¤ The analysis reveals
  – Interactions that place unknown proteins into a biological context
  – Novel interactions between proteins involved in the same biological function
  – Novel interactions that connect biological functions into larger cellular processes
Interactions involving unknown proteins
Reprinted from: Uetz et al., Nature 403: 623 (2000)
Interactions Between Proteins in the RNA Splicing Complex
Reprinted from: Uetz et al., Nature 403: 623 (2000)
Interactions are consistent with the crystallographic data
Interaction Connecting Two Different Complexes
Reprinted from: Uetz et al., Nature 403: 623 (2000)
[Figure labels: spindle checkpoint complex; microtubule checkpoint complex]
Analysis of Interologs
Reprinted from: Uetz et al., Nature 403: 623 (2000)
[Figure labels: Yeast; Human]
Conclusions
¤ The two-hybrid array approach is feasible
  – for systematic genome-wide analysis of protein interactions
¤ The large-scale mapping of protein-protein interactions reveals
  – many new interactions between proteins
  – that protein interactions should be viewed as potential interactions that must be confirmed independently
  – This conclusion is supported by the fact that the results of different screens only partially overlap
Reprinted from: Uetz et al., Nature 403: 623 (2000)
A Map of the Interactome Network of the Metazoan C. elegans
¤ Paper presents
  – Large-scale mapping of protein-protein interactions in C. elegans using yeast two-hybrid screens with a subset of metazoan-specific proteins
    • identified > 4000 interactions
  – Together with already described Y2H interactions and interologs predicted in silico,
    • the current version of the Worm Interactome map contains 5500 interactions
Li et al., Science, 303, 540-543 (2004)
Worm Interactome map
Reprinted from: Li et al., Science, 303, 540-543 (2004)
[Figure legend: phylogenetic classes: Eukaryotic; Multicellular; Worm]
A Protein Interaction Map of Drosophila melanogaster
¤ Paper presents
  – a two-hybrid-based protein-interaction map of the fly proteome by screening 10,623 ORFs against cDNA libraries to produce
    • a draft map of 7048 proteins and 20,405 interactions
  – Computational rating of interaction confidence produced a high-confidence interaction network of 4679 proteins and 4780 interactions showing two levels of organization
    • a short-range organization, presumably corresponding to multiprotein complexes
    • a more global organization, presumably corresponding to intercomplex connections
Giot et al., Science, 302, 1727-1736 (2003)
The fly protein-interaction map: Protein family/human disease orthologs
Reprinted from: Giot et al., Science, 302, 1727-1736 (2003)
The fly protein-interaction map: Subcellular localization
Reprinted from: Giot et al., Science, 302, 1727-1736 (2003)
Towards a proteome-scale map of the human protein–protein interaction network
¤ Paper presents
  – First step towards a systematic and comprehensive analysis of the human interactome using
    • a stringent, high-throughput yeast two-hybrid system to test pairwise interactions among the products of 8,100 currently available Gateway-cloned open reading frames
Rual et al., Nature 437: 1173-1178 (2005)
Reprinted from: Rual et al., Nature 437: 1173-1178 (2005)
High-throughput yeast two-hybrid pipeline
¤ Stringent test
  – Second test using GAL1::HIS3 and GAL1::lacZ
  – Reduces the number of false positives
¤ Detected 2,800 interactions
Reprinted from: Rual et al., Nature 437: 1173-1178 (2005)
Overlap of CCSB-HI1 with literature data
¤ Compared the overlap between
  – Observed interactions
  – Interactions reported in the literature
¤ Conclude that the CCSB-HI1 data set contains 1% of the human interactome
  – Human interactome is estimated at 200,000 to 300,000 interactions
Reprinted from: Rual et al., Nature 437: 1173-1178 (2005)
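The "1%" figure can be reproduced as a back-of-the-envelope calculation from the numbers in these slides: about 2,800 detected interactions against an estimated interactome of 200,000 to 300,000. The sketch below is only that arithmetic; the variable names are illustrative, and the paper's own estimate rests on a more careful comparison with literature data.

```python
DETECTED = 2800                                 # interactions in CCSB-HI1
ESTIMATE_LOW, ESTIMATE_HIGH = 200_000, 300_000  # estimated interactome size

coverage_high = DETECTED / ESTIMATE_LOW   # smaller interactome, larger share
coverage_low = DETECTED / ESTIMATE_HIGH   # larger interactome, smaller share

print(f"coverage: {coverage_low:.2%} to {coverage_high:.2%}")
```

Either bound rounds to roughly one percent, consistent with the statement on the slide.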
Interaction network of disease-associated CCSB-HI1 proteins
¤ The human interactome will further
  – the understanding of human health and disease
¤ Illustrated by
  – The network of disease-associated proteins (green nodes)
    • EWS protein
Functional Maps or “-omes”
[Figure: functional maps for genes or proteins 1, 2, 3, 4, 5 ... n across “conditions”. ORFeome: genes; Phenome: mutational phenotypes; Transcriptome: expression profiles; Interactome: protein interactions; DNA interactome: protein-DNA interactions; Localizome: cellular, tissue location; Proteome: proteins]
After: Vidal M., Cell, 104, 333 (2001)
Proteome Analysis
¤ Large-scale and comprehensive analysis of the proteome has so far not been feasible
  – Lack of suitable and sensitive protein fractionation methods
    • 2-D gels are limited to a few thousand proteins only (the most abundant ones)
  – Protein characterization is slow and laborious
    • Despite enormous improvements in mass spectrometry, the characterization of individual proteins remains the bottleneck
  – Level of proteome characterization to date is in the order of a few thousand proteins at best
    • Represents 5% to 25% of the proteome
¤ Tandem affinity purification (TAP) technology constitutes an important breakthrough
  – Fast and reliable method of protein purification
A generic protein purification method for protein complex characterization
¤ Paper presents
  – a generic procedure to purify protein complexes under native conditions using
    • tandem affinity purification (TAP) tag procedure
  – Using a combination of high-affinity tags for purification
Rigaut et al., Nat. Biotechnol. 17, 1030 (1999)
Reprinted from: Kumar A. and Snyder M., Nature 415, 123 (2002)
Tag-based Characterization of protein complexes
High-affinity Tags
¤ High-affinity protein tags
  – Must allow efficient recovery of proteins present at low concentrations
    • ProtA tag: two IgG-binding units of protein A of S. aureus
      – released from matrix-bound IgG under denaturing conditions
    • CBP tag: calmodulin-binding peptide
      – released from the affinity column under mild conditions
¤ Tandem affinity purification (TAP) tag
  – A fusion cassette encoding both the ProtA tag and the CBP tag
    • Separated by a specific TEV protease recognition sequence which allows proteolytic release of the bound material under native conditions
Reprinted from: Rigaut et al., Nat. Biotechnol. 17, 1030 (1999)
Tandem affinity purification (TAP) tag
Reprinted from: Rigaut et al., Nat. Biotechnol. 17, 1030 (1999)
[Figure labels: ProtA; CBP]
The TAP Purification Procedure
Reprinted from: Rigaut et al., Nat. Biotechnol. 17, 1030 (1999)
[Figure labels: ProtA affinity purification step; CBP affinity purification step; TEV protease cleavage step]
Advantage of the Two-step Procedure
¤ Purification of U1 snRNP
  – Single-step affinity purification yields a high level of contaminating proteins
  – Two-step affinity purification yields highly specific purification with very low background
Reprinted from: Rigaut et al., Nat. Biotechnol. 17, 1030 (1999)
Functional organization of the yeast proteome by systematic analysis of protein complexes
¤ Landmark paper presents
  – Large-scale application of the TAP technology for a systematic analysis of multiprotein complexes from yeast
    • Generated gene-specific TAP tag cassettes by PCR
    • Inserted TAP cassettes by homologous recombination at the 3' end of the genes to generate fusion proteins in their native location
    • Purified protein assemblies from cellular lysates by TAP
  – Separated purified assemblies by denaturing gel electrophoresis
  – Digested individual bands with trypsin
    • Analyzed peptides by MALDI–TOF MS to identify the proteins using database search algorithms
Gavin et al., Nature 415, 141 (2002)
Reprinted from: Gavin et al., Nature 415, 141 (2002)
The Gene Targeting Procedure
[Figure label: TAP tag gene-specific cassette]
Large-scale Analysis of Protein Complexes
¤ Experimental outline
  – Started with a selection of 1,739 genes
    • 1,143 genes representing eukaryotic orthologues
    • 596 genes in a nonorthologous set
  – Generated 1,167 strains expressing tagged proteins to detectable levels
  – Analyzed 589 protein complexes
    • Comprising 418 different orthologues
  – Generated 20,946 samples for mass spectrometry
    • Identified 16,830 proteins
  – Characterized a total of 232 protein complexes
    • Comprising 1,440 distinct proteins, ~25% of the ORFs in the genome
Reprinted from: Gavin et al., Nature 415, 141 (2002)
Purification and Identification of TAP Complexes
Reprinted from: Gavin et al., Nature 415, 141 (2002)
Sensitivity and Specificity of the Approach
¤ Very efficient large-scale purification and identification of protein complexes
  – 78% of the 589 purified complexes have associated proteins
  – The remaining 22% show no interacting proteins
    • May not form stable or soluble complexes
    • The TAP tag may interfere with complex assembly or function
¤ Complexes are stable and show the same composition when purified with different entry points
  – Example: the polyadenylation machinery, responsible for eukaryotic messenger RNA cleavage and polyadenylation
    • Identified 12 of the 13 known components
    • Identified 7 new components
Reprinted from: Gavin et al., Nature 415, 141 (2002)
The Polyadenylation Protein Complex
[Figure label: new components of the polyadenylation complex]
Composition of the Polyadenylation Complex
Reprinted from: Gavin et al., Nature 415, 141 (2002)
[Figure label: protein tagged for affinity purification]
Reprinted from: Gavin et. al., Nature 415, 141 (2002)
Reliability of the TAP Method
¤ High sensitivity
  – identifies proteins present at 15 copies per cell
¤ High reproducibility
  – 70% of the proteins are detected in independent purifications
¤ Low background
  – The background comprises highly expressed proteins
    • Identified 17 contaminant proteins (heat-shock and ribosomal proteins)
¤ Limitations
  – 18% of the tagged essential genes gave no viable strains
    • The carboxy-terminal tagging can impair protein function
Reprinted from: Gavin et al., Nature 415, 141 (2002)
Organization of the purified assemblies into complexes
¤ 589 purified complexes characterized
  – 245 complexes corresponded to 98 known multiprotein complexes in yeast
  – 242 complexes corresponded to 134 new complexes
¤ In total 232 annotated TAP complexes are identified
  – 102 proteins showed no detectable association with other proteins
Number Of Proteins Per Complex
Reprinted from: Gavin et al., Nature 415, 141 (2002)
Average of 12 proteins per complex
Functional Classification Of The Complexes
Reprinted from: Gavin et al., Nature 415, 141 (2002)
Wide functional distribution of complexes
Reprinted from: Gavin et. al., Nature 415, 141 (2002)
Protein Complexes are Dynamic
¤ Complexes are not necessarily of invariable composition
  – Using distinct tagged proteins as entry points to purify a complex
    • Core components can be identified as invariably present
    • Regulatory components may be present differentially
¤ Dynamic complexes: e.g. signalling complexes
  – The interactions of a signalling enzyme may be sufficiently strong to allow the detection of distinct cellular complexes
    • They may be diagnostic for the role of these enzymes in different cellular activities
Reprinted from: Gavin et al., Nature 415, 141 (2002)
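The distinction between core and differentially present components can be illustrated with simple set operations. The sketch below is a toy under stated assumptions: the entry-point names, protein names and the function `split_core_and_variable` are hypothetical, and treating "present in every purification" as the definition of core is a simplification of the analysis in the paper.

```python
def split_core_and_variable(purifications):
    """Separate invariant from differentially recovered proteins.

    purifications: dict mapping a tagged entry point to the set of proteins
    recovered when the complex is purified from that entry point.
    """
    protein_sets = list(purifications.values())
    if not protein_sets:
        return set(), set()
    core = set.intersection(*protein_sets)       # found in every purification
    variable = set.union(*protein_sets) - core   # found in only some of them
    return core, variable


# Hypothetical purifications of one complex from three entry points.
toy_purifications = {
    "tag_P1": {"P1", "P2", "P3", "R1"},
    "tag_P2": {"P1", "P2", "P3"},
    "tag_P3": {"P1", "P2", "P3", "R2"},
}

core, variable = split_core_and_variable(toy_purifications)
print("core:", sorted(core))
print("variable:", sorted(variable))
```

In this toy case P1 to P3 would be read as core components, while R1 and R2 are candidates for regulatory or condition-dependent subunits.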
Higher-order Organization of The Proteome Map
¤ Most complexes are linked together
  – Complexes belonging to the same functional class often share components
    • mRNA metabolism, cell cycle, protein synthesis and turnover, intermediate and energy metabolism
¤ Shared components linking complexes into a network
  – The network connections reflect physical interaction of complexes
    • common architecture, localization or regulation
  – Relationships between complexes suggest integration and coordination of cellular functions
  – The more connected a complex, the more central its position in the network
Reprinted from: Gavin et al., Nature 415, 141 (2002)
The Yeast Protein Complex Network
[Figure legend, functional classes: cell cycle; signalling; transcription, DNA maintenance, chromatin structure; RNA metabolism; protein synthesis and turnover; cell polarity and structure; intermediate and energy metabolism; membrane biogenesis and traffic; protein and RNA transport]
Reprinted from: Gavin et al., Nature 415, 141 (2002)
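A network of this kind can be sketched by linking any two complexes that share at least one component and then counting links per complex. The example below is a minimal illustration: the complex names, protein names and both function names are hypothetical, and the number of links is used here only as a simple stand-in for how central a complex is.

```python
from itertools import combinations


def build_complex_network(complexes):
    """Link two complexes when they share at least one protein.

    complexes: dict mapping complex name -> set of member proteins.
    Returns a dict mapping (name_a, name_b) -> set of shared proteins.
    """
    edges = {}
    for name_a, name_b in combinations(sorted(complexes), 2):
        shared = complexes[name_a] & complexes[name_b]
        if shared:
            edges[(name_a, name_b)] = shared
    return edges


def connectivity(complexes, edges):
    """Count, for each complex, how many other complexes it is linked to."""
    degree = {name: 0 for name in complexes}
    for name_a, name_b in edges:
        degree[name_a] += 1
        degree[name_b] += 1
    return degree


# Hypothetical complexes (placeholder names).
toy_complexes = {
    "C1": {"p1", "p2", "p3"},
    "C2": {"p3", "p4"},
    "C3": {"p4", "p5"},
    "C4": {"p6"},
}

network = build_complex_network(toy_complexes)
degrees = connectivity(toy_complexes, network)
print(network)
print(degrees)
```

Here C2 shares a component with both C1 and C3 and so sits between them, while C4 shares nothing and stays unconnected.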
Protein Complexes Have a Similar Composition in Yeast and Human
Reprinted from: Gavin et. al., Nature 415, 141 (2002)
Conclusions
¤ The paper clearly demonstrates the merits of the TAP technology for
  – characterizing protein complexes from different compartments, including low-abundance and large complexes
  – TAP data and yeast two-hybrid assay data show only a very small overlap
    • The two methodologies address different aspects of protein interaction and are complementary
¤ The TAP analysis provides an outline of the eukaryotic proteome as a network of protein complexes
  – The human–yeast orthologous proteome represents core functions for the eukaryotic cell
    • Orthologous proteins are often responsible for essential functions
Recommended reading
¤ Yeast two-hybrid interaction mapping
  – The yeast two-hybrid system
    • Vidal M. and Legrain P., Nucleic Acids Res. 27: 919 (1999)
  – Protein Interaction Mapping in C. elegans Using Proteins Involved in Vulval Development
    • Walhout et al., Science 287: 116 (2000)
¤ Purification of protein complexes
  – Gavin et al., Nature 415, 141 (2002)
Further reading
¤ Protein Interaction Mapping
  – Interaction map of yeast
    • Uetz et al., Nature 403: 623 (2000)
  – Interaction map of C. elegans
    • Li et al., Science, 303, 540-543 (2004)
  – Interaction map of Drosophila
    • Giot et al., Science, 302, 1727-1736 (2003)
¤ Purification of protein complexes
  – Tandem affinity purification (TAP) tag method
    • Rigaut et al., Nat. Biotechnol. 17, 1030 (1999)