Post on 11-May-2015
Computational Protein Design 1. Challenges in Protein Engineering
Pablo Carbonell, pablo.carbonell@issb.genopole.fr
iSSB, Institute of Systems and Synthetic Biology, Genopole, University d'Évry-Val d'Essonne, France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 1 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Protein Engineering
Protein engineering is a technology that alters protein structures in order to improve their properties in applications such as pharmaceuticals, green chemistry, and biofuels.
The main challenge is to build more accurate models to predict which substitutions are the best candidates to insert in the parent protein in order to enhance the desired property.
Both experimental data and in silico predictions can contribute to the model.
The Protein Engineering Cycle
Computational Protein Design in the Engineering Cycle
Locating the Substitutions
How to select the best residues to mutate in the parent protein?
If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design.
When partial information on structure is available, a semi-rational approach is used.
If there is no information available, then a random search is used.
Choosing the Right Strategy
Additivity and Cooperativity Effects
Additivity of the effects of substitutions is rarely seen when screening mutants.
To avoid dead ends, a screening strategy is typically based on building libraries with simultaneous mutations in order to find cooperativity effects.
Testing for simultaneous mutations comes at the cost of a larger screening effort.
Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed for a larger search of the sequence space.
Figure: Additivity/cooperativity experiments searching for high-affinity antibody variants [Chodorge et al., 2008]
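To make the screening cost concrete, the following minimal sketch counts the variants in a library that mutates several positions at once, and measures how far a double mutant deviates from additivity. The function names, the 20-letter amino acid alphabet, and the example effect values are illustrative assumptions, not taken from the slides.

```python
from math import comb

N_AMINO_ACIDS = 20  # standard alphabet; an assumption for this sketch


def library_size(n_positions: int, k_simultaneous: int) -> int:
    """Variants carrying exactly k substitutions among n candidate positions."""
    # Choose which k positions are mutated, then one of the 19
    # non-parent residues at each chosen position.
    return comb(n_positions, k_simultaneous) * (N_AMINO_ACIDS - 1) ** k_simultaneous


def deviation_from_additivity(effect_a: float, effect_b: float, effect_ab: float) -> float:
    """Double-mutant effect minus the sum of the single-mutant effects.

    Zero means the substitutions are additive; a non-zero value
    indicates cooperativity (in either direction).
    """
    return effect_ab - (effect_a + effect_b)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, library_size(10, k))
    # Hypothetical effects (for example, changes in binding energy).
    print(deviation_from_additivity(-1.0, -0.5, -2.5))
```

With 10 candidate positions, the count grows from 190 single mutants to 16,245 double mutants and 823,080 triple mutants, which is the screening cost the slide refers to.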
Types of Protein Interactions
Protein-ligand binding (drug-target, enzyme-substrate)
Protein-nucleotide (DNA/RNA) binding
Protein-peptide interaction
Protein-protein interaction
Figure: Protein-protein interactions. Adapted from [Perkins et al., 2010]
Protein-protein complexes [Nooren and Thornton, 2003]:
homo-oligomeric vs. hetero-oligomeric
non-obligate vs. obligate
transient (weak and strong) vs. permanent
Protein Specificity and Promiscuity
Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
Small-molecule ligands: similar chemical structure, usually with stereoselectivity
Proteins or peptides: similar structural motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one (moonlighting)
Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site
Figure: lock and key [Fischer, 1894], induced fit [Koshland, 1958], and conformational selection [Boehr et al., 2009]
Protein Specificity and Promiscuity: The Case of PPIs
PPI: any physical binding between proteins that occurs in vivo in the cell
PPI screening methods still have some limitations:
Y2H: high false-positive rate
TAP-MS: limited scalability
Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd-generation DNA-seq)
Transient and PTM-dependent interactions are often missed
Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners
Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, and evolvability. Party and date hubs are under debate.
Figure: single-interface and multi-interface hubs [Kim et al., 2006]
Data Sources
Enzymatic activity
  BRENDA: experimental parameters
  KEGG, MetaCyc: metabolic networks
  Catalytic Site Atlas: catalytic sites
Data validation and prediction
  GeneMANIA: lists of genes with functionally similar or shared properties
  STRING: based on genomic context, HT experiments, co-expression, literature
  ComPASS: assigns confidence to an interaction detected by MS
Primary PPI databases
  DIP, BioGRID, IntAct, MINT
  Common languages: PSICQUIC (expression, co-localization, genetic, metabolic, signaling pathways, experimental data), SBML
  Building the network: Cytoscape
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications
Goals
Increased catalytic function relative to the parent
Altered specificity, stereospecificity, or affinity to interacting partners
Increased stability
Property              Parameters
Thermostability       $T_{50}$
Catalytic activity    $k_{cat}$, $K_M$, $k_{cat}/K_M$
Binding specificity   $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$, $K_d$, $K_I$
Binding affinity      $K_a = 1/K_d$, $\Delta G = -RT \ln(1/K_d)$
A paradigm shift in the last 2 decades
PCR and recombinant gene technologies
Recreation of evolution in the lab
Computer algorithms
Goal 1: Increasing the Thermostability
Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing.
Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes.
Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions.
Main design techniques
Sequence-based design: comparison through multiple alignments
Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
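As an illustration of how a T50 value might be read off a thermal inactivation experiment, the sketch below linearly interpolates the temperature at which residual activity crosses 50%. The interpolation approach, the function name, and the data points are assumptions made for this example; real measurements are often fitted with a sigmoidal model instead.

```python
def estimate_t50(temperatures, residual_activity):
    """Interpolate the temperature at which residual activity equals 0.5.

    temperatures: increasing incubation temperatures (e.g. in Celsius).
    residual_activity: fraction of activity left after incubation (0..1).
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        # Look for the interval in which activity drops through 50%.
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            fraction = (a_lo - 0.5) / (a_lo - a_hi)
            return t_lo + fraction * (t_hi - t_lo)
    raise ValueError("residual activity never crosses 50%")


# Hypothetical data: activity remaining after 10 minutes at each temperature.
temps = [40, 45, 50, 55, 60, 65]
activity = [1.00, 0.95, 0.80, 0.40, 0.10, 0.02]

if __name__ == "__main__":
    print(f"T50 is approximately {estimate_t50(temps, activity):.2f}")
```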
Goal 2: Increasing the Catalytic Activity
How to quantify enzyme activity? Michaelis-Menten model of kinetics:
\[ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \tag{1} \]
\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2} \]
\[ \frac{d[P]}{dt} = k_2 [ES] \tag{3} \]
$k_2$ is also known as $k_{cat}$ or turnover rate (in more complex cases $k_{cat}$ is a function of several rates).
$k_{cat}$ alone is not enough: we need to quantify the affinity of the enzyme for the substrate.
Enzyme Kinetics
Assumptions
First assumption: the concentration of the substrate-bound enzyme $[ES]$ changes much more slowly than the concentrations of substrate $[S]$ and product $[P]$, so it is approximately constant:
\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4} \]
Second assumption: the total concentration of enzyme $[E]_0$ does not change with time:
\[ [E]_0 = [E] + [ES] \approx \text{const} \tag{5} \]
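The two assumptions can be checked numerically. The sketch below integrates the rate equations with a simple forward-Euler scheme and compares the simulated $[ES]$ with the value implied by setting $d[ES]/dt = 0$. The rate constants, concentrations, units, step size, and function name are all hypothetical choices for illustration; the substrate equation $d[S]/dt = -k_1[E][S] + k_{-1}[ES]$ follows from the reaction scheme but is not written on the slides.

```python
def simulate_michaelis_menten(k1, k_minus1, k2, e0, s0, t_end, dt=1e-4):
    """Forward-Euler integration of the Michaelis-Menten scheme.

    Returns the final (substrate, complex, product) concentrations.
    """
    s, es, p = s0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        e = e0 - es                                  # enzyme conservation
        d_es = k1 * e * s - es * (k_minus1 + k2)     # complex formation and decay
        d_p = k2 * es                                # product formation
        d_s = -k1 * e * s + k_minus1 * es            # substrate consumption
        s += d_s * dt
        es += d_es * dt
        p += d_p * dt
    return s, es, p


if __name__ == "__main__":
    # Hypothetical parameters (concentrations in uM, time in s).
    k1, k_minus1, k2 = 10.0, 10.0, 10.0
    s, es, p = simulate_michaelis_menten(k1, k_minus1, k2, e0=0.01, s0=10.0, t_end=1.0)
    km = (k_minus1 + k2) / k1
    es_steady = 0.01 * s / (km + s)   # value implied by d[ES]/dt = 0
    print(f"simulated [ES] = {es:.6f}, steady-state estimate = {es_steady:.6f}")
```

The agreement is close here because the enzyme concentration is much smaller than the substrate concentration; with comparable concentrations the steady-state approximation would be expected to degrade.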
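The Michaelis constant $K_M$
Substituting $[E] = [E]_0 - [ES]$ into the steady-state condition:
\[ 0 = k_1 [S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6} \]
\[ k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7} \]
\[ [S][E]_0 = [S][ES] + [ES] \frac{k_{-1} + k_2}{k_1} \tag{8} \]
$K_M$: Michaelis constant
\[ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} \]
With this definition, (8) gives $[ES] = [E]_0 [S] / (K_M + [S])$.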
The Michaelis Constant $K_M$ and the steady-state flux
Rate of product formation (flux):
\[ \frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11} \]
With $v_{max} = k_2 [E]_0$:
\[ v = \frac{v_{max} [S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}} v_{max} \tag{12} \]
$K_M$ can be measured as the substrate concentration $[S]$ at which the rate of product formation is half of the maximum:
\[ v = \frac{v_{max}}{2} \tag{13} \]
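The rate law and the half-maximum property can be verified in a few lines. This is a minimal sketch; the function name and the numerical values are arbitrary choices for illustration.

```python
def michaelis_menten_rate(s, vmax, km):
    """Steady-state rate v = vmax * [S] / (KM + [S])."""
    return vmax * s / (km + s)


if __name__ == "__main__":
    vmax, km = 1.0, 2.0  # hypothetical values in arbitrary units
    # At [S] = KM the rate is half of vmax.
    print(michaelis_menten_rate(km, vmax, km))
    # At [S] >> KM the rate approaches vmax (saturation).
    print(michaelis_menten_rate(1000 * km, vmax, km))
```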
Determining $K_M$ from the concentration curve
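One common way to extract $K_M$ and $v_{max}$ from rate measurements is the Lineweaver-Burk (double-reciprocal) linearization, $1/v = (K_M / v_{max})(1/[S]) + 1/v_{max}$. The slide does not say which method it uses, so the sketch below is an assumption; the function name and the synthetic data are also illustrative. The double-reciprocal transform amplifies noise at low $[S]$, so nonlinear fitting is generally preferred for real data.

```python
def fit_lineweaver_burk(s_values, v_values):
    """Estimate (KM, vmax) by least squares on 1/v versus 1/[S]."""
    x = [1.0 / s for s in s_values]
    y = [1.0 / v for v in v_values]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Ordinary least-squares slope and intercept.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept     # intercept = 1 / vmax
    km = slope * vmax          # slope = KM / vmax
    return km, vmax


if __name__ == "__main__":
    # Noise-free synthetic data generated with KM = 2 and vmax = 1.
    substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
    rates = [s / (2.0 + s) for s in substrate]
    km, vmax = fit_lineweaver_burk(substrate, rates)
    print(f"KM = {km:.3f}, vmax = {vmax:.3f}")
```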
Evaluating Enzyme Efficiency
$k_{cat}/K_M$ is often used as a specificity constant to compare the relative reaction rates of pairs of substrates transformed by an enzyme.
For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:
\[ \frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A) [S_A]}{(k_{cat}^B / K_M^B) [S_B]} \tag{14} \]
At $[S_A] = [S_B]$, $k_{cat}/K_M$ provides a measure of substrate promiscuity efficiency.
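The rate ratio for two competing substrates is straightforward to compute. The sketch below uses hypothetical kinetic parameters for a native substrate A and a promiscuous substrate B; the function name and all values are illustrative assumptions.

```python
def rate_ratio(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """Ratio vA / vB for two substrates competing for the same enzyme."""
    return (kcat_a / km_a * conc_a) / (kcat_b / km_b * conc_b)


if __name__ == "__main__":
    # Hypothetical parameters: kcat in 1/s, KM and concentrations in uM.
    ratio = rate_ratio(kcat_a=100.0, km_a=10.0, conc_a=1.0,
                       kcat_b=5.0, km_b=50.0, conc_b=1.0)
    print(f"vA / vB = {ratio:.1f}")
```

At equal substrate concentrations the ratio reduces to the ratio of the two specificity constants, so the concentrations cancel out.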
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
Protein-protein interaction
Permanent or obligate: in multi-subunit proteins, where it can have a structural or functional role
Transient: in signaling, transport, and regulation
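3.1 Protein Binding Affinity
Dissociation constant
\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]
\[ \frac{d[AB]}{dt} = k_1 [A][B] - k_{-1} [AB] \tag{16} \]
In equilibrium:
\[ 0 = k_1 [A][B] - k_{-1} [AB] \tag{17} \]
\[ K_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]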
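3.1 Protein Binding Affinity
Affinity constant
\[ K_a = \frac{1}{K_d} \tag{19} \]
In antibodies:
\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]
Binding free energy
\[ \Delta G = -RT \ln K_a = -RT \ln \frac{1}{K_d} \tag{21} \]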
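As a worked example of the binding free energy formula, the sketch below converts a dissociation constant into $\Delta G$. It assumes a 1 M standard state (so that the constant is dimensionless inside the logarithm) and a temperature of 298.15 K; the function name and the example dissociation constants are illustrative.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Binding free energy in J/mol from a dissociation constant in mol/L.

    Assumes a 1 M standard state.
    """
    ka = 1.0 / kd_molar
    return -R * temperature_k * math.log(ka)


if __name__ == "__main__":
    for kd in (1e-6, 1e-9):  # hypothetical micromolar and nanomolar binders
        dg = binding_free_energy(kd)
        print(f"Kd = {kd:.0e} M -> dG = {dg / 1000:.1f} kJ/mol")
```

Each tenfold decrease in the dissociation constant lowers $\Delta G$ by $RT \ln 10$, about 5.7 kJ/mol at this temperature.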
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding ($K_M$)
Transition-state binding ($K_{tx}$)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation.
Binding specificity for a partner is determined by comparing either $k_{cat}/K_M$, $K_a$, or $K_d$ across all partners.
$K_I$: inhibition constant, used when an inhibitor competes with a ligand
Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands)
Small-molecule ligands: similar chemical structure, usually with stereoselectivity
Proteins or peptides: similar structural motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one
Allostery: regulation of a protein by binding of some ligand (the effector)
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Introducing the Substitutions
Site-directed (saturation) mutagenesis
1. Cloning the DNA of interest into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion, or insertion) is annealed to the target region
4. Extending the mutant oligonucleotide using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli
Error-prone PCR
Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
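On the computational side, the set of variants targeted by saturation mutagenesis at one position can be enumerated directly at the protein level. The sketch below is an in silico illustration only, not a model of the wet-lab protocol; the function name, the mutation labels, and the five-residue parent sequence are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_variants(parent, position):
    """All 19 single-substitution variants at a 1-based position.

    Returns a dict mapping labels such as 'T3A' to the variant sequence.
    """
    index = position - 1
    wild_type = parent[index]
    return {
        f"{wild_type}{position}{aa}": parent[:index] + aa + parent[index + 1:]
        for aa in AMINO_ACIDS
        if aa != wild_type
    }


if __name__ == "__main__":
    variants = saturation_variants("MKTAY", 3)  # hypothetical parent sequence
    print(len(variants))
    print(variants["T3A"])
```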
Recombination and DNA-shuffling
A natural approach to making multiple mutations is recombination
Circular permutation: to alter protein topology
DNA shuffling: to perform functional domain or motif shuffling in vitro
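At the sequence level, both operations can be sketched as simple string manipulations. This is a deliberately simplified illustration: a real circular permutant usually needs a linker joining the original termini, and real DNA shuffling fragments and reassembles several parents with multiple crossovers. The function names and toy sequences are assumptions for this example.

```python
def circular_permutation(sequence, new_start):
    """Rotate a sequence so that the residue at new_start (0-based) comes first."""
    return sequence[new_start:] + sequence[:new_start]


def single_crossover(parent_a, parent_b, crossover):
    """Chimera taking parent_a before the crossover point and parent_b after it."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    return parent_a[:crossover] + parent_b[crossover:]


if __name__ == "__main__":
    print(circular_permutation("ABCDEF", 2))
    print(single_crossover("AAAAAA", "BBBBBB", 2))
```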
Recombinant Protein Folding
E. coli is typically the first choice for expressing a heterologous protein.
However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.
Some misfolding-related issues:
Multidomain proteins usually require the assistance of folding modulators such as chaperones and/or foldases
The environment (crowding, pH, osmolarity, etc.)
Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded protein:
Insoluble aggregation into inclusion bodies
Degradation (proteolysis)
Figure: E. coli expressing human leptin as inclusion bodies
Directed Evolution
A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold.
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.
An iterative process:
Identifying a good starting sequence, usually containing some level of latent promiscuity
Creating a library of variants
Selecting variants with improved function (mutation and screening)
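The iterative process can be mimicked in silico with a toy loop. In the sketch below, the fitness function (number of positions matching a hypothetical optimal sequence) stands in for an experimental screen, and each round builds a library of single-point mutants of the current best variant. Every name, sequence, and parameter is an assumption chosen for illustration; real campaigns have no access to a known target.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fitness(sequence, target):
    """Toy screen: number of positions matching a hypothetical optimum."""
    return sum(a == b for a, b in zip(sequence, target))


def mutate(sequence, rng):
    """Introduce one random point substitution."""
    pos = rng.randrange(len(sequence))
    return sequence[:pos] + rng.choice(AMINO_ACIDS) + sequence[pos + 1:]


def directed_evolution(parent, target, rounds=20, library_size=50, seed=0):
    """Iterate library creation and selection, keeping only improvements."""
    rng = random.Random(seed)
    best = parent
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        candidate = max(library, key=lambda s: fitness(s, target))
        if fitness(candidate, target) > fitness(best, target):
            best = candidate  # carry the improved variant into the next round
    return best


if __name__ == "__main__":
    parent, target = "MKTAYIAKQR", "MKVLYIGKWR"  # hypothetical sequences
    evolved = directed_evolution(parent, target)
    print(fitness(parent, target), "->", fitness(evolved, target))
```

Because a candidate replaces the current best only when it scores strictly higher, the fitness never decreases from round to round; this greedy rule is one simple design choice among many possible selection schemes.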
From Natural Enzymes to Protein Engineering to Computational Protein Design
Bibliography I
David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232
Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013
Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174
D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179
Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359
James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 2 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 3 40
Protein Engineering
Protein engineering is a technology that alters protein structures in order toimprove their properties in applications such as pharmaceuticals green chemistryand biofuels
The main challenge is to build more accurate models to predict whichsubstitutions are the best candidates to insert in the parent protein in order toenhance the desired property
Both experimental data and in silico predictions can contribute to the model
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 4 40
Protein Engineering
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 5 40
The Protein Engineering Cycle
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 6 40
Computational Protein Design in the Engineering Cycle
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 7 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 8 40
Locating the Substitutions
How to select the best residues to mutate in theparent protein
If detailed structural information on the parentenzyme is available a rational approach canbe applied to the design
When partial information on structure isavailable a semi-rational approach is used
If there is no information available then arandom search is used
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 9 40
Choosing the Right Strategy
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40
Additivity and Cooperativity Effects
Additivity of the effects of substitutions israrely seen when screening mutants
In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening
Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity
antibody variants
[Chodorge et al 2008]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40
Types of Protein Interactions
Protein-ligand binding(drug-target enzyme-substrate)
Protein-nucleotide(DNARNA) binding)
Protein-peptide interaction Protein-protein interaction
Protein-Protein interactions
Adapted from [Perkins et al 2010]
Protein-protein complexes
homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent
[Nooren and Thornton 2003]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40
Protein Specificity and Promiscuity
Multispecificity broad partner specificity(multiple substrates proteins ligands)
Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs
Promiscuity the ability to participate in afunction other than the native one(moonlighting)
Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite
Lock and key Induced fit
[Fischer 1894] [Koshland 1958]
Conformational selection
[Boehr et al 2009]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40
Protein Specificity and Promiscuity The Case of PPIs
PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations
Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)
Transient and PTM-dependent interactions are oftenmissed
Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners
Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate
single-interface multi-interface
[Kim et al 2006]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40
Data Sources
Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites
Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS
Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes
to the challenge of generating novel proteins for therapeutic and biomedicalapplications
GoalsIncreased catalytic function related to the parent
Altered specificity stereospecificity or affinity to interacting partners
Increased stability
Property ParametersThermostability T50
Catalytic activity kcat KM kcatKM
Binding specificity (kcatKM )A(kcatKM )B
Kd KI
Binding affinity Ka = 1Kd
∆G = minusRT ln 1Kd
A paradigm shift in the last 2decades
PCR and recombinant genetechnologies
Recreation of evolution in thelab
Computer algorithms
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40
Goal 1 Increasing the Thermostability
Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation
Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions
Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40
Goal 2 Increasing the Catalytic Activity
How to quantify enzyme activity Michaelis-Menten model of kinetics
E + Sk1
kminus1
ES k2
E + P (1)
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) (2)
d [P]
dt= k2[ES] (3)
k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)
kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineering to Computational Protein Design
Bibliography I
David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature chemical biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232
Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013
Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174
D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179
Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359
James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007
Locating the Substitutions
How to select the best residues to mutate in the parent protein?
If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design
When partial information on structure is available, a semi-rational approach is used
If there is no information available, then a random search is used
Choosing the Right Strategy
Additivity and Cooperativity Effects
Additivity of the effects of substitutions is rarely seen when screening mutants
To avoid dead ends, a screening strategy is typically based on building libraries with simultaneous mutations in order to find cooperativity effects
Testing for simultaneous mutations comes at the cost of a larger screening effort
Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed for a larger search in sequence space
[Figure: additivity/cooperativity experiments searching for high-affinity antibody variants [Chodorge et al., 2008]]
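To make the screening cost of simultaneous mutations concrete, here is a small Python sketch. It assumes a 20-letter amino acid alphabet and counts variants with exactly k substituted positions; the helper names `library_size` and `cooperativity` are hypothetical, and the cooperativity measure is simply the deviation of a double mutant from the sum of its single-mutant effects.

```python
from math import comb

def library_size(n_positions, k_simultaneous, alphabet=20):
    """Number of distinct variants with exactly k substituted positions
    chosen among n_positions, each with (alphabet - 1) alternatives."""
    return comb(n_positions, k_simultaneous) * (alphabet - 1) ** k_simultaneous

def cooperativity(effect_a, effect_b, effect_ab):
    """Deviation of the double mutant from additivity (same units as the
    inputs). Zero means the two substitutions are additive."""
    return effect_ab - (effect_a + effect_b)

singles = library_size(10, 1)   # 190 variants
doubles = library_size(10, 2)   # 16245 variants
```

For ten candidate positions, moving from single to double substitutions grows the library from 190 to 16245 variants, which is the kind of increase the slide refers to.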
Types of Protein Interactions
Protein-ligand binding (drug-target, enzyme-substrate)
Protein-nucleotide (DNA/RNA) binding
Protein-peptide interaction
Protein-protein interaction
[Figure: protein-protein interactions. Adapted from [Perkins et al., 2010]]
Protein-protein complexes [Nooren and Thornton, 2003]:
  Homo-oligomeric or hetero-oligomeric
  Non-obligate or obligate
  Transient (weak and strong) or permanent
Protein Specificity and Promiscuity
Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: similar structural motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one (moonlighting)
Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site
[Figure: lock and key [Fischer, 1894]; induced fit [Koshland, 1958]; conformational selection [Boehr et al., 2009]]
Protein Specificity and Promiscuity: The Case of PPIs
PPI: any physical binding between proteins that occurs in vivo, in the cell
PPI screening methods still have some limitations:
  Y2H: high false-positive rate
  TAP-MS: limited scalability
  Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd-generation DNA-seq)
Transient and PTM-dependent interactions are often missed
Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners
Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, and evolvability. Party and date hubs are under debate
[Figure: single-interface vs. multi-interface [Kim et al., 2006]]
Data Sources
Enzymatic activity:
  BRENDA: experimental parameters
  KEGG, MetaCyc: metabolic networks
  Catalytic Site Atlas: catalytic sites
Data validation and prediction:
  GeneMANIA: lists of genes with functionally similar or shared properties
  STRING: based on genomic context, HT experiments, co-expression, literature
  ComPASS: assigns confidence to an interaction detected by MS
Primary PPI databases:
  DIP, BioGRID, IntAct, MINT
  Common languages: PSICQUIC (expression, co-localization, genetic, metabolic, signaling pathways, experimental data), SBML
  Building the network: Cytoscape
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications
Goals:
  Increased catalytic function relative to the parent
  Altered specificity, stereospecificity, or affinity to interacting partners
  Increased stability

Property              Parameters
Thermostability       $T_{50}$
Catalytic activity    $k_{cat}$, $K_M$, $k_{cat}/K_M$
Binding specificity   $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$, $K_d$, $K_I$
Binding affinity      $K_a = 1/K_d$, $\Delta G = -RT \ln(1/K_d)$

A paradigm shift in the last 2 decades:
  PCR and recombinant gene technologies
  Recreation of evolution in the lab
  Computer algorithms
Goal 1: Increasing the Thermostability
Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing
Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions
Main design techniques:
  Sequence-based design: comparison through multiple alignments
  Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
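As an illustration of how T50 might be read off an inactivation experiment, the following Python sketch linearly interpolates the temperature at which residual activity crosses 50%. The function name `estimate_t50` and the data points are made up for the example; in practice a sigmoidal fit is often preferred over linear interpolation.

```python
def estimate_t50(temperatures, residual_activity):
    """Linearly interpolate the temperature at which residual activity
    crosses 0.5. Temperatures must be ascending; activities are the
    fraction of activity remaining after incubation at each temperature."""
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("activity does not cross 50% in the measured range")

# Made-up measurements, for illustration only.
temps = [40, 50, 60, 70]
activity = [1.0, 0.9, 0.3, 0.0]
t50 = estimate_t50(temps, activity)
```

With these invented numbers the crossing falls between 50 and 60 degrees, closer to 60 because activity is still at 90% at 50 degrees.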
Goal 2: Increasing the Catalytic Activity
How to quantify enzyme activity? Michaelis-Menten model of kinetics:
$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \tag{1}$$
$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \tag{2}$$
$$\frac{d[P]}{dt} = k_2[ES] \tag{3}$$
$k_2$ is also known as $k_{cat}$ or turnover rate (in more complex cases, $k_{cat}$ is a function of several rates)
$k_{cat}$ alone is not enough: we need to quantify the affinity of the enzyme for the substrate
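The reaction scheme and rate equations on this slide can be explored numerically. The sketch below integrates the mass-action equations with plain forward Euler steps; the rate constants, concentrations, and step size are arbitrary values chosen for illustration, and a production simulation would normally use an adaptive ODE solver instead.

```python
def simulate(k1, k_minus1, k2, e0, s0, dt=1e-4, t_end=5.0):
    """Integrate E + S <-> ES -> E + P with forward Euler steps.
    Returns the final concentrations (E, S, ES, P)."""
    e, s, es, p = e0, s0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        bind = k1 * e * s          # E + S -> ES
        unbind = k_minus1 * es     # ES -> E + S
        cat = k2 * es              # ES -> E + P
        e += dt * (-bind + unbind + cat)
        s += dt * (-bind + unbind)
        es += dt * (bind - unbind - cat)
        p += dt * cat
    return e, s, es, p

# Arbitrary parameters in consistent (unspecified) units.
e, s, es, p = simulate(k1=10.0, k_minus1=1.0, k2=1.0, e0=0.1, s0=1.0)
```

Two conservation laws give a quick sanity check on the output: E + ES stays equal to the initial enzyme concentration, and S + ES + P stays equal to the initial substrate concentration.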
Enzyme Kinetics
Assumptions:
First assumption: the concentration of the substrate-bound enzyme [ES] changes much more slowly than the concentrations of substrate [S] and product [P], so it can be treated as approximately constant
$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4}$$
Second assumption: the total concentration of enzyme $[E]_0$ does not change with time
$$[E]_0 = [E] + [ES] \approx \text{const} \tag{5}$$
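The Michaelis constant $K_M$
$$0 = k_1[S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6}$$
$$k_1[S][E]_0 = k_1[S][ES] + [ES](k_{-1} + k_2) \tag{7}$$
$$[S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \tag{8}$$
$$[ES] = \frac{[S][E]_0}{[S] + \frac{k_{-1} + k_2}{k_1}} \tag{9}$$
$K_M$: Michaelis constant
$$K_M = \frac{k_{-1} + k_2}{k_1} \tag{10}$$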
The Michaelis Constant $K_M$ and the steady-state flux
Rate of product formation (flux):
$$\frac{d[P]}{dt} = v = k_2[ES] = k_2[E]_0\frac{[S]}{K_M + [S]} \tag{11}$$
$$v = \frac{v_{max}[S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\,v_{max} \tag{12}$$
where $v_{max} = k_2[E]_0$
$K_M$ can be measured as the substrate concentration [S] at which the rate of product formation is half of the maximum:
$$v = \frac{v_{max}}{2} \tag{13}$$
Determining KM from the concentration curve
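One way to extract KM and vmax from measured rates is a linearized fit. The sketch below uses the Hanes-Woolf form, [S]/v = [S]/vmax + KM/vmax, with an ordinary least-squares line; this is one common choice and not necessarily the method behind the figure on this slide. The data are synthetic and noise-free, generated from assumed values vmax = 2.0 and KM = 0.5.

```python
def fit_michaelis_menten(s_values, v_values):
    """Estimate (vmax, KM) by least squares on the Hanes-Woolf form
    [S]/v = [S]/vmax + KM/vmax."""
    xs = list(s_values)
    ys = [s / v for s, v in zip(s_values, v_values)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / vmax
    intercept = mean_y - slope * mean_x    # equals KM / vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km

# Synthetic, noise-free data generated with vmax = 2.0 and KM = 0.5.
substrate = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
rates = [2.0 * s / (0.5 + s) for s in substrate]
vmax, km = fit_michaelis_menten(substrate, rates)
```

With real, noisy measurements a direct nonlinear fit of the Michaelis-Menten equation generally weights the errors more sensibly than any linearization.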
Evaluating Enzyme Efficiency
$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates at which an enzyme transforms pairs of substrates
For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:
$$\frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A)\,[S_A]}{(k_{cat}^B / K_M^B)\,[S_B]} \tag{14}$$
At $[S_A] = [S_B]$, the ratio of $k_{cat}/K_M$ values provides a measure of substrate promiscuity
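The rate ratio for two competing substrates is simple to evaluate. In this Python sketch the function name `rate_ratio` and all kinetic parameters are invented for illustration; only the formula comes from the slide.

```python
def rate_ratio(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """v_A / v_B for two substrates competing for the same enzyme."""
    return (kcat_a / km_a * conc_a) / (kcat_b / km_b * conc_b)

# Made-up parameters (kcat in 1/s, KM and concentrations in mM).
ratio = rate_ratio(kcat_a=100.0, km_a=0.5, conc_a=1.0,
                   kcat_b=10.0, km_b=2.0, conc_b=1.0)
```

At equal substrate concentrations the result reduces to the ratio of the two specificity constants, here 200 to 5, so substrate A is converted 40 times faster than substrate B.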
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
Protein-protein interaction:
  Permanent or obligate: in multi-subunit proteins, it can have a structural or functional role
  Transient: in signaling, transport, and regulation
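3.1 Protein Binding Affinity
Dissociation constant:
$$A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15}$$
$$\frac{d[AB]}{dt} = k_1[A][B] - k_{-1}[AB] \tag{16}$$
In equilibrium:
$$0 = k_1[A][B] - k_{-1}[AB] \tag{17}$$
$$k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18}$$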
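The equilibrium relation above can be combined with mass balance to compute how much complex forms. The following Python sketch assumes total concentrations are known and solves the resulting quadratic for [AB]; the function names and the numerical values (an association rate of 1e6 per molar per second and a dissociation rate of 1e-3 per second) are illustrative assumptions.

```python
from math import sqrt

def dissociation_constant(k_on, k_off):
    """k_d = k_-1 / k_1, in units of concentration."""
    return k_off / k_on

def complex_at_equilibrium(a_total, b_total, kd):
    """Solve [A][B]/[AB] = kd with [A] = a_total - [AB] and
    [B] = b_total - [AB]; returns the physically meaningful root."""
    total = a_total + b_total + kd
    return (total - sqrt(total ** 2 - 4.0 * a_total * b_total)) / 2.0

kd = dissociation_constant(k_on=1e6, k_off=1e-3)   # about 1 nM
ab = complex_at_equilibrium(a_total=1e-8, b_total=1e-8, kd=kd)
```

The smaller root of the quadratic is taken because the larger one would exceed the total amount of either partner.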
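3.1 Protein Binding Affinity
Affinity constant:
$$k_a = \frac{1}{k_d} \tag{19}$$
In antibodies:
$$Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20}$$
Binding free energy:
$$\Delta G = -RT \ln k_a = -RT \ln\frac{1}{k_d} \tag{21}$$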
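As a worked example of the binding free energy relation, the Python sketch below converts a dissociation constant into a free energy. It assumes the dissociation constant is expressed in molar units relative to a 1 M standard state and uses 298.15 K as a default temperature; both are conventions added here, not stated on the slide.

```python
from math import log

R = 8.314462618  # gas constant, J/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Delta G = -RT ln(k_a) = RT ln(k_d), in J/mol.
    kd_molar is taken relative to a 1 M standard state."""
    ka = 1.0 / kd_molar
    return -R * temperature_k * log(ka)

dg_kj = binding_free_energy(1e-9) / 1000.0   # roughly -51 kJ/mol
```

A nanomolar binder therefore corresponds to about -51 kJ/mol under these assumptions, and each tenfold improvement in affinity adds roughly -5.7 kJ/mol at this temperature.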
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder, in Protein Engineering Handbook (2009)]
Ground-state binding ($K_M$)
Transition-state binding ($K_{tx}$)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation
Binding specificity for a given partner is determined by comparing $k_{cat}/K_M$, $k_a$, or $k_d$ across all partners
$K_I$: inhibition constant, used when an inhibitor competes with a ligand
Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: similar structural motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one
Allostery: regulation of a protein by binding of some ligand (the effector)
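For the inhibition constant mentioned on this slide, the textbook model of competitive inhibition gives a compact illustration: the inhibitor raises the apparent Michaelis constant by a factor (1 + [I]/KI) and leaves the maximum rate unchanged. The slide does not spell out this formula, so the sketch below is an added assumption, with invented parameter values.

```python
def rate_with_competitive_inhibitor(vmax, km, s, inhibitor, ki):
    """Michaelis-Menten rate when an inhibitor competes for the same
    site: apparent KM is scaled by (1 + [I]/KI), vmax is unchanged."""
    km_apparent = km * (1.0 + inhibitor / ki)
    return vmax * s / (km_apparent + s)

# Made-up values in consistent (unspecified) units.
v_free = rate_with_competitive_inhibitor(2.0, 0.5, 0.5,
                                         inhibitor=0.0, ki=1.0)
v_inhibited = rate_with_competitive_inhibitor(2.0, 0.5, 0.5,
                                              inhibitor=1.0, ki=1.0)
```

A smaller KI means the inhibitor binds more tightly, so less of it is needed to produce the same drop in rate.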
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller, in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Introducing the Substitutions
Site-directed (saturation) mutagenesis:
  1. The DNA of interest is cloned into a plasmid vector
  2. The plasmid DNA is denatured to produce single strands
  3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion, or insertion) is annealed to the target region
  4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template
  5. The heteroduplex is propagated by transformation in E. coli
Error-prone PCR:
  Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
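The two ways of introducing substitutions can be mimicked in silico. This Python sketch enumerates the single-site saturation variants of a protein sequence and makes an error-prone copy of a DNA string; the function names, the substitution-only error model, and the uniform choice of replacement base are simplifying assumptions, since real polymerases show biased error spectra.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BASES = "ACGT"

def saturation_variants(protein, position):
    """All 19 single substitutions at one (0-based) position."""
    original = protein[position]
    return [protein[:position] + aa + protein[position + 1:]
            for aa in AMINO_ACIDS if aa != original]

def error_prone_copy(dna, error_rate, rng):
    """Copy a DNA string, replacing each base by a different one with
    probability error_rate (substitutions only, no indels)."""
    copy = []
    for base in dna:
        if rng.random() < error_rate:
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)

variants = saturation_variants("MKT", 1)
mutant = error_prone_copy("ATGAAAACC", error_rate=0.1,
                          rng=random.Random(42))
```

Saturation at a chosen site gives a small, fully enumerable library, whereas the error-prone copy spreads a few random changes over the whole gene.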
Protein Engineering
Protein engineering is a technology that alters protein structures in order toimprove their properties in applications such as pharmaceuticals green chemistryand biofuels
The main challenge is to build more accurate models to predict whichsubstitutions are the best candidates to insert in the parent protein in order toenhance the desired property
Both experimental data and in silico predictions can contribute to the model
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 4 40
Protein Engineering
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 5 40
The Protein Engineering Cycle
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 6 40
Computational Protein Design in the Engineering Cycle
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 7 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 8 40
Locating the Substitutions
How to select the best residues to mutate in theparent protein
If detailed structural information on the parentenzyme is available a rational approach canbe applied to the design
When partial information on structure isavailable a semi-rational approach is used
If there is no information available then arandom search is used
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 9 40
Choosing the Right Strategy
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40
Additivity and Cooperativity Effects
Additivity of the effects of substitutions israrely seen when screening mutants
In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening
Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity
antibody variants
[Chodorge et al 2008]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40
Types of Protein Interactions
Protein-ligand binding(drug-target enzyme-substrate)
Protein-nucleotide(DNARNA) binding)
Protein-peptide interaction Protein-protein interaction
Protein-Protein interactions
Adapted from [Perkins et al 2010]
Protein-protein complexes
homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent
[Nooren and Thornton 2003]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40
Protein Specificity and Promiscuity
Multispecificity broad partner specificity(multiple substrates proteins ligands)
Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs
Promiscuity the ability to participate in afunction other than the native one(moonlighting)
Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite
Lock and key Induced fit
[Fischer 1894] [Koshland 1958]
Conformational selection
[Boehr et al 2009]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40
Protein Specificity and Promiscuity The Case of PPIs
PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations
Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)
Transient and PTM-dependent interactions are oftenmissed
Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners
Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate
single-interface multi-interface
[Kim et al 2006]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40
Data Sources
Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites
Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS
Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes
to the challenge of generating novel proteins for therapeutic and biomedicalapplications
GoalsIncreased catalytic function related to the parent
Altered specificity stereospecificity or affinity to interacting partners
Increased stability
Property ParametersThermostability T50
Catalytic activity kcat KM kcatKM
Binding specificity (kcatKM )A(kcatKM )B
Kd KI
Binding affinity Ka = 1Kd
∆G = minusRT ln 1Kd
A paradigm shift in the last 2decades
PCR and recombinant genetechnologies
Recreation of evolution in thelab
Computer algorithms
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40
Goal 1 Increasing the Thermostability
Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation
Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions
Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40
Goal 2 Increasing the Catalytic Activity
How to quantify enzyme activity Michaelis-Menten model of kinetics
E + Sk1
kminus1
ES k2
E + P (1)
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) (2)
d [P]
dt= k2[ES] (3)
k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)
kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E. coli is typically the first choice for expressing a heterologous protein.
However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.
Some misfolding-related issues:
  Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases
  The environment (crowding, pH, osmolarity, etc.)
  Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded protein:
  Insoluble aggregation into inclusion bodies
  Degradation (proteolysis)
Figure: E. coli expressing human leptin as inclusion bodies
Directed Evolution
A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold.
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.
An iterative process:
  Identifying a good starting sequence, usually one containing some level of latent promiscuity
  Creating a library of variants
  Selecting variants with improved function (mutation and screening)
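The iterative process can be sketched as a loop over library creation and screening. This is a toy model under stated assumptions: one point substitution per variant, a single best variant carried forward, and a made-up fitness function (`toy_fitness` and `TARGET` are hypothetical stand-ins for a real assay).

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng):
    """Introduce one random point substitution."""
    pos = rng.randrange(len(seq))
    new_aa = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
    return seq[:pos] + new_aa + seq[pos + 1:]


def directed_evolution(parent, fitness, rounds, library_size, rng):
    """Iterate library creation and screening, keeping the best variant each round."""
    best = parent
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        # Screening: the parent stays in the pool, so fitness never decreases.
        best = max(library + [best], key=fitness)
    return best


# Toy screen (hypothetical): count positions that match an arbitrary target.
TARGET = "MKVLAT"


def toy_fitness(seq):
    return sum(a == b for a, b in zip(seq, TARGET))
```

Keeping the parent in the screened pool is a design choice that makes the walk monotonic; it also means the search can stall at a dead end if no single substitution improves the screen.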
From Natural Enzymes to Protein Engineering to Computational Protein Design
Bibliography I
David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232
Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013
Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174
D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179
Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359
James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007
Locating the Substitutions
How to select the best residues to mutate in the parent protein?
If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design.
When partial information on structure is available, a semi-rational approach is used.
If there is no information available, then a random search is used.
Choosing the Right Strategy
Additivity and Cooperativity Effects
Additivity of the effects of substitutions is rarely seen when screening mutants.
To avoid dead ends, a screening strategy is typically based on building libraries with simultaneous mutations in order to find cooperativity effects.
  Testing for simultaneous mutations comes at the cost of a larger screening effort.
Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed for a larger search of sequence space.
Figure: Additivity/cooperativity experiments searching for high-affinity antibody variants [Chodorge et al. 2008]
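The screening cost of simultaneous mutations, and a simple test for additivity, can be made concrete with a few lines of arithmetic. This is a sketch only: the helper names are hypothetical, and the use of a free-energy change (ddG) as the measured effect of a substitution is an assumption chosen for illustration.

```python
from math import comb


def saturated_library_size(n_sites):
    """Number of protein variants when `n_sites` positions are saturated at once."""
    return 20 ** n_sites


def k_point_mutants(length, k):
    """Number of distinct variants carrying exactly `k` substitutions
    in a protein of `length` residues."""
    return comb(length, k) * 19 ** k


def coupling_energy(ddg_a, ddg_b, ddg_ab):
    """Deviation of a double mutant from additivity.
    Zero means the two substitutions act additively; any other value
    indicates cooperativity (assumed ddG-based measure)."""
    return ddg_ab - (ddg_a + ddg_b)
```

For a 100-residue protein, `k_point_mutants(100, 1)` gives 1900 single mutants while `k_point_mutants(100, 2)` already gives 1786950 double mutants, which illustrates why simultaneous mutations require a larger screen.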
Types of Protein Interactions
Protein-ligand binding (drug-target, enzyme-substrate)
Protein-nucleotide (DNA/RNA) binding
Protein-peptide interaction
Protein-protein interaction
Figure: Protein-protein interactions. Adapted from [Perkins et al. 2010]
Protein-protein complexes [Nooren and Thornton 2003]:
  homo-oligomeric / hetero-oligomeric
  non-obligate / obligate
  (weak and strong) transient / permanent
Protein Specificity and Promiscuity
Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: structurally similar motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one (moonlighting)
Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site
Figure panels: Lock and key [Fischer 1894]; Induced fit [Koshland 1958]; Conformational selection [Boehr et al. 2009]
Protein Specificity and Promiscuity: The Case of PPIs
PPI: any physical binding between proteins that occurs in vivo in the cell
PPI screening methods still have some limitations:
  Y2H: high false-positive rate
  TAP-MS: limited scalability
  Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd-generation DNA-seq)
Transient and PTM-dependent interactions are often missed
Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners
Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, evolvability. Party and date hubs: under debate
Figure labels: single-interface, multi-interface [Kim et al. 2006]
Data Sources
Enzymatic activity:
  BRENDA: experimental parameters
  KEGG, MetaCyc: metabolic networks
  Catalytic Site Atlas: catalytic sites
Data validation and prediction:
  GeneMANIA: lists of genes with functionally similar or shared properties
  STRING: based on genomic context, HT experiments, co-expression, literature
  ComPASS: assigns confidence to an interaction detected by MS
Primary PPI databases:
  DIP, BioGRID, IntAct, MINT
  Common languages: PSICQUIC; expression, co-localization, genetic, metabolic, signaling pathways, experimental data; SBML
  Building the network: Cytoscape
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications.
Goals:
  Increased catalytic function relative to the parent
  Altered specificity, stereospecificity, or affinity to interacting partners
  Increased stability
Property              Parameters
Thermostability       $T_{50}$
Catalytic activity    $k_{cat}$, $K_M$, $k_{cat}/K_M$
Binding specificity   $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$; $K_d$, $K_I$
Binding affinity      $K_a = 1/K_d$; $\Delta G = -RT \ln(1/K_d)$
A paradigm shift in the last 2 decades:
  PCR and recombinant gene technologies
  Recreation of evolution in the lab
  Computer algorithms
Goal 1: Increasing the Thermostability
Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing.
Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes.
Increasing the thermostability can be considered the first step in protein engineering, since it makes the protein tolerant to a greater range of amino acid substitutions.
Main design techniques:
  Sequence-based design: comparison through multiple alignments
  Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
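Given residual-activity measurements taken after the fixed incubation time at several temperatures, T50 can be estimated as the point where the curve crosses 50%. The sketch below uses plain linear interpolation between neighboring measurements; the function name and the data layout are assumptions, and a sigmoidal fit would be a common alternative.

```python
def estimate_t50(temperatures, residual_activity):
    """Linearly interpolate the temperature at which residual activity crosses 0.5.

    `temperatures` must be increasing; `residual_activity` holds the fraction of
    activity remaining after the fixed incubation time at each temperature.
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("activity does not cross 50% in the measured range")
```

Comparing the estimate for a variant against the parent gives the shift in T50 that a thermostabilizing substitution produces.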
Goal 2: Increasing the Catalytic Activity
How to quantify enzyme activity? The Michaelis-Menten model of kinetics:
\[ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{1} \]
\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2} \]
\[ \frac{d[P]}{dt} = k_2 [ES] \tag{3} \]
$k_2$ is also known as $k_{cat}$ or the turnover rate (in more complex cases, $k_{cat}$ is a function of several rates).
$k_{cat}$ alone is not enough: we also need to quantify the affinity of the enzyme for the substrate.
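The reaction scheme in Eq. (1) can be integrated numerically to see how the species evolve. The sketch below uses the explicit Euler method for simplicity; the function name, the step size and the added balance equations for E and S (which follow from mass action but are not written on the slide) are assumptions of this example.

```python
def simulate_michaelis_menten(e0, s0, k1, k_minus1, k2, dt, steps):
    """Integrate the mass-action equations with the explicit Euler method.

    Returns the final concentrations (E, S, ES, P).
    """
    e, s, es, p = e0, s0, 0.0, 0.0
    for _ in range(steps):
        binding = k1 * e * s        # E + S -> ES
        unbinding = k_minus1 * es   # ES -> E + S
        catalysis = k2 * es         # ES -> E + P
        d_es = binding - unbinding - catalysis  # Eq. (2)
        d_p = catalysis                         # Eq. (3)
        d_s = unbinding - binding               # substrate balance (assumed)
        d_e = unbinding + catalysis - binding   # free-enzyme balance (assumed)
        e += d_e * dt
        s += d_s * dt
        es += d_es * dt
        p += d_p * dt
    return e, s, es, p
```

Euler integration is only adequate for small `dt`; a stiff or long simulation would call for a proper ODE solver.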
Enzyme Kinetics
Assumptions:
First assumption (quasi-steady state): the concentration of the substrate-bound enzyme [ES] changes much more slowly than the concentrations of substrate [S] and product [P], so it can be treated as approximately constant:
\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4} \]
Second assumption: the total concentration of enzyme $[E]_0$ does not change with time:
\[ [E]_0 = [E] + [ES] \approx \text{const} \tag{5} \]
The Michaelis constant KM
\[ 0 = k_1 [S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6} \]
\[ k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7} \]
\[ [S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \tag{8} \]
$K_M$: Michaelis constant
\[ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} \]
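A possible intermediate step, not written out on the slide: collecting the [ES] terms in Eq. (8) and using the definition of the Michaelis constant in Eq. (10) gives the steady-state concentration of the complex on which the flux expression relies.

```latex
\begin{align*}
[S][E]_0 &= [ES]\left([S] + \frac{k_{-1} + k_2}{k_1}\right) = [ES]\,\bigl([S] + K_M\bigr) \\
[ES] &= \frac{[E]_0\,[S]}{K_M + [S]}
\end{align*}
```

Multiplying this expression by $k_2$ yields the rate of product formation.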
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux):
\[ \frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11} \]
With $v_{max} = k_2 [E]_0$:
\[ v = \frac{v_{max} [S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}} v_{max} \tag{12} \]
$K_M$ can be measured as the substrate concentration [S] at which the rate of product formation is half of the maximum:
\[ v = \frac{v_{max}}{2} \tag{13} \]
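Eq. (12) translates directly into a one-line function, which also makes the half-maximum property of Eq. (13) easy to check numerically. The function name and argument order are choices made for this sketch.

```python
def michaelis_menten_rate(s, vmax, km):
    """Steady-state rate v = vmax * [S] / (KM + [S]), as in Eq. (12)."""
    return vmax * s / (km + s)
```

Evaluating the function at `s == km` returns `vmax / 2`, and the rate approaches `vmax` as `s` grows much larger than `km`.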
Determining KM from the concentration curve
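One classical way to extract the constants from rate measurements is the double-reciprocal (Lineweaver-Burk) line, 1/v = (KM/vmax)(1/[S]) + 1/vmax. The sketch below fits that line by ordinary least squares; it is not necessarily the method shown in the original figure, and the function name is hypothetical. The double-reciprocal transform amplifies noise at low [S], so nonlinear regression is often preferred for real data.

```python
def fit_lineweaver_burk(substrate, rates):
    """Estimate (vmax, km) from a least-squares line through (1/[S], 1/v)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # KM / vmax
    intercept = mean_y - slope * mean_x  # 1 / vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km
```

All substrate concentrations and rates must be nonzero, since both are inverted before fitting.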
Evaluating Enzyme Efficiency
$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates of reaction of pairs of substrates transformed by an enzyme.
For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:
\[ \frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A)\,[S_A]}{(k_{cat}^B / K_M^B)\,[S_B]} \tag{14} \]
At $[S_A] = [S_B]$, $k_{cat}/K_M$ provides a measure of substrate promiscuity efficiency.
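Eq. (14) can be evaluated directly to compare how an enzyme partitions between two competing substrates. The function name and the example constants used in the note below are hypothetical.

```python
def rate_ratio(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """Ratio v_A / v_B for two competing substrates, following Eq. (14)."""
    return (kcat_a / km_a * conc_a) / (kcat_b / km_b * conc_b)
```

With equal substrate concentrations the result reduces to the ratio of the two specificity constants, for example a value near 100 for kcat/KM of 1e5 against 1e3.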
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
  Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
  Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
  Protein-protein interaction:
    Permanent or obligate: in multi-subunit proteins; it can have a structural or functional role
    Transient: in signaling, transport and regulation
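3.1 Protein Binding Affinity
Dissociation constant:
\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]
\[ \frac{d[AB]}{dt} = k_1 [A][B] - k_{-1}[AB] \tag{16} \]
In equilibrium:
\[ 0 = k_1 [A][B] - k_{-1}[AB] \tag{17} \]
\[ k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]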
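Eq. (18) relates free concentrations, whereas experiments usually fix total concentrations. Combining it with the mass balances [A] = A_total - [AB] and [B] = B_total - [AB] (an assumption of 1:1 binding with no other species) gives a quadratic in [AB]; the sketch below returns its physically meaningful root. The function name is hypothetical.

```python
from math import sqrt


def complex_concentration(a_total, b_total, kd):
    """Equilibrium [AB] for 1:1 binding, from total concentrations and kd.

    Solves [AB]^2 - (A_total + B_total + kd)[AB] + A_total * B_total = 0
    and returns the smaller root, which never exceeds either total.
    """
    total = a_total + b_total + kd
    return (total - sqrt(total * total - 4.0 * a_total * b_total)) / 2.0
```

All three arguments must use the same concentration units.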
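3.1 Protein Binding Affinity
Affinity constant:
\[ k_a = \frac{1}{k_d} \tag{19} \]
In antibodies:
\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]
Binding free energy:
\[ \Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21} \]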
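Eq. (21) gives a quick conversion from a measured dissociation constant to a binding free energy. The sketch below assumes the constant is expressed in mol/L relative to a 1 M standard state and reports the result in kcal/mol; the function name and the default temperature of 298.15 K are choices made for this example.

```python
from math import log

GAS_CONSTANT = 1.987204e-3  # kcal / (mol K)


def binding_free_energy(kd_molar, temperature_kelvin=298.15):
    """Delta G = -RT ln(1/kd), Eq. (21), with kd relative to a 1 M standard state."""
    return -GAS_CONSTANT * temperature_kelvin * log(1.0 / kd_molar)
```

Under these assumptions a nanomolar binder (kd of 1e-9 M) corresponds to roughly -12.3 kcal/mol, and each tenfold improvement in kd adds about -1.4 kcal/mol.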
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding ($K_M$)
Transition-state binding ($K_{tx}$)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis and degradation.
Binding specificity for a given partner is determined by comparing $k_{cat}/K_M$, $k_a$ or $k_d$ across all partners.
$K_I$: inhibition constant, used when an inhibitor competes with a ligand.
Multispecificity: the protein has broad partner specificity (multiple substrates, proteins or ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: structurally similar motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one
Allostery: regulation of a protein by binding of some ligand (the effector)
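The role of the inhibition constant can be illustrated with the standard rate law for competitive inhibition, in which the inhibitor raises the apparent Michaelis constant by a factor (1 + [I]/KI). This rate law is textbook kinetics rather than something stated on the slides, and the function name is hypothetical.

```python
def competitive_inhibition_rate(s, inhibitor, vmax, km, ki):
    """Michaelis-Menten rate with a competitive inhibitor.

    The inhibitor leaves vmax unchanged and scales KM by (1 + [I]/KI).
    """
    apparent_km = km * (1.0 + inhibitor / ki)
    return vmax * s / (apparent_km + s)
```

With no inhibitor the expression reduces to the uninhibited rate; at an inhibitor concentration equal to KI the apparent KM doubles.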
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
The Protein Engineering Cycle
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 6 40
Computational Protein Design in the Engineering Cycle
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 7 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 8 40
Locating the Substitutions
How to select the best residues to mutate in theparent protein
If detailed structural information on the parentenzyme is available a rational approach canbe applied to the design
When partial information on structure isavailable a semi-rational approach is used
If there is no information available then arandom search is used
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 9 40
Choosing the Right Strategy
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40
Additivity and Cooperativity Effects
Additivity of the effects of substitutions israrely seen when screening mutants
In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening
Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity
antibody variants
[Chodorge et al 2008]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40
Types of Protein Interactions
Protein-ligand binding(drug-target enzyme-substrate)
Protein-nucleotide(DNARNA) binding)
Protein-peptide interaction Protein-protein interaction
Protein-Protein interactions
Adapted from [Perkins et al 2010]
Protein-protein complexes
homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent
[Nooren and Thornton 2003]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40
Protein Specificity and Promiscuity
Multispecificity broad partner specificity(multiple substrates proteins ligands)
Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs
Promiscuity the ability to participate in afunction other than the native one(moonlighting)
Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite
Lock and key Induced fit
[Fischer 1894] [Koshland 1958]
Conformational selection
[Boehr et al 2009]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40
Protein Specificity and Promiscuity The Case of PPIs
PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations
Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)
Transient and PTM-dependent interactions are oftenmissed
Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners
Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate
single-interface multi-interface
[Kim et al 2006]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40
Data Sources
Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites
Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS
Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes
to the challenge of generating novel proteins for therapeutic and biomedicalapplications
GoalsIncreased catalytic function related to the parent
Altered specificity stereospecificity or affinity to interacting partners
Increased stability
Property ParametersThermostability T50
Catalytic activity kcat KM kcatKM
Binding specificity (kcatKM )A(kcatKM )B
Kd KI
Binding affinity Ka = 1Kd
∆G = minusRT ln 1Kd
A paradigm shift in the last 2decades
PCR and recombinant genetechnologies
Recreation of evolution in thelab
Computer algorithms
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40
Goal 1 Increasing the Thermostability
Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation
Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions
Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40
Goal 2 Increasing the Catalytic Activity
How to quantify enzyme activity Michaelis-Menten model of kinetics
E + Sk1
kminus1
ES k2
E + P (1)
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) (2)
d [P]
dt= k2[ES] (3)
k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)
kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function or even fold
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities
An iterative process:
  Identifying a good starting sequence, usually containing some level of latent promiscuity
  Creating a library of variants
  Selecting variants with improved function (mutation and screening)
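The iterative process above can be illustrated with a toy loop. Everything here is an assumption made for illustration: the fitness function simply counts matches to a hypothetical target sequence, standing in for an experimental screen, and each round keeps only the single best variant.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq, rng):
    """Return a copy of seq with one random single-residue substitution."""
    pos = rng.randrange(len(seq))
    new = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
    return seq[:pos] + new + seq[pos + 1:]

def toy_fitness(seq, target):
    """Hypothetical screen: number of positions matching a target sequence."""
    return sum(a == b for a, b in zip(seq, target))

def directed_evolution(start, target, rounds, library_size, rng):
    """Iterate library creation and screening, keeping the best variant found so far."""
    best = start
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        candidate = max(library, key=lambda s: toy_fitness(s, target))
        if toy_fitness(candidate, target) > toy_fitness(best, target):
            best = candidate  # accept only improvements
    return best

if __name__ == "__main__":
    result = directed_evolution("AAAAAA", "MKTAYI", rounds=30, library_size=100,
                                rng=random.Random(1))
    print(result, toy_fitness(result, "MKTAYI"))
```

Because only single-step improvements are accepted, this loop cannot cross fitness valleys, which is one reason libraries with simultaneous mutations are also used.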
From Natural Enzymes to Protein Engineering to Computational Protein Design
Bibliography I
David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232
Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013
Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174
D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179
Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359
James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007
Locating the Substitutions
How to select the best residues to mutate in the parent protein?
If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design
When only partial structural information is available, a semi-rational approach is used
If no information is available, a random search is used
Choosing the Right Strategy
Additivity and Cooperativity Effects
Additivity of the effects of substitutions is rarely seen when screening mutants
To avoid dead ends, a screening strategy is typically designed around libraries with simultaneous mutations in order to find cooperativity effects
Testing for simultaneous mutations comes at the cost of a larger screening effort
Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed for a larger search of the sequence space
[Figure: additivity/cooperativity experiments searching for high-affinity antibody variants] [Chodorge et al 2008]
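Two quantities make the trade-off above concrete: how fast the library grows with the number of simultaneously randomized positions, and how far a double mutant deviates from additivity. The sketch below is an illustration only; the double-mutant comparison is a standard analysis that is not taken from the slides, and the function names and example energies are hypothetical.

```python
def library_size(n_positions, alphabet_size=20):
    """Number of distinct variants when n_positions are fully randomized simultaneously."""
    return alphabet_size ** n_positions

def coupling_energy(ddg_a, ddg_b, ddg_ab):
    """Deviation from additivity for a double mutant (same energy units as the inputs).

    Zero means the two substitutions are additive; a non-zero value
    indicates cooperativity between them.
    """
    return ddg_ab - (ddg_a + ddg_b)

if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, library_size(n))
    # Hypothetical energies for single mutants A, B and double mutant AB.
    print(coupling_energy(-1.0, -0.5, -2.5))
```

Full randomization of four positions already requires 160,000 variants, which is why the choice of positions matters as much as the screening capacity.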
Types of Protein Interactions
Protein-ligand binding (drug-target, enzyme-substrate)
Protein-nucleotide (DNA/RNA) binding
Protein-peptide interaction
Protein-protein interaction
Protein-protein interactions (adapted from [Perkins et al 2010])
Protein-protein complexes [Nooren and Thornton 2003]:
  homo-oligomeric / hetero-oligomeric
  non-obligate / obligate
  transient (weak and strong) / permanent
Protein Specificity and Promiscuity
Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: similar structural motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one (moonlighting)
Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site
Binding models:
  Lock and key [Fischer 1894]
  Induced fit [Koshland 1958]
  Conformational selection [Boehr et al 2009]
Protein Specificity and Promiscuity: The Case of PPIs
PPI: any physical binding between proteins that occurs in vivo in the cell
PPI screening methods still have some limitations:
  Y2H: high false-positive rate
  TAP-MS: limited scalability
  Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd-generation DNA-seq)
  Transient and PTM-dependent interactions are often missed
Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners
Protein hubs: highly connected proteins, related to essentiality, robustness, modularity and evolvability. Party and date hubs: under debate
[Figure: single-interface and multi-interface hubs] [Kim et al 2006]
Data Sources
Enzymatic activity
  BRENDA: experimental parameters
  KEGG, MetaCyc: metabolic networks
  Catalytic Site Atlas: catalytic sites
Data validation and prediction
  GeneMANIA: lists of genes with functionally similar or shared properties
  STRING: based on genomic context, HT experiments, co-expression, literature
  ComPASS: assigns confidence to an interaction detected by MS
Primary PPI databases
  DIP, BioGRID, IntAct, MINT
  Common languages: PSICQUIC, expression, co-localization, genetic, metabolic, signaling pathways, experimental data, SBML
  Building the network: Cytoscape
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications
Goals:
  Increased catalytic function relative to the parent
  Altered specificity, stereospecificity or affinity to interacting partners
  Increased stability

Property              Parameters
Thermostability       T50
Catalytic activity    kcat, KM, kcat/KM
Binding specificity   (kcat/KM)A / (kcat/KM)B, Kd, KI
Binding affinity      Ka = 1/Kd, ΔG = -RT ln(1/Kd)

A paradigm shift in the last 2 decades:
  PCR and recombinant gene technologies
  Recreation of evolution in the lab
  Computer algorithms
Goal 1: Increasing the Thermostability
Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturation
Thermostability is typically measured experimentally by T50, the temperature at which 50% of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in protein engineering, because it makes the protein tolerant to a greater range of amino acid substitutions
Main design techniques:
  Sequence-based design: comparison through multiple alignments
  Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
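Given residual activities measured after a fixed incubation at several temperatures, T50 can be read off where the curve crosses 50%. The sketch below uses plain linear interpolation between the two bracketing points; the function name and the example data are hypothetical, and a sigmoidal fit would be a common alternative.

```python
def estimate_t50(temperatures, residual_activity):
    """Linearly interpolate the temperature at which residual activity crosses 0.5.

    temperatures: increasing values (for example in degrees Celsius)
    residual_activity: fraction of initial activity left after the incubation
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("activity never crosses 50% in the measured range")

if __name__ == "__main__":
    temps = [40.0, 50.0, 60.0, 70.0]   # hypothetical incubation temperatures
    activity = [1.0, 0.9, 0.3, 0.0]    # hypothetical residual activities
    print(round(estimate_t50(temps, activity), 2))
```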
Goal 2: Increasing the Catalytic Activity
How to quantify enzyme activity? Michaelis-Menten model of kinetics:
\[ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \tag{1} \]
\[ \frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \tag{2} \]
\[ \frac{d[P]}{dt} = k_2[ES] \tag{3} \]
k2 is also known as kcat or the turnover rate (in more complex cases kcat is a function of several rates)
kcat alone is not enough: we need to quantify the affinity of the enzyme for the substrate
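To see how scheme (1) behaves over time, the rate equations can be integrated numerically. The sketch below uses explicit Euler steps for simplicity; the equations for [E] and [S] are not written on the slide but follow from the same scheme, and all parameter values are arbitrary illustrations.

```python
def simulate_michaelis_menten(e0, s0, k1, k_minus1, k2, dt, n_steps):
    """Integrate scheme (1) with explicit Euler steps; returns final (E, S, ES, P)."""
    e, s, es, p = e0, s0, 0.0, 0.0
    for _ in range(n_steps):
        bind = k1 * e * s        # E + S -> ES
        unbind = k_minus1 * es   # ES -> E + S
        cat = k2 * es            # ES -> E + P
        e += (-bind + unbind + cat) * dt
        s += (-bind + unbind) * dt
        es += (bind - unbind - cat) * dt   # equation (2)
        p += cat * dt                      # equation (3)
    return e, s, es, p

if __name__ == "__main__":
    e, s, es, p = simulate_michaelis_menten(
        e0=1.0, s0=10.0, k1=1.0, k_minus1=1.0, k2=1.0, dt=0.001, n_steps=1000)
    print(e, s, es, p)
```

Total enzyme ([E] + [ES]) and total substrate material ([S] + [ES] + [P]) are conserved by construction, which gives a quick sanity check on the integration.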
Enzyme Kinetics
Assumptions
First assumption (steady state): the concentration of the substrate-bound enzyme [ES] changes much more slowly than the concentrations of substrate [S] and product [P], so it is treated as approximately constant
\[ \frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4} \]
Second assumption: the total concentration of enzyme [E]_0 does not change with time
\[ [E]_0 = [E] + [ES] \approx \text{const} \tag{5} \]
The Michaelis constant KM
\[ 0 = k_1[S]\bigl([E]_0 - [ES]\bigr) - [ES](k_{-1} + k_2) \tag{6} \]
\[ k_1[S][E]_0 = k_1[S][ES] + [ES](k_{-1} + k_2) \tag{7} \]
\[ [S][E]_0 = [S][ES] + [ES]\,\frac{k_{-1} + k_2}{k_1} \tag{8} \]
KM: Michaelis constant
\[ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} \]
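One intermediate step is left implicit here: substituting definition (10) into (8) and solving for [ES], which is the expression the rate of product formation relies on. A sketch of that step:

```latex
\begin{align*}
[S][E]_0 &= [S][ES] + K_M[ES] = [ES]\,\bigl(K_M + [S]\bigr) \\
[ES] &= \frac{[E]_0\,[S]}{K_M + [S]}
\end{align*}
```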
The Michaelis Constant KM and the steady-state flux
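Rate of product formation (flux):
\[ \frac{d[P]}{dt} = v = k_2[ES] = k_2[E]_0\,\frac{[S]}{K_M + [S]} \tag{11} \]
With v_max = k_2[E]_0, the rate reached when the enzyme is saturated:
\[ v = \frac{v_{max}[S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\, v_{max} \tag{12} \]
KM can be measured as the substrate concentration [S] at which the rate of product formation is half of the maximum:
\[ v = \frac{v_{max}}{2} \tag{13} \]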
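In practice vmax and KM are estimated from measured (substrate, rate) pairs. The sketch below uses the double-reciprocal (Lineweaver-Burk) linearization with ordinary least squares; this particular method is an assumption chosen for brevity, not something the slides prescribe, and it is known to be sensitive to noise at low [S], so nonlinear fitting is usually preferred for real data.

```python
def michaelis_menten_rate(s, vmax, km):
    """Equation (12): v = vmax * [S] / (KM + [S])."""
    return vmax * s / (km + s)

def lineweaver_burk_fit(substrate, rates):
    """Estimate (vmax, KM) by least squares on 1/v = (KM/vmax) * (1/[S]) + 1/vmax."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept   # intercept is 1/vmax
    km = slope * vmax        # slope is KM/vmax
    return vmax, km

if __name__ == "__main__":
    conc = [0.5, 1.0, 2.0, 5.0, 10.0]   # hypothetical substrate concentrations
    v = [michaelis_menten_rate(s, vmax=2.0, km=1.5) for s in conc]
    print(lineweaver_burk_fit(conc, v))
```

On noise-free synthetic data the fit recovers the parameters used to generate it, and evaluating the rate at [S] = KM returns half of vmax, as stated by equation (13).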
Determining KM from the concentration curve
Evaluating Enzyme Efficiency
kcat/KM is often used as a specificity constant to compare the relative rates of reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates S_A, S_B at rates v_A, v_B:
\[ \frac{v_A}{v_B} = \frac{\bigl(k_{cat}^A / K_M^A\bigr)[S_A]}{\bigl(k_{cat}^B / K_M^B\bigr)[S_B]} \tag{14} \]
At [S_A] = [S_B], kcat/KM provides a measure of substrate promiscuity efficiency
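Equation (14) is easy to evaluate directly. In this small sketch the function names and the kinetic constants are hypothetical values chosen only to show the arithmetic.

```python
def specificity_constant(kcat, km):
    """kcat / KM for one substrate."""
    return kcat / km

def relative_rate(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """Equation (14): vA / vB for two substrates competing for the same enzyme."""
    return ((specificity_constant(kcat_a, km_a) * conc_a)
            / (specificity_constant(kcat_b, km_b) * conc_b))

if __name__ == "__main__":
    # Hypothetical native substrate A and promiscuous substrate B at equal concentration.
    print(relative_rate(kcat_a=100.0, km_a=10.0, conc_a=1.0,
                        kcat_b=10.0, km_b=10.0, conc_b=1.0))
```

At equal substrate concentrations the ratio reduces to the ratio of the two specificity constants.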
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
Protein-protein interaction:
  Permanent or obligate: in multi-unit proteins, where it can have a structural or functional role
  Transient: in signaling, transport and regulation
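3.1 Protein Binding Affinity
Dissociation constant
\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]
\[ \frac{d[AB]}{dt} = k_1[A][B] - k_{-1}[AB] \tag{16} \]
In equilibrium:
\[ 0 = k_1[A][B] - k_{-1}[AB] \tag{17} \]
\[ k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]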
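Equation (18) relates free concentrations, but experiments usually fix total concentrations. Under mass conservation ([A] = A_total - [AB], [B] = B_total - [AB]) the equilibrium condition becomes a quadratic in [AB]. The sketch below solves it; the function names and the numbers are illustrative assumptions.

```python
import math

def dissociation_constant(k1, k_minus1):
    """Equation (18): kd = k-1 / k1."""
    return k_minus1 / k1

def complex_at_equilibrium(a_total, b_total, kd):
    """Solve kd = [A][B]/[AB] for [AB], given total concentrations of A and B."""
    total = a_total + b_total + kd
    # Smaller root of x^2 - total*x + a_total*b_total = 0 (the physically valid one).
    return (total - math.sqrt(total * total - 4.0 * a_total * b_total)) / 2.0

if __name__ == "__main__":
    kd = dissociation_constant(k1=1.0e6, k_minus1=1.0e-3)   # hypothetical rate constants
    print(kd, complex_at_equilibrium(1.0e-8, 1.0e-8, kd))
```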
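3.1 Protein Binding Affinity
Affinity constant
\[ k_a = \frac{1}{k_d} \tag{19} \]
In antibodies:
\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]
Binding free energy
\[ \Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21} \]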
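A short numerical illustration of equation (21) follows. It assumes kd is expressed in molar units relative to a 1 M standard state and uses 298.15 K as the default temperature; both choices are assumptions that the slides do not state.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Equation (21): dG = -RT ln(ka) = RT ln(kd), in kJ/mol."""
    return R * temperature_k * math.log(kd_molar)

if __name__ == "__main__":
    for kd in (1e-3, 1e-6, 1e-9):   # millimolar, micromolar, nanomolar binders
        print(kd, round(binding_free_energy(kd), 1))
```

Each tenfold improvement in kd corresponds to roughly 5.7 kJ/mol of additional binding free energy at room temperature.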
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder, in Protein Engineering Handbook (2009)]
Ground-state binding (KM)
Transition-state binding (Ktx)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis and degradation
Binding specificity for a partner is determined by comparing either kcat/KM, ka or kd across all partners
KI: inhibition constant, used when an inhibitor competes with a ligand
Multispecificity: the protein has broad partner specificity (multiple substrates, proteins or ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: similar structural motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one
Allostery: regulation of a protein by binding of some ligand (the effector)
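The role of KI can be illustrated with the classic competitive-inhibition rate law, in which the inhibitor raises the apparent KM by a factor (1 + [I]/KI) and leaves vmax unchanged. The slide only states that the inhibitor competes with the ligand, so the specific model, the function name and the values below are assumptions.

```python
def competitive_inhibition_rate(s, inhibitor, vmax, km, ki):
    """Michaelis-Menten rate with a competitive inhibitor at concentration `inhibitor`."""
    km_apparent = km * (1.0 + inhibitor / ki)   # KM scaled by (1 + [I]/KI)
    return vmax * s / (km_apparent + s)

if __name__ == "__main__":
    # Hypothetical values: same substrate level with and without inhibitor.
    print(competitive_inhibition_rate(2.0, 0.0, vmax=3.0, km=1.0, ki=0.5))
    print(competitive_inhibition_rate(2.0, 0.5, vmax=3.0, km=1.0, ki=0.5))
```

A smaller KI means a given inhibitor concentration shifts the apparent KM more strongly, so comparing KI values across inhibitors is another way to rank binding partners.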
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 8 40
Locating the Substitutions
How to select the best residues to mutate in theparent protein
If detailed structural information on the parentenzyme is available a rational approach canbe applied to the design
When partial information on structure isavailable a semi-rational approach is used
If there is no information available then arandom search is used
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 9 40
Choosing the Right Strategy
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40
Additivity and Cooperativity Effects
Additivity of the effects of substitutions israrely seen when screening mutants
In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening
Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity
antibody variants
[Chodorge et al 2008]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40
Types of Protein Interactions
Protein-ligand binding(drug-target enzyme-substrate)
Protein-nucleotide(DNARNA) binding)
Protein-peptide interaction Protein-protein interaction
Protein-Protein interactions
Adapted from [Perkins et al 2010]
Protein-protein complexes
homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent
[Nooren and Thornton 2003]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40
Protein Specificity and Promiscuity
Multispecificity broad partner specificity(multiple substrates proteins ligands)
Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs
Promiscuity the ability to participate in afunction other than the native one(moonlighting)
Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite
Lock and key Induced fit
[Fischer 1894] [Koshland 1958]
Conformational selection
[Boehr et al 2009]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40
Protein Specificity and Promiscuity The Case of PPIs
PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations
Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)
Transient and PTM-dependent interactions are oftenmissed
Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners
Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate
single-interface multi-interface
[Kim et al 2006]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40
Data Sources
Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites
Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS
Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes
to the challenge of generating novel proteins for therapeutic and biomedicalapplications
GoalsIncreased catalytic function related to the parent
Altered specificity stereospecificity or affinity to interacting partners
Increased stability
Property ParametersThermostability T50
Catalytic activity kcat KM kcatKM
Binding specificity (kcatKM )A(kcatKM )B
Kd KI
Binding affinity Ka = 1Kd
∆G = minusRT ln 1Kd
A paradigm shift in the last 2decades
PCR and recombinant genetechnologies
Recreation of evolution in thelab
Computer algorithms
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40
Goal 1 Increasing the Thermostability
Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation
Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions
Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40
Goal 2 Increasing the Catalytic Activity
How to quantify enzyme activity Michaelis-Menten model of kinetics
E + Sk1
kminus1
ES k2
E + P (1)
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) (2)
d [P]
dt= k2[ES] (3)
k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)
kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Bibliography I
David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232
Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013
Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174
D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179
Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359
James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007
Locating the Substitutions
How to select the best residues to mutate in the parent protein?
  If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design.
  When only partial information on the structure is available, a semi-rational approach is used.
  If no information is available, a random search is used.
Choosing the Right Strategy
Additivity and Cooperativity Effects
Additivity of the effects of substitutions is rarely seen when screening mutants.
To avoid dead ends, a screening strategy is typically built around libraries with simultaneous mutations in order to find cooperativity effects. Testing for simultaneous mutations comes at the cost of a larger screening effort.
Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed for a larger search of sequence space in this case.
[Figure: Additivity/cooperativity experiments searching for high-affinity antibody variants [Chodorge et al 2008]]
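To make the screening cost concrete, the sketch below counts how many distinct variants exist when k positions are mutated simultaneously, and adds a simple double-mutant additivity check. The function names, the tolerance and the example numbers are assumptions for illustration; they are not taken from [Chodorge et al 2008].

```python
from math import comb


def library_size(n_positions: int, k_simultaneous: int, alphabet: int = 20) -> int:
    """Number of distinct variants carrying exactly k substitutions.

    Chooses k of n positions and, at each one, any of the
    (alphabet - 1) residues that differ from the parent.
    """
    return comb(n_positions, k_simultaneous) * (alphabet - 1) ** k_simultaneous


def is_additive(effect_a: float, effect_b: float, effect_ab: float,
                tol: float = 0.5) -> bool:
    """Double-mutant check: additive if the combined effect equals the sum within tol.

    Effects are in any consistent unit (for example a change in binding
    free energy); a large deviation suggests cooperativity.
    """
    return abs(effect_ab - (effect_a + effect_b)) <= tol


# Ten candidate positions, mutated one, two or three at a time.
sizes = {k: library_size(10, k) for k in (1, 2, 3)}
```

With these assumptions the library grows from 190 single mutants to 16245 double mutants and 823080 triple mutants, which is the screening cost referred to above.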
Types of Protein Interactions
Protein-ligand binding (drug-target, enzyme-substrate)
Protein-nucleotide (DNA/RNA) binding
Protein-peptide interaction
Protein-protein interaction
[Figure: Protein-protein interactions. Adapted from [Perkins et al 2010]]
Protein-protein complexes [Nooren and Thornton 2003]:
  homo-oligomeric vs hetero-oligomeric
  non-obligate vs obligate
  transient (weak and strong) vs permanent
Protein Specificity and Promiscuity
Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: structurally similar motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one (moonlighting)
Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site
Models of molecular recognition (figure panels):
  Lock and key [Fischer 1894]
  Induced fit [Koshland 1958]
  Conformational selection [Boehr et al 2009]
Protein Specificity and Promiscuity: The Case of PPIs
PPI: any physical binding between proteins that occurs in vivo in the cell
PPI screening methods still have some limitations:
  Y2H: high false-positive (FP) rate
  TAP-MS: limited scalability
  Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd-generation DNA-seq)
Transient and PTM-dependent interactions are often missed
Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners
Protein hubs: highly connected proteins, related to essentiality, robustness, modularity and evolvability. Party and date hubs: under debate
[Figure: single-interface vs multi-interface hubs [Kim et al 2006]]
Data Sources
Enzymatic activity
  BRENDA: experimental parameters
  KEGG, MetaCyc: metabolic networks
  Catalytic Site Atlas: catalytic sites
Data validation and prediction
  GeneMANIA: lists of genes with functionally similar or shared properties
  STRING: based on genomic context, HT experiments, co-expression, literature
  ComPASS: assigns confidence to an interaction detected by MS
Primary PPI databases
  DIP, BioGRID, IntAct, MINT
  Common languages: PSICQUIC; expression, co-localization, genetic, metabolic, signaling pathways, experimental data; SBML
  Building the network: Cytoscape
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications.
Goals:
  Increased catalytic function relative to the parent
  Altered specificity, stereospecificity or affinity to interacting partners
  Increased stability

Property               Parameters
Thermostability        $T_{50}$
Catalytic activity     $k_{cat}$, $K_M$, $k_{cat}/K_M$
Binding specificity    $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$, $K_d$, $K_I$
Binding affinity       $K_a = 1/K_d$, $\Delta G = -RT \ln(1/K_d)$

A paradigm shift in the last 2 decades:
  PCR and recombinant gene technologies
  Recreation of evolution in the lab
  Computer algorithms
Goal 1: Increasing the Thermostability
Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing.
Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes.
Increasing the thermostability can be considered the first step in protein engineering, because it makes the protein tolerant to a greater range of amino acid substitutions.
Main design techniques:
  Sequence-based design: comparison through multiple alignments
  Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
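One simple way to read T50 off a set of measurements is to interpolate where the residual activity crosses 50%. The sketch below assumes residual activity was measured after the fixed 10-minute incubation at each temperature; the function name and the data points are hypothetical, and linear interpolation is a simplification (a sigmoidal fit is a common alternative).

```python
def estimate_t50(temperatures: list[float], residual_activity: list[float]) -> float:
    """Linearly interpolate the temperature at which residual activity crosses 50%.

    temperatures: incubation temperatures in ascending order (deg C).
    residual_activity: fraction of the initial activity left after the
    fixed incubation time, as values between 0 and 1.
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("activity does not cross 50% in the measured range")


# Hypothetical measurements, for illustration only.
temps = [40.0, 45.0, 50.0, 55.0, 60.0]
activity = [1.00, 0.95, 0.70, 0.30, 0.05]
t50 = estimate_t50(temps, activity)
```

Comparing this value between the parent and a variant gives a direct readout of whether a substitution stabilized the protein.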
Goal 2: Increasing the Catalytic Activity
How to quantify enzyme activity: the Michaelis-Menten model of kinetics
\[ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \tag{1} \]
\[ \frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \tag{2} \]
\[ \frac{d[P]}{dt} = k_2[ES] \tag{3} \]
$k_2$ is also known as $k_{cat}$ or the turnover rate (in more complex cases $k_{cat}$ is a function of several rates).
$k_{cat}$ alone is not enough: we also need to quantify the affinity of the enzyme for the substrate.
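The rate equations above can be integrated numerically to see how the four species evolve. This is a minimal sketch using a fixed-step explicit Euler scheme; the rate constants, concentrations and step size are arbitrary illustrative values in arbitrary units, and a production analysis would normally use a proper ODE solver.

```python
def simulate_michaelis_menten(e0, s0, k1, k_minus1, k2, dt=1e-4, t_end=5.0):
    """Integrate the mass-action equations for E, S, ES and P with explicit Euler."""
    e, s, es, p = e0, s0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        bind = k1 * e * s        # E + S -> ES
        unbind = k_minus1 * es   # ES -> E + S
        cat = k2 * es            # ES -> E + P
        d_es = bind - unbind - cat
        e -= dt * d_es           # free enzyme mirrors the complex
        s += dt * (unbind - bind)
        es += dt * d_es
        p += dt * cat
    return {"E": e, "S": s, "ES": es, "P": p}


# Arbitrary illustrative parameters.
final = simulate_michaelis_menten(e0=1.0, s0=10.0, k1=10.0, k_minus1=1.0, k2=1.0)
```

Two conserved quantities make useful sanity checks: E + ES stays equal to the initial enzyme concentration, and S + ES + P stays equal to the initial substrate concentration.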
Enzyme Kinetics
Assumptions
First assumption: the concentration of the substrate-bound enzyme [ES] changes much more slowly than the concentrations of substrate [S] and product [P], so it can be treated as approximately constant:
\[ \frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4} \]
Second assumption: the total concentration of enzyme $[E]_0$ does not change with time:
\[ [E]_0 = [E] + [ES] \approx \text{const} \tag{5} \]
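The Michaelis constant $K_M$
Substituting $[E] = [E]_0 - [ES]$ from (5) into (4):
\[ 0 = k_1[S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6} \]
\[ k_1[S][E]_0 = k_1[S][ES] + [ES](k_{-1} + k_2) \tag{7} \]
\[ [S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \tag{8} \]
Solving for [ES]:
\[ [ES] = \frac{[E]_0[S]}{\frac{k_{-1} + k_2}{k_1} + [S]} \tag{9} \]
$K_M$: Michaelis constant
\[ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} \]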
The Michaelis Constant $K_M$ and the Steady-State Flux
Rate of product formation (flux):
\[ \frac{d[P]}{dt} = v = k_2[ES] = k_2[E]_0\frac{[S]}{K_M + [S]} \tag{11} \]
With $v_{max} = k_2[E]_0$:
\[ v = \frac{v_{max}[S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\,v_{max} \tag{12} \]
$K_M$ can be measured as the substrate concentration [S] at which the rate of product formation is half of the maximum:
\[ v = \frac{v_{max}}{2} \tag{13} \]
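The rate law above is easy to evaluate, and its parameters can be estimated from a measured curve. The sketch below uses a Lineweaver-Burk (double-reciprocal) linear fit, which is one classical option and not necessarily the method used in the lecture; the data are noise-free synthetic points generated from assumed values of vmax and KM.

```python
def mm_rate(s: float, vmax: float, km: float) -> float:
    """Steady-state rate v = vmax [S] / (KM + [S])."""
    return vmax * s / (km + s)


def fit_lineweaver_burk(substrate: list[float], rates: list[float]) -> tuple[float, float]:
    """Estimate (vmax, KM) from a least-squares line through 1/v versus 1/[S].

    1/v = (KM/vmax) * (1/[S]) + 1/vmax, so slope = KM/vmax and
    intercept = 1/vmax.
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    return vmax, slope * vmax


# Noise-free synthetic data generated from assumed values vmax = 2.0, KM = 0.5.
s_values = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
v_values = [mm_rate(s, vmax=2.0, km=0.5) for s in s_values]
vmax_fit, km_fit = fit_lineweaver_burk(s_values, v_values)
half_max_rate = mm_rate(0.5, vmax=2.0, km=0.5)  # rate at [S] = KM
```

The double-reciprocal transform amplifies noise at low substrate concentrations, so with real data a direct nonlinear fit of the rate law is usually preferred.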
Determining KM from the concentration curve
Evaluating Enzyme Efficiency
$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates of reaction of pairs of substrates transformed by the same enzyme.
For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:
\[ \frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A)\,[S_A]}{(k_{cat}^B / K_M^B)\,[S_B]} \tag{14} \]
At $[S_A] = [S_B]$ the rate ratio reduces to the ratio of the $k_{cat}/K_M$ values, which provides a measure of substrate promiscuity (relative efficiency).
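A short numerical example of the rate ratio for two competing substrates follows. The kinetic parameters are invented for illustration (turnover in 1/s, KM and concentrations in M), and the function names are this sketch's own.

```python
def specificity_constant(kcat: float, km: float) -> float:
    """Specificity constant kcat / KM."""
    return kcat / km


def rate_ratio(kcat_a: float, km_a: float, s_a: float,
               kcat_b: float, km_b: float, s_b: float) -> float:
    """v_A / v_B for two substrates competing for the same enzyme."""
    return (specificity_constant(kcat_a, km_a) * s_a) / (
        specificity_constant(kcat_b, km_b) * s_b)


# Hypothetical parameters: substrate A is turned over faster and bound more tightly.
ratio = rate_ratio(kcat_a=100.0, km_a=1e-4, s_a=1e-3,
                   kcat_b=10.0, km_b=1e-3, s_b=1e-3)
```

With equal substrate concentrations the concentrations cancel, so the ratio here is simply the ratio of the two specificity constants, about 100.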
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
Protein-protein interaction
  Permanent or obligate: in multi-unit proteins; it can have a structural or functional role
  Transient: in signaling, transport and regulation
3.1 Protein Binding Affinity
Dissociation constant
\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]
\[ \frac{d[AB]}{dt} = k_1[A][B] - k_{-1}[AB] \tag{16} \]
In equilibrium:
\[ 0 = k_1[A][B] - k_{-1}[AB] \tag{17} \]
\[ K_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]
Affinity constant
\[ K_a = \frac{1}{K_d} \tag{19} \]
In antibodies:
\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]
Binding free energy
\[ \Delta G = -RT \ln K_a = -RT \ln \frac{1}{K_d} \tag{21} \]
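As a worked example of the relations above, the sketch below converts rate constants into a dissociation constant and then into a binding free energy. The rate constants are hypothetical, the temperature default of 298.15 K is an assumption, and Kd is taken relative to a 1 M standard state so that the logarithm is dimensionless.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def kd_from_rates(k_off: float, k_on: float) -> float:
    """Dissociation constant Kd = k_-1 / k_1."""
    return k_off / k_on


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Delta G = -RT ln(1/Kd) = RT ln(Kd), in J/mol, with Kd relative to 1 M."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R * temperature_k * math.log(kd_molar)


# Hypothetical rate constants giving a nanomolar binder.
kd = kd_from_rates(k_off=1e-3, k_on=1e6)
dg = binding_free_energy(kd)
```

For these values Kd is 1 nM and the free energy comes out near -51.4 kJ/mol; every tenfold improvement in Kd lowers it by roughly another 5.7 kJ/mol at this temperature.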
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding ($K_M$)
Transition-state binding ($K_{tx}$)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis and degradation.
Binding specificity for a given partner is determined by comparing either $k_{cat}/K_M$, $K_a$ or $K_d$ across all partners.
$K_I$: inhibition constant, used when an inhibitor competes with a ligand
Multispecificity: the protein has broad partner specificity (multiple substrates, proteins or ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: structurally similar motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one
Allostery: regulation of a protein by binding of some ligand (the effector)
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Introducing the Substitutions
Site-directed (saturation) mutagenesis
  1. Cloning the DNA of interest into a plasmid vector
  2. The plasmid DNA is denatured to produce single strands
  3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion or insertion) is annealed to the target region
  4. Extending the mutant oligonucleotide using a plasmid DNA strand as the template
  5. The heteroduplex is propagated by transformation in E. coli
Error-prone PCR
  Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
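The two strategies above differ in what they sample, which a small in silico sketch can make concrete. Saturation at one site enumerates every substitution at a chosen position, while the error-prone copy introduces substitutions at random along the gene. The function names, the toy sequences and the per-base error rate are assumptions for illustration; neither function models the laboratory protocol itself.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BASES = "ACGT"


def saturation_variants(protein: str, position: int) -> list[str]:
    """All 19 single substitutions at one 0-based position of the parent protein."""
    parent = protein[position]
    return [protein[:position] + aa + protein[position + 1:]
            for aa in AMINO_ACIDS if aa != parent]


def error_prone_copy(dna: str, error_rate: float, rng: random.Random) -> str:
    """Copy a DNA string, swapping each base for a different one with probability error_rate."""
    copied = []
    for base in dna:
        if rng.random() < error_rate:
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


# Toy sequences and an arbitrary error rate, for illustration only.
variants = saturation_variants("MKTAYIAK", position=2)
rng = random.Random(42)
template = "ATGAAAACCGCTTATATCGCGAAA"
mutant = error_prone_copy(template, error_rate=0.05, rng=rng)
```

Saturation gives complete coverage of one site with a small library, whereas the random copy explores the whole gene but mostly one base change at a time.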
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Choosing the Right Strategy
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40
Additivity and Cooperativity Effects
Additivity of the effects of substitutions israrely seen when screening mutants
In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening
Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity
antibody variants
[Chodorge et al 2008]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40
Types of Protein Interactions
Protein-ligand binding(drug-target enzyme-substrate)
Protein-nucleotide(DNARNA) binding)
Protein-peptide interaction Protein-protein interaction
Protein-Protein interactions
Adapted from [Perkins et al 2010]
Protein-protein complexes
homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent
[Nooren and Thornton 2003]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40
Protein Specificity and Promiscuity
Multispecificity broad partner specificity(multiple substrates proteins ligands)
Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs
Promiscuity the ability to participate in afunction other than the native one(moonlighting)
Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite
Lock and key Induced fit
[Fischer 1894] [Koshland 1958]
Conformational selection
[Boehr et al 2009]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40
Protein Specificity and Promiscuity The Case of PPIs
PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations
Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)
Transient and PTM-dependent interactions are oftenmissed
Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners
Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate
single-interface multi-interface
[Kim et al 2006]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40
Data Sources
Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites
Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS
Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes
to the challenge of generating novel proteins for therapeutic and biomedicalapplications
GoalsIncreased catalytic function related to the parent
Altered specificity stereospecificity or affinity to interacting partners
Increased stability
Property ParametersThermostability T50
Catalytic activity kcat KM kcatKM
Binding specificity (kcatKM )A(kcatKM )B
Kd KI
Binding affinity Ka = 1Kd
∆G = minusRT ln 1Kd
A paradigm shift in the last 2decades
PCR and recombinant genetechnologies
Recreation of evolution in thelab
Computer algorithms
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40
Goal 1 Increasing the Thermostability
Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation
Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions
Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40
Goal 2 Increasing the Catalytic Activity
How to quantify enzyme activity Michaelis-Menten model of kinetics
E + Sk1
kminus1
ES k2
E + P (1)
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) (2)
d [P]
dt= k2[ES] (3)
k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)
kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Additivity and Cooperativity Effects
Additivity of the effects of substitutions is rarely seen when screening mutants
To avoid dead ends, a screening strategy is typically designed around libraries with simultaneous mutations, in order to find cooperativity effects
Testing for simultaneous mutations comes at the cost of a larger screening effort
Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed a larger search of the sequence space
Figure: additivity/cooperativity experiments searching for high-affinity antibody variants [Chodorge et al 2008]
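The cost of testing simultaneous mutations can be made concrete with a rough count. The sketch below assumes that every mutated position may be replaced by any of the 19 other standard amino acids; the function name `library_size` and the 10-position example are illustrative assumptions, not values from the slides.

```python
from math import comb


def library_size(n_positions: int, n_simultaneous: int, alphabet: int = 20) -> int:
    """Number of distinct variants carrying exactly n_simultaneous substitutions."""
    # Choose which positions are mutated, then one of (alphabet - 1)
    # alternative residues at each chosen position.
    return comb(n_positions, n_simultaneous) * (alphabet - 1) ** n_simultaneous


# Hypothetical example: 10 candidate positions, 1 to 3 simultaneous substitutions.
for k in (1, 2, 3):
    print(k, library_size(10, k))
```

Under these assumptions, the number of variants grows by roughly two orders of magnitude with each additional simultaneous substitution, which is why combinatorial libraries are usually restricted to a few positions.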
Types of Protein Interactions
Protein-ligand binding (drug-target, enzyme-substrate)
Protein-nucleotide (DNA/RNA) binding
Protein-peptide interaction
Protein-protein interaction
Figure: protein-protein interactions, adapted from [Perkins et al 2010]
Protein-protein complexes [Nooren and Thornton 2003]:
homo-oligomeric or hetero-oligomeric
non-obligate or obligate
transient (weak and strong) or permanent
Protein Specificity and Promiscuity
Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
Small-molecule ligands: similar chemical structure, usually with stereoselectivity
Proteins or peptides: similar structural motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one (moonlighting)
Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site
Figure: binding models. Lock and key [Fischer 1894], induced fit [Koshland 1958], conformational selection [Boehr et al 2009]
Protein Specificity and Promiscuity: The Case of PPIs
PPI: any physical binding between proteins that occurs in vivo in the cell
PPI screening methods still have some limitations:
Y2H: high false-positive rate
TAP-MS: limited scalability
Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd generation DNA-seq)
Transient and PTM-dependent interactions are often missed
Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners
Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, and evolvability. Party and date hubs are under debate
Figure: single-interface and multi-interface hubs [Kim et al 2006]
Data Sources
Enzymatic activity
BRENDA: experimental parameters
KEGG, MetaCyc: metabolic networks
Catalytic Site Atlas: catalytic sites
Data validation and prediction
GeneMANIA: lists of genes with functionally similar or shared properties
STRING: based on genomic context, HT experiments, co-expression, literature
ComPASS: assigns confidence to an interaction detected by MS
Primary PPI databases
DIP, BioGRID, IntAct, MINT
Common languages: PSICQUIC, expression, co-localization, genetic, metabolic, signaling pathways, experimental data, SBML
Building the network: Cytoscape
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications
Goals:
Increased catalytic function relative to the parent
Altered specificity, stereospecificity, or affinity to interacting partners
Increased stability
Property               Parameters
Thermostability        $T_{50}$
Catalytic activity     $k_{cat}$, $K_M$, $k_{cat}/K_M$
Binding specificity    $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$, $K_d$, $K_I$
Binding affinity       $K_a = 1/K_d$, $\Delta G = -RT \ln (1/K_d)$
A paradigm shift in the last 2 decades:
PCR and recombinant gene technologies
Recreation of evolution in the lab
Computer algorithms
Goal 1: Increasing the Thermostability
Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing
Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in protein engineering, because it makes the protein tolerant to a greater range of amino acid substitutions
Main design techniques:
Sequence-based design: comparison through multiple alignments
Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
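One common way to use a multiple alignment for sequence-based design is a consensus comparison: positions where the parent differs from the most frequent residue among homologs become candidate substitutions. The slides do not name a specific method, so the sketch below is only an illustration under that assumption; the function names and the toy alignment are hypothetical.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Most frequent residue in each column of an equal-length alignment (gaps ignored)."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned_seqs):
        counts = Counter(res for res in column if res != "-")
        # A column made only of gaps stays a gap in the consensus.
        result.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(result)


def candidate_substitutions(parent, aligned_seqs):
    """List (position, parent residue, consensus residue), with 1-based positions."""
    cons = consensus(aligned_seqs)
    return [
        (i + 1, p, c)
        for i, (p, c) in enumerate(zip(parent, cons))
        if p != c and p != "-" and c != "-"
    ]


# Toy alignment of hypothetical homologs; the parent is the first sequence.
homologs = ["MKTAY", "MKSAY", "MKSAF", "MRSAY"]
print(consensus(homologs))
print(candidate_substitutions(homologs[0], homologs))
```

Whether a consensus substitution actually raises $T_{50}$ still has to be checked experimentally; the comparison only narrows the list of positions to test.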
Goal 2: Increasing the Catalytic Activity
How to quantify enzyme activity? The Michaelis-Menten model of kinetics:
\[ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{1} \]
\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2} \]
\[ \frac{d[P]}{dt} = k_2 [ES] \tag{3} \]
$k_2$ is also known as $k_{cat}$ or the turnover rate (in more complex cases $k_{cat}$ is a function of several rates)
$k_{cat}$ alone is not enough: we also need to quantify the affinity of the enzyme for the substrate
Enzyme Kinetics
Assumptions
First assumption: the concentration of the substrate-bound enzyme $[ES]$ is approximately constant, that is, it changes much more slowly than the concentrations of substrate $[S]$ and product $[P]$
\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4} \]
Second assumption: the total concentration of enzyme $[E]_0$ does not change with time
\[ [E]_0 = [E] + [ES] \approx \text{const} \tag{5} \]
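Both assumptions can be checked numerically on the mechanism of equation (1). The sketch below integrates the rate equations with a simple forward Euler scheme; the rate constants, concentrations, and step size are arbitrary illustrative values (chosen so that substrate is in large excess over enzyme), not data from the slides.

```python
def simulate(k1, k_minus1, k2, e0, s0, dt=1e-4, t_end=1.0):
    """Forward-Euler integration of E + S <-> ES -> E + P; returns (E, S, ES, P) at t_end."""
    e, s, es, p = e0, s0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        v_bind = k1 * e * s        # E + S -> ES
        v_unbind = k_minus1 * es   # ES -> E + S
        v_cat = k2 * es            # ES -> E + P
        e += dt * (-v_bind + v_unbind + v_cat)
        s += dt * (-v_bind + v_unbind)
        es += dt * (v_bind - v_unbind - v_cat)
        p += dt * v_cat
    return e, s, es, p


# Illustrative parameters (assumed): K_M = (k_minus1 + k2) / k1 = 0.2
k1, k_minus1, k2 = 100.0, 10.0, 10.0
e0, s0 = 0.01, 1.0
e, s, es, p = simulate(k1, k_minus1, k2, e0, s0)

k_m = (k_minus1 + k2) / k1
es_steady_state = e0 * s / (k_m + s)  # value implied by d[ES]/dt = 0
print("simulated [ES]:", es)
print("steady-state [ES]:", es_steady_state)
print("[E] + [ES]:", e + es)
```

After a short initial transient, the simulated $[ES]$ should stay close to the steady-state value while $[E] + [ES]$ remains equal to $[E]_0$. The agreement is expected to degrade when the enzyme concentration is not small compared with $[S] + K_M$.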
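The Michaelis constant $K_M$
Substituting $[E] = [E]_0 - [ES]$ from (5) into the steady-state condition (4):
\[ 0 = k_1 [S] ([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6} \]
\[ k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7} \]
\[ [S][E]_0 = [S][ES] + [ES] \frac{k_{-1} + k_2}{k_1} \tag{8} \]
$K_M$: Michaelis constant
\[ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} \]
Solving (8) for $[ES]$ with this definition gives $[ES] = [E]_0 [S] / (K_M + [S])$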
The Michaelis Constant $K_M$ and the steady-state flux
Rate of product formation (flux):
\[ \frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11} \]
\[ v = \frac{v_{max} [S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}} \, v_{max} \tag{12} \]
where $v_{max} = k_2 [E]_0$
$K_M$ can be measured as the substrate concentration $[S]$ at which the rate of product formation is half of the maximum:
\[ v = \frac{v_{max}}{2} \tag{13} \]
Determining KM from the concentration curve
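As a rough illustration of how $K_M$ and $v_{max}$ can be read from rate measurements, the sketch below uses the double-reciprocal (Lineweaver-Burk) form of equation (12), $1/v = (K_M / v_{max}) (1/[S]) + 1/v_{max}$, and fits a straight line by ordinary least squares. This linearization is one possible choice, not necessarily the method used for the slide's figure, and the data are synthetic and noise-free.

```python
def fit_lineweaver_burk(substrate, rates):
    """Least-squares line through (1/[S], 1/v); returns (K_M, v_max)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                    # equals K_M / v_max
    intercept = mean_y - slope * mean_x  # equals 1 / v_max
    v_max = 1.0 / intercept
    k_m = slope * v_max
    return k_m, v_max


# Synthetic data generated from equation (12) with assumed K_M = 0.5, v_max = 2.0.
true_km, true_vmax = 0.5, 2.0
substrate = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
rates = [true_vmax * s / (true_km + s) for s in substrate]
print(fit_lineweaver_burk(substrate, rates))
```

With real measurements, the double-reciprocal transform amplifies noise at low $[S]$, so a nonlinear fit of equation (12) is generally the more robust option.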
Evaluating Enzyme Efficiency
$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates at which an enzyme transforms pairs of substrates
For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:
\[ \frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A) \, [S_A]}{(k_{cat}^B / K_M^B) \, [S_B]} \tag{14} \]
At $[S_A] = [S_B]$, the ratio of the $k_{cat}/K_M$ values provides a measure of substrate promiscuity efficiency
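Equation (14) translates directly into a small helper. The kinetic parameters below are made-up values for a hypothetical native substrate A and promiscuous substrate B, chosen only to show the arithmetic.

```python
def rate_ratio(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """v_A / v_B for two substrates competing for the same enzyme, equation (14)."""
    return (kcat_a / km_a * conc_a) / (kcat_b / km_b * conc_b)


# Hypothetical parameters: kcat in 1/s, K_M and concentrations in mol/L.
ratio = rate_ratio(kcat_a=100.0, km_a=1e-4, conc_a=1e-3,
                   kcat_b=10.0, km_b=1e-3, conc_b=1e-3)
print(ratio)  # at equal concentrations this is the ratio of the kcat/K_M values
```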
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
Protein-protein interaction
Permanent or obligate: in multi-unit proteins; it can have a structural or functional role
Transient: in signaling, transport, and regulation
3.1 Protein Binding Affinity
Dissociation constant
\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]
\[ \frac{d[AB]}{dt} = k_1 [A][B] - k_{-1} [AB] \tag{16} \]
In equilibrium:
\[ 0 = k_1 [A][B] - k_{-1} [AB] \tag{17} \]
\[ K_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]
3.1 Protein Binding Affinity
Affinity constant
\[ K_a = \frac{1}{K_d} \tag{19} \]
In antibodies:
\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]
Binding free energy
\[ \Delta G = -RT \ln K_a = -RT \ln \frac{1}{K_d} \tag{21} \]
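Equation (21) can be evaluated for typical dissociation constants. The sketch below assumes $K_d$ is expressed in mol/L relative to a 1 M standard state and a temperature of 298.15 K; both are assumptions added here, not stated on the slide.

```python
import math

R = 8.314462618  # gas constant in J mol^-1 K^-1


def binding_free_energy(k_d, temperature=298.15):
    """Delta G = -RT ln(1/K_d) = RT ln(K_d), in J/mol (K_d in mol/L, 1 M standard state)."""
    return R * temperature * math.log(k_d)


# Illustrative K_d values from micromolar to nanomolar binders.
for k_d in (1e-6, 1e-8, 1e-9):
    print(k_d, binding_free_energy(k_d) / 1000.0, "kJ/mol")
```

Because of the logarithm, each tenfold improvement in $K_d$ changes $\Delta G$ by the same amount, $RT \ln 10$, which is about 5.7 kJ/mol at this temperature.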
Simplified Thermodynamics of an Enzymatic Reaction
Figure from [Jonas and Hollfelder in Protein Engineering Handbook (2009)], with labels: ground-state binding ($K_M$), transition-state binding ($K_{tx}$)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation
Binding specificity to some partner is determined by comparing either $k_{cat}/K_M$, $K_a$, or $K_d$ across all partners
$K_I$: inhibition constant, used when an inhibitor competes with a ligand
Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands)
Small-molecule ligands: similar chemical structure, usually with stereoselectivity
Proteins or peptides: similar structural motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one
Allostery: regulation of a protein by binding of some ligand (the effector)
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Introducing the Substitutions
Site-directed (saturation) mutagenesis
1. The DNA of interest is cloned into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion, or insertion) is annealed to the target region
4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli
Error-prone PCR
Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
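The effect of an elevated polymerase error rate can be mimicked in silico. The sketch below is a deliberately simplified model: it introduces only base substitutions, with the same probability at every position and no bias between bases, which real polymerases do not obey. The function name and the template are hypothetical.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Pick one of the three other bases uniformly (simplifying assumption).
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


rng = random.Random(0)  # fixed seed so the toy library is reproducible
template = "ATGGCTAGCAAAGGAGAAGAA"
library = [error_prone_copy(template, 0.05, rng) for _ in range(5)]
for variant in library:
    n_mut = sum(a != b for a, b in zip(template, variant))
    print(variant, n_mut)
```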
Recombination and DNA-shuffling
A natural approach to making multiple mutations is recombination
Circular permutation to alter protein topology
DNA-shuffling to perform functional domain or motif shuffling in vitro
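At the sequence level, both operations are simple string manipulations. The sketch below is a minimal illustration: real DNA shuffling fragments several parent genes and reassembles them with multiple crossovers, whereas `single_crossover` here models only one crossover between two equal-length parents. All names and sequences are hypothetical.

```python
def circular_permutation(seq, new_start):
    """Join the original termini and open the chain at new_start (0-based index)."""
    return seq[new_start:] + seq[:new_start]


def single_crossover(parent_a, parent_b, point):
    """Chimera with the first `point` residues from parent_a and the rest from parent_b."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    return parent_a[:point] + parent_b[point:]


print(circular_permutation("ABCDEF", 2))
print(single_crossover("AAAAAA", "BBBBBB", 2))
```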
Recombinant Protein Folding
E. coli is typically the first choice for expressing a heterologous protein
However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli
Some misfolding-related issues:
Multidomain proteins usually require the assistance of folding modulators such as chaperones and/or foldases
The environment (crowding, pH, osmolarity, etc.)
Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded protein:
Insoluble aggregation into inclusion bodies
Degradation: proteolysis
Figure: E. coli expressing human leptin as inclusion bodies
Directed Evolution
A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities
An iterative process:
Identifying a good starting sequence, usually containing some level of latent promiscuity
Creating a library of variants
Selecting variants with improved function (mutation and screening)
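The iterative process above can be sketched as a toy loop. Everything in this example is an assumption made for illustration: the fitness function simply counts matches to a hidden target sequence (a stand-in for an experimental screen), each variant carries one random substitution, and only the best variant of each round is carried forward.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng):
    """Return a copy of seq with one random single-residue substitution."""
    pos = rng.randrange(len(seq))
    new_res = rng.choice([a for a in AMINO_ACIDS if a != seq[pos]])
    return seq[:pos] + new_res + seq[pos + 1:]


def directed_evolution(parent, fitness, rounds, library_size, rng):
    """Iterate library creation and selection, keeping a variant only if it improves."""
    best = parent
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        top = max(library, key=fitness)
        if fitness(top) > fitness(best):
            best = top
    return best


# Hypothetical screen: number of positions matching a hidden target sequence.
target = "MKTAYIAKQR"


def toy_fitness(seq):
    return sum(a == b for a, b in zip(seq, target))


rng = random.Random(42)
start = "MKTGGGGKQR"
evolved = directed_evolution(start, toy_fitness, rounds=20, library_size=50, rng=rng)
print(start, toy_fitness(start))
print(evolved, toy_fitness(evolved))
```

Because this loop accepts only single-step improvements, it can stall when no single substitution helps, which is the dead-end situation that libraries with simultaneous mutations or recombination are meant to escape.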
From Natural Enzymes to Protein Engineering to Computational Protein Design
Bibliography I
David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232
Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013
Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, NY), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174
D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179
Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359
James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 09692126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007
Types of Protein Interactions
Protein-ligand binding(drug-target enzyme-substrate)
Protein-nucleotide(DNARNA) binding)
Protein-peptide interaction Protein-protein interaction
Protein-Protein interactions
Adapted from [Perkins et al 2010]
Protein-protein complexes
homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent
[Nooren and Thornton 2003]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40
Protein Specificity and Promiscuity
Multispecificity broad partner specificity(multiple substrates proteins ligands)
Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs
Promiscuity the ability to participate in afunction other than the native one(moonlighting)
Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite
Lock and key Induced fit
[Fischer 1894] [Koshland 1958]
Conformational selection
[Boehr et al 2009]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40
Protein Specificity and Promiscuity The Case of PPIs
PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations
Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)
Transient and PTM-dependent interactions are oftenmissed
Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners
Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate
single-interface multi-interface
[Kim et al 2006]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40
Data Sources
Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites
Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS
Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications
Goals
  Increased catalytic function relative to the parent
  Altered specificity, stereospecificity, or affinity to interacting partners
  Increased stability

Property              Parameters
Thermostability       $T_{50}$
Catalytic activity    $k_{cat}$, $K_M$, $k_{cat}/K_M$
Binding specificity   $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$, $K_d$, $K_I$
Binding affinity      $K_a = 1/K_d$, $\Delta G = -RT \ln(1/K_d)$

A paradigm shift in the last 2 decades
  PCR and recombinant gene technologies
  Recreation of evolution in the lab
  Computer algorithms
Goal 1: Increasing the Thermostability
Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing
Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions
Main design techniques
  Sequence-based design: comparison through multiple alignments
  Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
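As a minimal sketch of how T50 could be read off an inactivation assay, the Python snippet below linearly interpolates between the two temperatures that bracket 50% residual activity. The function name `estimate_t50` and all data points are hypothetical and chosen only for illustration; a real analysis would more likely fit a sigmoidal curve.

```python
def estimate_t50(temps, residual_activity):
    """Interpolate the temperature at which residual activity crosses 50%.

    temps: temperatures in ascending order (deg C)
    residual_activity: fraction of initial activity left after the
        10-minute incubation at each temperature (1.0 = fully active)
    """
    points = list(zip(temps, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        # Look for the interval in which activity drops through 0.5
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            frac = (a_lo - 0.5) / (a_lo - a_hi)
            return t_lo + frac * (t_hi - t_lo)
    raise ValueError("residual activity never crosses 50%")


# Hypothetical data for a parent enzyme and a stabilized variant
temps = [40, 45, 50, 55, 60, 65]
parent = [1.00, 0.95, 0.70, 0.30, 0.05, 0.00]
variant = [1.00, 1.00, 0.95, 0.70, 0.30, 0.05]

t50_parent = estimate_t50(temps, parent)
t50_variant = estimate_t50(temps, variant)
print(f"T50 parent:  {t50_parent:.1f} C")
print(f"T50 variant: {t50_variant:.1f} C")
```

Comparing the two estimates gives the shift in T50 that a stabilizing substitution would be credited with.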
Goal 2: Increasing the Catalytic Activity
How to quantify enzyme activity: the Michaelis-Menten model of kinetics
\[ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \tag{1} \]
\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2} \]
\[ \frac{d[P]}{dt} = k_2 [ES] \tag{3} \]
$k_2$ is also known as $k_{cat}$ or the turnover rate (in more complex cases, $k_{cat}$ is a function of several rates)
$k_{cat}$ alone is not enough: we also need to quantify the affinity of the enzyme for the substrate
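The mass-action scheme in equations (1) to (3) can be explored numerically. The following is a rough sketch using forward-Euler integration; the function name `simulate`, the step size, and all rate constants and concentrations are illustrative assumptions rather than values from the lecture.

```python
def simulate(k1, k_minus1, k2, e0, s0, dt=1e-4, t_end=5.0):
    """Forward-Euler integration of E + S <-> ES -> E + P.

    Returns the final concentrations (s, es, p).
    """
    s, es, p = s0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        e = e0 - es                 # free enzyme, from conservation of total enzyme
        bind = k1 * e * s           # E + S -> ES
        unbind = k_minus1 * es      # ES -> E + S
        cat = k2 * es               # ES -> E + P
        s += dt * (unbind - bind)
        es += dt * (bind - unbind - cat)   # equation (2)
        p += dt * cat                      # equation (3)
    return s, es, p


# Hypothetical parameters (arbitrary but consistent units)
s, es, p = simulate(k1=10.0, k_minus1=1.0, k2=1.0, e0=0.1, s0=10.0)
print(f"[S] = {s:.3f}, [ES] = {es:.4f}, [P] = {p:.3f}")
```

With substrate in large excess over enzyme, [ES] settles quickly to a nearly constant value while [P] grows almost linearly, which is the regime the steady-state treatment relies on.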
Enzyme Kinetics
Assumptions
First assumption (quasi-steady state): the concentration of the substrate-bound enzyme [ES] changes much more slowly than the concentrations of substrate [S] and product [P], so it is approximately constant
\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4} \]
Second assumption: the total concentration of enzyme $[E]_0$ does not change with time
\[ [E]_0 = [E] + [ES] \approx \text{const} \tag{5} \]
The Michaelis constant KM
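Substituting $[E] = [E]_0 - [ES]$ into the steady-state condition:
\[ 0 = k_1 [S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6} \]
\[ k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7} \]
\[ [S][E]_0 = [S][ES] + [ES]\,\frac{k_{-1} + k_2}{k_1} \tag{8} \]
Solving for [ES]:
\[ [ES] = \frac{[E]_0 [S]}{K_M + [S]} \tag{9} \]
where $K_M$ is the Michaelis constant:
\[ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} \]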
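As a quick numerical check of this derivation, the sketch below computes $K_M$ from the rate constants, evaluates the steady-state complex concentration obtained by solving equation (8) for [ES], namely $[E]_0[S]/(K_M + [S])$, and confirms that it makes the right-hand side of equation (6) vanish. Function names and parameter values are hypothetical.

```python
def michaelis_constant(k1, k_minus1, k2):
    """K_M = (k_-1 + k_2) / k_1, as in equation (10)."""
    return (k_minus1 + k2) / k1


def es_steady_state(e0, s, km):
    """Steady-state [ES] obtained by solving equation (8) for [ES]."""
    return e0 * s / (km + s)


# Hypothetical rate constants and concentrations
k1, k_minus1, k2 = 10.0, 1.0, 1.0
e0, s = 0.1, 10.0

km = michaelis_constant(k1, k_minus1, k2)
es = es_steady_state(e0, s, km)

# Right-hand side of equation (6); it should be zero at steady state
residual = k1 * s * (e0 - es) - es * (k_minus1 + k2)
print(f"K_M = {km}, [ES] = {es:.5f}, residual = {residual:.2e}")
```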
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
\[ \frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11} \]
With $v_{max} = k_2 [E]_0$:
\[ v = \frac{v_{max}[S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\, v_{max} \tag{12} \]
$K_M$ can be measured as the substrate concentration [S] at which the rate of product formation is half of the maximum
\[ v = \frac{v_{max}}{2} \tag{13} \]
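Equations (12) and (13) can be checked with a few lines of Python. This is only an illustrative sketch; `mm_rate` and the parameter values are made up for the example.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = vmax * [S] / (K_M + [S]), equation (12)."""
    return vmax * s / (km + s)


vmax, km = 10.0, 2.0  # hypothetical values

v_at_km = mm_rate(km, vmax, km)            # [S] = K_M
v_saturating = mm_rate(1000.0 * km, vmax, km)  # [S] >> K_M

print(f"v at [S] = K_M:  {v_at_km}")
print(f"v at [S] >> K_M: {v_saturating:.3f}")
```

At [S] = K_M the rate is exactly half of v_max, and at saturating substrate it approaches v_max.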
Determining KM from the concentration curve
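One classical way to extract $K_M$ and $v_{max}$ from rate measurements is the double-reciprocal (Lineweaver-Burk) plot, which fits a straight line to 1/v versus 1/[S]. The sketch below applies that technique to noise-free synthetic data; it is not necessarily the method shown on the slide, and nonlinear fitting of the rate law is generally preferred for noisy data. All names and values are assumptions.

```python
def fit_lineweaver_burk(substrate, rates):
    """Estimate (K_M, vmax) by least squares on 1/v = (K_M/vmax)(1/[S]) + 1/vmax."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic data generated from hypothetical "true" parameters
true_km, true_vmax = 2.0, 5.0
substrate = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
rates = [true_vmax * s / (true_km + s) for s in substrate]

km_fit, vmax_fit = fit_lineweaver_burk(substrate, rates)
print(f"fitted K_M = {km_fit:.3f}, fitted vmax = {vmax_fit:.3f}")
```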
Evaluating Enzyme Efficiency
$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates of reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:
\[ \frac{v_A}{v_B} = \frac{(k^A_{cat}/K^A_M)\,[S_A]}{(k^B_{cat}/K^B_M)\,[S_B]} \tag{14} \]
At $[S_A] = [S_B]$, $k_{cat}/K_M$ provides a measure of substrate promiscuity and efficiency
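To make equation (14) concrete, the sketch below compares two substrates of one enzyme. The kinetic parameters are invented for illustration and do not describe any real enzyme.

```python
def specificity_constant(kcat, km):
    """kcat / K_M for one substrate."""
    return kcat / km


def rate_ratio(kcat_a, km_a, s_a, kcat_b, km_b, s_b):
    """v_A / v_B for two competing substrates, equation (14)."""
    return (specificity_constant(kcat_a, km_a) * s_a) / (
        specificity_constant(kcat_b, km_b) * s_b
    )


# Hypothetical parameters: kcat in 1/s, K_M and [S] in mM
native = {"kcat": 100.0, "km": 0.05}
promiscuous = {"kcat": 10.0, "km": 0.5}

ratio = rate_ratio(
    native["kcat"], native["km"], 1.0,
    promiscuous["kcat"], promiscuous["km"], 1.0,
)
print(f"v_A / v_B at equal substrate concentrations: {ratio:.1f}")
```

At equal substrate concentrations the ratio reduces to the ratio of the two specificity constants, here a 100-fold preference for the native substrate.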
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
Protein-nucleotide (DNA/RNA) binding: in transcription, regulation, promoters, etc.
Protein-protein interaction
  Permanent or obligate: in multi-unit proteins; it can have a structural or functional role
  Transient: in signaling, transport, and regulation
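3.1 Protein Binding Affinity
Dissociation constant
\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]
\[ \frac{d[AB]}{dt} = k_1 [A][B] - k_{-1}[AB] \tag{16} \]
In equilibrium
\[ 0 = k_1 [A][B] - k_{-1}[AB] \tag{17} \]
\[ K_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]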
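Given the dissociation constant and the total amounts of both partners, the equilibrium complex concentration follows from equation (18) combined with mass conservation, which leads to a quadratic in [AB]. The sketch below solves it in closed form; the function name and the concentrations are hypothetical.

```python
import math


def complex_concentration(a_total, b_total, kd):
    """Equilibrium [AB] for A + B <-> AB from total concentrations and Kd.

    Uses [A] = a_total - [AB], [B] = b_total - [AB] in Kd = [A][B]/[AB]
    and takes the physically meaningful (smaller) root of the quadratic.
    """
    total = a_total + b_total + kd
    return (total - math.sqrt(total * total - 4.0 * a_total * b_total)) / 2.0


# Hypothetical example: 1 uM of each partner, Kd = 0.1 uM
a_total, b_total, kd = 1.0, 1.0, 0.1
ab = complex_concentration(a_total, b_total, kd)
a_free = a_total - ab
b_free = b_total - ab

print(f"[AB] = {ab:.4f} uM, fraction of A bound = {ab / a_total:.2%}")
print(f"check: [A][B]/[AB] = {a_free * b_free / ab:.4f} uM")
```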
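3.1 Protein Binding Affinity
Affinity constant
\[ K_a = \frac{1}{K_d} \tag{19} \]
In antibodies
\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]
Binding free energy
\[ \Delta G = -RT \ln K_a = -RT \ln \frac{1}{K_d} \tag{21} \]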
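Equation (21) translates a measured dissociation constant into a binding free energy. The sketch below assumes $K_d$ is expressed in mol/L relative to a 1 M standard state and a temperature of 298.15 K; both are assumptions of this example, not statements from the slide.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Delta G in kJ/mol from Kd in mol/L (1 M standard state assumed)."""
    ka = 1.0 / kd_molar                               # equation (19)
    return -R * temperature * math.log(ka) / 1000.0   # equation (21)


dg_nanomolar = binding_free_energy(1e-9)     # hypothetical nanomolar binder
dg_improved = binding_free_energy(1e-10)     # after a 10-fold affinity gain

print(f"Delta G (Kd = 1 nM):   {dg_nanomolar:.2f} kJ/mol")
print(f"Delta G (Kd = 0.1 nM): {dg_improved:.2f} kJ/mol")
print(f"gain per 10-fold:      {dg_improved - dg_nanomolar:.2f} kJ/mol")
```

Because of the logarithm, each 10-fold improvement in $K_d$ contributes the same increment of about $RT \ln 10$ to the binding free energy.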
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding ($K_M$)
Transition-state binding ($K_{tx}$)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation
Binding specificity to some partner is determined by comparing either $k_{cat}/K_M$, $K_a$, or $K_d$ for all partners
$K_I$: inhibition constant, used when an inhibitor competes with a ligand
Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: structurally similar motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one
Allostery: regulation of a protein by binding of some ligand (the effector)
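To illustrate how the inhibition constant enters a rate calculation, the sketch below uses the standard competitive-inhibition rate law, in which the inhibitor raises the apparent $K_M$ by a factor $(1 + [I]/K_I)$. This rate law is textbook material rather than something stated on the slide, and all names and values are hypothetical.

```python
def competitive_rate(s, i, vmax, km, ki):
    """Michaelis-Menten rate with a competitive inhibitor at concentration i."""
    km_apparent = km * (1.0 + i / ki)  # inhibitor raises the apparent K_M
    return vmax * s / (km_apparent + s)


vmax, km, ki = 10.0, 2.0, 0.5  # hypothetical parameters

v_free = competitive_rate(s=2.0, i=0.0, vmax=vmax, km=km, ki=ki)
v_inhibited = competitive_rate(s=2.0, i=0.5, vmax=vmax, km=km, ki=ki)

print(f"rate without inhibitor:  {v_free:.3f}")
print(f"rate with [I] = K_I:     {v_inhibited:.3f}")
```

With the inhibitor present at a concentration equal to $K_I$, the apparent $K_M$ doubles, while $v_{max}$ is unchanged.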
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
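One way to make this statement quantitative, sketched here under the transition-state-theory assumption that both substrates share the same pre-exponential factor (an assumption not stated on the slide), is to relate the ratio of specificity constants to the difference in transition-state free energies, $\Delta\Delta G^{\ddagger} = \Delta G^{\ddagger}_A - \Delta G^{\ddagger}_B$:

```latex
\[
\frac{(k_{cat}/K_M)_A}{(k_{cat}/K_M)_B}
  = \exp\!\left(-\frac{\Delta\Delta G^{\ddagger}}{RT}\right)
\quad\Longrightarrow\quad
\Delta\Delta G^{\ddagger}
  = -RT \ln \frac{(k_{cat}/K_M)_A}{(k_{cat}/K_M)_B}
\]
```

For example, a 100-fold preference for substrate A at 298 K corresponds to $\Delta\Delta G^{\ddagger} \approx -(2.48\ \text{kJ/mol}) \times 4.61 \approx -11.4\ \text{kJ/mol}$, meaning the transition state for A lies about 11.4 kJ/mol lower than that for B.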
Introducing the Substitutions
Site-directed (saturation) mutagenesis
1. The DNA of interest is cloned into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion, or insertion) is annealed to the target region
4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli
Error-prone PCR
Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
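The two strategies above differ in where the diversity lands: saturation mutagenesis explores every substitution at a chosen site, while error-prone PCR scatters mutations along the gene. The following in silico sketch contrasts them. It is a deliberate simplification (uniform substitution probabilities, no codon bias or indels), and the sequences, function names, and error rate are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_library(protein, position):
    """All 19 single substitutions at one 0-based position of a protein."""
    wild_type = protein[position]
    return [
        protein[:position] + aa + protein[position + 1:]
        for aa in AMINO_ACIDS
        if aa != wild_type
    ]


def error_prone_copy(dna, error_rate, rng):
    """Copy a DNA string, replacing each base with probability error_rate."""
    copy = []
    for base in dna:
        if rng.random() < error_rate:
            # Misincorporation: pick one of the three other bases
            copy.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


# Hypothetical parent sequences
library = saturation_library("MKTAYIAK", position=3)
print(f"{len(library)} variants at position 3, e.g. {library[:3]}")

rng = random.Random(0)  # seeded for reproducibility
mutant_gene = error_prone_copy("ATGAAAACCGCGTATATTGCGAAA", error_rate=0.05, rng=rng)
print(f"error-prone copy: {mutant_gene}")
```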
Recombination and DNA-shuffling
A natural approach to making multiple mutations is recombination
Circular permutation: to alter protein topology
DNA shuffling: to perform functional domain or motif shuffling in vitro
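Both operations can be mimicked on sequence strings. The sketch below treats circular permutation as a rotation of the sequence and models shuffling as template switching between pre-aligned, equal-length parents at random crossover points. This is an abstraction of the laboratory procedure, not a model of fragmentation and reassembly chemistry, and all names and sequences are hypothetical.

```python
import random


def circular_permutation(protein, new_start):
    """Rotate the sequence so that residue new_start becomes the new N-terminus."""
    return protein[new_start:] + protein[:new_start]


def shuffle_parents(parents, n_crossovers, rng):
    """Build one chimera by switching parent template at random crossover points."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned to the same length")
    points = sorted(rng.sample(range(1, length), n_crossovers))
    chimera, start = [], 0
    current = rng.randrange(len(parents))
    for end in points + [length]:
        chimera.append(parents[current][start:end])
        start = end
        # Switch to a different parent at each crossover
        current = rng.choice([i for i in range(len(parents)) if i != current])
    return "".join(chimera)


print(circular_permutation("ABCDEF", 2))

rng = random.Random(1)  # seeded for reproducibility
chimera = shuffle_parents(["AAAAAAAA", "BBBBBBBB"], n_crossovers=2, rng=rng)
print(chimera)
```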
Recombinant Protein Folding
E. coli is typically the first choice for expressing a heterologous protein
However, numerous recombinant proteins fail to fold into soluble form when expressed in E. coli
Some misfolding-related issues:
  Multidomain proteins usually require the assistance of folding modulators such as chaperones and/or foldases
  The environment (crowding, pH, osmolarity, etc.)
  Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded protein
  Insoluble: aggregation into inclusion bodies
  Degradation: proteolysis
[Figure: E. coli expressing human leptin as an inclusion body]
Directed Evolution
A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities
An iterative process:
  Identifying a good starting sequence, usually containing some level of latent promiscuity
  Creating a library of variants
  Selecting variants with improved function (mutation and screening)
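The iterative cycle can be sketched as a simple mutate, screen, and select loop. In this toy version the "screen" is a made-up fitness function that counts matches to a hypothetical optimal sequence; a real campaign would measure activity experimentally and would not know the optimum. All names, sequences, and parameters are assumptions for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng):
    """Introduce one random point substitution."""
    pos = rng.randrange(len(seq))
    new_aa = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
    return seq[:pos] + new_aa + seq[pos + 1:]


def fitness(seq, target):
    """Toy screen: number of positions matching a hypothetical optimum."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(start, target, rounds, library_size, rng):
    """Repeat library creation and selection, keeping the best variant so far."""
    parent = start
    history = [fitness(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]  # library creation
        best = max(library, key=lambda s: fitness(s, target))         # screening
        if fitness(best, target) > fitness(parent, target):           # selection
            parent = best
        history.append(fitness(parent, target))
    return parent, history


rng = random.Random(42)  # seeded for reproducibility
final, history = directed_evolution(
    start="AAAAAAAA", target="MKTAYIAK", rounds=20, library_size=30, rng=rng
)
print(f"final variant: {final}")
print(f"fitness per round: {history}")
```

Because the parent is only replaced by a strictly better variant, the recorded fitness never decreases from one round to the next.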
From Natural Enzymes to Protein Engineering to Computational Protein Design
Bibliography I
David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789-796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232
Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering Design and Selection, 21(5):343-351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013
Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938-1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174
D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98-104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179
Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486-3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359
James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233-1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007
Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)
Transient and PTM-dependent interactions are oftenmissed
Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners
Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate
single-interface multi-interface
[Kim et al 2006]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40
Data Sources
Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites
Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS
Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes
to the challenge of generating novel proteins for therapeutic and biomedicalapplications
GoalsIncreased catalytic function related to the parent
Altered specificity stereospecificity or affinity to interacting partners
Increased stability
Property ParametersThermostability T50
Catalytic activity kcat KM kcatKM
Binding specificity (kcatKM )A(kcatKM )B
Kd KI
Binding affinity Ka = 1Kd
∆G = minusRT ln 1Kd
A paradigm shift in the last 2decades
PCR and recombinant genetechnologies
Recreation of evolution in thelab
Computer algorithms
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40
Goal 1 Increasing the Thermostability
Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation
Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions
Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40
Goal 2 Increasing the Catalytic Activity
How to quantify enzyme activity Michaelis-Menten model of kinetics
E + Sk1
kminus1
ES k2
E + P (1)
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) (2)
d [P]
dt= k2[ES] (3)
k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)
kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Protein Specificity and Promiscuity The Case of PPIs
PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations
Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)
Transient and PTM-dependent interactions are oftenmissed
Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners
Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate
single-interface multi-interface
[Kim et al 2006]
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40
Data Sources
Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites
Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS
Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes
to the challenge of generating novel proteins for therapeutic and biomedicalapplications
GoalsIncreased catalytic function related to the parent
Altered specificity stereospecificity or affinity to interacting partners
Increased stability
Property ParametersThermostability T50
Catalytic activity kcat KM kcatKM
Binding specificity (kcatKM )A(kcatKM )B
Kd KI
Binding affinity Ka = 1Kd
∆G = minusRT ln 1Kd
A paradigm shift in the last 2decades
PCR and recombinant genetechnologies
Recreation of evolution in thelab
Computer algorithms
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40
Goal 1 Increasing the Thermostability
Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation
Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions
Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40
Goal 2 Increasing the Catalytic Activity
How to quantify enzyme activity Michaelis-Menten model of kinetics
E + Sk1
kminus1
ES k2
E + P (1)
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) (2)
d [P]
dt= k2[ES] (3)
k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)
kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis
1. The DNA of interest is cloned into a plasmid vector.
2. The plasmid DNA is denatured to produce single strands.
3. A synthetic oligonucleotide carrying the desired mutation (point mutation, deletion or insertion) is annealed to the target region.
4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template.
5. The heteroduplex is propagated by transformation in E. coli.
Error-prone PCR
Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase.
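On the computational side, the outcome of saturation mutagenesis at one site can be enumerated directly at the protein level. The sketch below lists the 19 single-residue variants at a chosen position; the function name, the example peptide and the mutation-label format are illustrative assumptions, and codon-level details (primer design, degenerate codons) are deliberately ignored.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def saturation_library(parent: str, position: int) -> dict[str, str]:
    """Return all 19 single-site variants of `parent` at a 1-based position.

    Keys use the usual mutation notation, e.g. "A4G".
    """
    if not 1 <= position <= len(parent):
        raise ValueError("position outside the parent sequence")
    wild_type = parent[position - 1]
    variants = {}
    for residue in AMINO_ACIDS:
        if residue == wild_type:
            continue  # skip the unmutated parent
        name = f"{wild_type}{position}{residue}"
        variants[name] = parent[:position - 1] + residue + parent[position:]
    return variants

# Hypothetical 8-residue parent, saturated at position 4 (an alanine).
library = saturation_library("MKTAYIAK", 4)
```

Each entry in `library` is a candidate that a scoring model or an experimental screen could then rank.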
Recombination and DNA-shuffling
A natural approach to making multiple mutations is recombination.
Circular permutation to alter protein topology.
DNA-shuffling to perform functional domain or motif shuffling in vitro.
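Both operations have simple sequence-level analogues, sketched below under strong simplifications: the parents are assumed to be pre-aligned and of equal length, crossover points are given explicitly, and the circular permutation joins the old termini directly without a linker. Function names are hypothetical.

```python
def circular_permutation(sequence: str, new_start: int) -> str:
    """Reopen the chain at 0-based index new_start; the old termini are joined."""
    if not 0 <= new_start < len(sequence):
        raise ValueError("new_start outside the sequence")
    return sequence[new_start:] + sequence[:new_start]

def recombine(parent_a: str, parent_b: str, crossovers: list[int]) -> str:
    """Build a chimera from two aligned, equal-length parents.

    The chimera starts on parent_a and switches parent at each crossover index.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    parents = (parent_a, parent_b)
    chimera, current, previous = [], 0, 0
    for point in sorted(crossovers) + [len(parent_a)]:
        chimera.append(parents[current][previous:point])
        current, previous = 1 - current, point
    return "".join(chimera)

permuted = circular_permutation("ABCDEF", 2)
chimera = recombine("AAAAAA", "BBBBBB", [2, 4])
```

With several parents and random crossover points, the same idea gives a crude in silico picture of a shuffled library.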
Recombinant Protein Folding
E. coli is typically the first choice for expressing a heterologous protein.
However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.
Some misfolding-related issues:
  Multidomain proteins usually require the assistance of folding modulators such as chaperones and/or foldases.
  The environment (crowding, pH, osmolarity, etc.).
  Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments).
Two possible outcomes for a misfolded protein:
  Aggregation into insoluble inclusion bodies.
  Degradation by proteolysis.
[Figure: E. coli expressing human leptin as inclusion bodies]
Directed Evolution
A remarkable property of proteins is their evolvability: under selection pressure they can adapt by changing their behavior, function or even fold.
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.
An iterative process:
  Identifying a good starting sequence, usually one with some level of latent promiscuity.
  Creating a library of variants.
  Selecting variants with improved function (mutation and screening).
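The iterative process above can be mimicked with a toy simulation. Everything here is an illustrative assumption: the "screen" simply counts matches to a hidden target sequence, random substitution stands in for error-prone PCR, and the sequences are made up. A real campaign replaces `toy_fitness` with an experimental assay.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(sequence: str, rate: float, rng: random.Random) -> str:
    """Random point substitutions, a crude stand-in for error-prone PCR."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else residue
        for residue in sequence
    )

def toy_fitness(sequence: str, target: str) -> int:
    """Hypothetical screen: number of positions matching a hidden target."""
    return sum(a == b for a, b in zip(sequence, target))

def directed_evolution(parent, target, rounds=30, library_size=200, rate=0.05, seed=0):
    """Iterate library creation and selection; return the best variant and its history."""
    rng = random.Random(seed)
    best = parent
    history = [toy_fitness(best, target)]
    for _ in range(rounds):
        library = [mutate(best, rate, rng) for _ in range(library_size)]
        library.append(best)  # keep the current parent so fitness never drops
        best = max(library, key=lambda s: toy_fitness(s, target))
        history.append(toy_fitness(best, target))
    return best, history

best, history = directed_evolution("MKTAYIAKQR", "MRTGYLAKQW")
```

Keeping the current parent in each library makes the fitness trajectory monotonic, which is a design choice of this sketch rather than a property of laboratory evolution.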
From Natural Enzymes to Protein Engineering to Computational Protein Design
Data Sources
Enzymatic activity
  BRENDA: experimental parameters
  KEGG, MetaCyc: metabolic networks
  Catalytic Site Atlas: catalytic sites
Data validation and prediction
  GeneMANIA: lists of genes with functionally similar or shared properties
  STRING: based on genomic context, high-throughput (HT) experiments, co-expression and literature
  ComPASS: assigns confidence to an interaction detected by mass spectrometry (MS)
Primary PPI databases
  DIP, BioGRID, IntAct, MINT
  Common languages: PSICQUIC (expression, co-localization, genetic, metabolic, signaling pathways, experimental data), SBML
  Building the network: Cytoscape
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications.
Goals
  Increased catalytic function relative to the parent
  Altered specificity, stereospecificity or affinity to interacting partners
  Increased stability
Property: Parameters
  Thermostability: $T_{50}$
  Catalytic activity: $k_{cat}$, $K_M$, $k_{cat}/K_M$
  Binding specificity: $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$; $K_d$, $K_I$
  Binding affinity: $K_a = 1/K_d$; $\Delta G = -RT \ln(1/K_d)$
A paradigm shift in the last 2 decades
  PCR and recombinant gene technologies
  Recreation of evolution in the lab
  Computer algorithms
Goal 1: Increasing the Thermostability
Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing.
Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the protein is inactivated in 10 minutes.
Increasing the thermostability can be considered the first step in protein engineering, because it makes the protein tolerant to a greater range of amino acid substitutions.
Main design techniques:
  Sequence-based design: comparison through multiple alignments.
  Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures.
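As an illustration of the $T_{50}$ definition, the sketch below estimates it from residual-activity measurements taken after a fixed incubation at several temperatures. Linear interpolation between the two points that bracket 50% is an assumption made here for simplicity (a sigmoidal fit is a common alternative), and the data values are invented.

```python
def estimate_t50(temperatures, residual_activity):
    """Linearly interpolate the temperature at which residual activity is 0.5.

    temperatures: increasing incubation temperatures (deg C).
    residual_activity: fraction of initial activity left after the fixed
    incubation time (10 minutes in the definition above).
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("residual activity never crosses 50% in the measured range")

# Hypothetical measurements, for illustration only.
t50 = estimate_t50([40, 45, 50, 55, 60], [1.00, 0.95, 0.70, 0.30, 0.05])
```

Comparing `t50` between a parent and a variant gives the shift in thermostability that a design round achieved.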
Goal 2: Increasing the Catalytic Activity
How to quantify enzyme activity: the Michaelis-Menten model of kinetics
$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \tag{1}$$
$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \tag{2}$$
$$\frac{d[P]}{dt} = k_2[ES] \tag{3}$$
$k_2$ is also known as $k_{cat}$ or the turnover rate (in more complex cases $k_{cat}$ is a function of several rates).
$k_{cat}$ alone is not enough: we also need to quantify the affinity of the enzyme for the substrate.
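The rate equations (2) and (3), together with the matching equations for $[E]$ and $[S]$ implied by scheme (1), can be integrated numerically. The sketch below uses a hand-written fixed-step fourth-order Runge-Kutta scheme so that it needs no external library; the rate constants, concentrations and units are arbitrary illustrative values.

```python
def simulate_michaelis_menten(e0, s0, k1, k_minus1, k2, dt=1e-3, t_end=5.0):
    """Integrate E + S <-> ES -> E + P with a fixed-step RK4 scheme.

    Returns the final concentrations (E, S, ES, P).
    """
    def derivatives(state):
        e, s, es, p = state
        bind = k1 * e * s          # E + S -> ES
        release = k_minus1 * es    # ES -> E + S
        turnover = k2 * es         # ES -> E + P
        d_es = bind - release - turnover   # equation (2)
        d_p = turnover                     # equation (3)
        return (-d_es, release - bind, d_es, d_p)

    state = (e0, s0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        k_a = derivatives(state)
        k_b = derivatives(tuple(x + 0.5 * dt * k for x, k in zip(state, k_a)))
        k_c = derivatives(tuple(x + 0.5 * dt * k for x, k in zip(state, k_b)))
        k_d = derivatives(tuple(x + dt * k for x, k in zip(state, k_c)))
        state = tuple(
            x + dt * (a + 2 * b + 2 * c + d) / 6
            for x, a, b, c, d in zip(state, k_a, k_b, k_c, k_d)
        )
    return state

# Arbitrary illustrative parameters (consistent but unspecified units).
e, s, es, p = simulate_michaelis_menten(e0=1.0, s0=10.0, k1=10.0, k_minus1=1.0, k2=1.0)
```

Total enzyme ($[E] + [ES]$) and total substrate mass ($[S] + [ES] + [P]$) are conserved by the scheme, which is a convenient sanity check on any implementation.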
Enzyme Kinetics
Assumptions
First assumption: the concentration of the substrate-bound enzyme $[ES]$ changes much more slowly than the concentrations of substrate $[S]$ and product $[P]$, so it can be treated as approximately constant (steady state):
$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4}$$
Second assumption: the total concentration of enzyme $[E]_0$ does not change with time:
$$[E]_0 = [E] + [ES] \approx \text{const} \tag{5}$$
The Michaelis constant $K_M$
$$0 = k_1[S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6}$$
$$k_1[S][E]_0 = k_1[S][ES] + [ES](k_{-1} + k_2) \tag{7}$$
$$[S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \tag{8}$$
$K_M$: Michaelis constant
$$K_M = \frac{k_{-1} + k_2}{k_1} \tag{10}$$
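One step is left implicit between (8) and the flux expression (11). Substituting the definition (10) into (8) and solving for $[ES]$ gives the steady-state concentration of the complex; this is a worked step added for clarity, using only the symbols defined above.

```latex
[S][E]_0 = [S][ES] + K_M[ES] = [ES]\,\bigl([S] + K_M\bigr)
\quad\Longrightarrow\quad
[ES] = \frac{[E]_0\,[S]}{K_M + [S]}
```

Multiplying this expression by $k_2$ gives the rate of product formation.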
The Michaelis Constant $K_M$ and the steady-state flux
Rate of product formation (flux):
$$\frac{d[P]}{dt} = v = k_2[ES] = k_2[E]_0\frac{[S]}{K_M + [S]} \tag{11}$$
With $v_{max} = k_2[E]_0$:
$$v = \frac{v_{max}[S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\, v_{max} \tag{12}$$
$K_M$ can be measured as the substrate concentration $[S]$ at which the rate of product formation is half of the maximum:
$$v = \frac{v_{max}}{2} \tag{13}$$
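Equation (12) is easy to check numerically. The short sketch below evaluates the rate law and confirms the half-maximum property of $K_M$; the function name and the parameter values are illustrative assumptions.

```python
def michaelis_menten_rate(s, v_max, k_m):
    """Steady-state rate v = v_max * [S] / (K_M + [S]), equation (12)."""
    if s < 0 or k_m <= 0:
        raise ValueError("need [S] >= 0 and K_M > 0")
    return v_max * s / (k_m + s)

V_MAX, K_M = 10.0, 2.0  # arbitrary illustrative values

rate_at_km = michaelis_menten_rate(K_M, V_MAX, K_M)             # half of V_MAX
rate_saturated = michaelis_menten_rate(1000 * K_M, V_MAX, K_M)  # close to V_MAX
rate_dilute = michaelis_menten_rate(K_M / 1000, V_MAX, K_M)     # about (V_MAX/K_M)*[S]
```

The dilute limit shows why $k_{cat}/K_M$ acts as an apparent second-order rate constant when substrate is scarce.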
Determining KM from the concentration curve
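Besides reading $K_M$ off the saturation curve, the parameters can be estimated from measured rates. The sketch below uses the double-reciprocal (Lineweaver-Burk) linearization, $1/v = (K_M/v_{max})(1/[S]) + 1/v_{max}$, which is a technique swapped in here for illustration and not taken from the slide; with noisy data a nonlinear fit is generally preferable. The data are synthetic and noise-free.

```python
def fit_lineweaver_burk(substrate, rates):
    """Estimate (v_max, K_M) by least squares on 1/v versus 1/[S]."""
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in rates]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                     # equals K_M / v_max
    intercept = mean_y - slope * mean_x   # equals 1 / v_max
    v_max = 1.0 / intercept
    return v_max, slope * v_max

# Synthetic data generated from v_max = 10 and K_M = 2 (hypothetical values).
substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
rates = [10.0 * s / (2.0 + s) for s in substrate]
v_max_fit, k_m_fit = fit_lineweaver_burk(substrate, rates)
```

Because the reciprocal transform amplifies errors at low $[S]$, this fit is best treated as a quick first estimate.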
Evaluating Enzyme Efficiency
$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates at which an enzyme transforms pairs of substrates.
For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:
$$\frac{v_A}{v_B} = \frac{(k^A_{cat}/K^A_M)[S_A]}{(k^B_{cat}/K^B_M)[S_B]} \tag{14}$$
At $[S_A] = [S_B]$, $k_{cat}/K_M$ provides a measure of the efficiency of substrate promiscuity.
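Equation (14) can be checked against the standard rate laws for two substrates competing for one active site, where both rates share the denominator $1 + [S_A]/K^A_M + [S_B]/K^B_M$. That competitive model is an assumption brought in for this sketch, and all numbers are made up.

```python
def competing_rate_ratio(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """v_A / v_B from the specificity constants, equation (14)."""
    return (kcat_a / km_a * conc_a) / (kcat_b / km_b * conc_b)

def competitive_rates(e_total, kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """Individual rates when A and B compete for the same active site."""
    denominator = 1.0 + conc_a / km_a + conc_b / km_b
    v_a = kcat_a * e_total * (conc_a / km_a) / denominator
    v_b = kcat_b * e_total * (conc_b / km_b) / denominator
    return v_a, v_b

# Hypothetical native substrate A and promiscuous substrate B, equal concentrations.
params = dict(kcat_a=100.0, km_a=0.1, conc_a=0.5, kcat_b=10.0, km_b=1.0, conc_b=0.5)
ratio = competing_rate_ratio(**params)
v_a, v_b = competitive_rates(e_total=1e-3, **params)
```

The shared denominator cancels in `v_a / v_b`, which is why the ratio depends only on the specificity constants and the substrate concentrations.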
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate.
Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
Protein-protein interaction:
  Permanent or obligate: in multi-unit proteins; it can have a structural or functional role.
  Transient: in signaling, transport and regulation.
3.1 Protein Binding Affinity
Dissociation constant
$$A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15}$$
$$\frac{d[AB]}{dt} = k_1[A][B] - k_{-1}[AB] \tag{16}$$
At equilibrium:
$$0 = k_1[A][B] - k_{-1}[AB] \tag{17}$$
$$K_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18}$$
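Equation (18) relates free concentrations, whereas experiments usually fix total concentrations. Writing $x = [AB]$, $[A] = [A]_t - x$ and $[B] = [B]_t - x$ turns (18) into a quadratic in $x$. The sketch below solves it for the physically meaningful root; names and values are illustrative.

```python
import math

def complex_concentration(a_total, b_total, k_d):
    """Equilibrium [AB] for A + B <-> AB given total concentrations and K_d.

    Solves K_d = ([A]_t - x)([B]_t - x) / x for the root with
    0 <= x <= min([A]_t, [B]_t).
    """
    total = a_total + b_total + k_d
    return (total - math.sqrt(total * total - 4.0 * a_total * b_total)) / 2.0

# Arbitrary example: equal totals, each equal to K_d (same concentration units).
a_t, b_t, k_d = 1.0, 1.0, 1.0
ab = complex_concentration(a_t, b_t, k_d)
fraction_bound = ab / a_t
```

When one partner is in large excess, `fraction_bound` approaches the familiar hyperbolic form $[B]/(K_d + [B])$.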
Affinity constant
$$K_a = \frac{1}{K_d} \tag{19}$$
In antibodies:
$$Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20}$$
Binding free energy
$$\Delta G = -RT \ln K_a = -RT \ln\frac{1}{K_d} \tag{21}$$
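Equation (21) converts a measured $K_d$ into a binding free energy. The sketch below assumes the usual convention, not stated on the slide, that $K_d$ is expressed in mol/L relative to a 1 M standard state, so that the argument of the logarithm is dimensionless.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)

def binding_free_energy(k_d_molar, temperature_k=298.15):
    """Delta G = -RT ln(K_a) = RT ln(K_d), in J/mol, with K_d relative to 1 M."""
    if k_d_molar <= 0:
        raise ValueError("K_d must be positive")
    return R * temperature_k * math.log(k_d_molar)

dg_nanomolar = binding_free_energy(1e-9)   # about -51.4 kJ/mol at 25 C
dg_per_decade = binding_free_energy(1e-10) - binding_free_energy(1e-9)
```

Each tenfold improvement in $K_d$ corresponds to about 5.7 kJ/mol of additional binding free energy at 25 °C, a useful yardstick when ranking designed variants.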
Overview of Protein Engineering Technology
From a need to adjust enzyme properties for industrial processes
to the challenge of generating novel proteins for therapeutic and biomedicalapplications
GoalsIncreased catalytic function related to the parent
Altered specificity stereospecificity or affinity to interacting partners
Increased stability
Property ParametersThermostability T50
Catalytic activity kcat KM kcatKM
Binding specificity (kcatKM )A(kcatKM )B
Kd KI
Binding affinity Ka = 1Kd
∆G = minusRT ln 1Kd
A paradigm shift in the last 2decades
PCR and recombinant genetechnologies
Recreation of evolution in thelab
Computer algorithms
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40
Goal 1 Increasing the Thermostability
Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation
Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions
Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40
Goal 2 Increasing the Catalytic Activity
How to quantify enzyme activity Michaelis-Menten model of kinetics
E + Sk1
kminus1
ES k2
E + P (1)
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) (2)
d [P]
dt= k2[ES] (3)
k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)
kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
3.1 Protein Binding Affinity
Affinity constant
$$k_a = \frac{1}{k_d} \tag{19}$$
In antibodies:
$$Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20}$$
Binding free energy
$$\Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21}$$
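To make equations (18), (19) and (21) concrete, the sketch below converts a pair of rate constants into k_d, k_a and a binding free energy. The rate constants are hypothetical, and the logarithm assumes k_d is expressed relative to a 1 M standard state, which the slides do not state explicitly.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def dissociation_constant(k_off, k_on):
    """k_d = k_-1 / k_1 (equation 18); in M when k_on is in 1/(M*s) and k_off in 1/s."""
    return k_off / k_on


def binding_free_energy(kd, temperature=298.15):
    """Delta G = -RT ln(k_a) = RT ln(k_d), in J/mol (equation 21)."""
    return R * temperature * math.log(kd)


kd = dissociation_constant(k_off=1e-3, k_on=1e6)  # hypothetical rate constants
ka = 1.0 / kd
dg = binding_free_energy(kd)
print(f"kd = {kd:.1e} M, ka = {ka:.1e} 1/M, dG = {dg / 1000:.1f} kJ/mol")
```

A smaller k_d (tighter binding) gives a more negative binding free energy.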
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (K_M)
Transition-state binding (K_tx)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation
Binding specificity to some partner is determined by comparing either k_cat/K_M, k_a, or k_d for all partners
K_I: inhibition constant, used when an inhibitor competes with a ligand
Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands)
  - Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  - Proteins or peptides: similar structural motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one
Allostery: regulation of a protein by binding of some ligand (the effector)
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
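The statement above can be made quantitative under a transition-state-theory assumption that is not spelled out on the slide: if both substrates share the same pre-exponential factor, the ratio of their specificity constants is (k_cat/K_M)_A / (k_cat/K_M)_B = exp(-ΔΔG‡/RT), where ΔΔG‡ is the difference in transition-state free energy measured from free enzyme plus free substrate. The sketch below converts between the two quantities; the function names and input values are hypothetical.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def ddg_from_specificity(kcat_km_a, kcat_km_b, temperature=298.15):
    """Transition-state free energy difference G_A - G_B (J/mol) implied by two kcat/KM values."""
    return -R * temperature * math.log(kcat_km_a / kcat_km_b)


def specificity_from_ddg(ddg, temperature=298.15):
    """Ratio (kcat/KM)_A / (kcat/KM)_B implied by a free energy difference in J/mol."""
    return math.exp(-ddg / (R * temperature))


# Hypothetical specificity constants in 1/(M*s): substrate A is preferred 100-fold.
ddg = ddg_from_specificity(1.0e6, 1.0e4)
print(f"ddG = {ddg / 1000:.1f} kJ/mol")
```

A 100-fold preference at room temperature corresponds to a transition state that is lower by roughly 11 kJ/mol.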
Introducing the Substitutions
Site-directed (saturation) mutagenesis
1. The DNA of interest is cloned into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion, or insertion) is annealed to the target region
4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli
Error-prone PCR
Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
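On the computational side, the outcome of saturating a single site can be enumerated in silico. The sketch below lists all 19 single-residue substitutions at one position of a protein sequence. It models only the resulting protein variants, not the laboratory steps listed above, and the parent sequence and function name are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def saturation_library(sequence, position):
    """Return all single-site variants at a 1-based position as {mutation name: sequence}."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position out of range")
    wild_type = sequence[position - 1]
    library = {}
    for residue in AMINO_ACIDS:
        if residue == wild_type:
            continue  # skip the parent residue itself
        name = f"{wild_type}{position}{residue}"
        library[name] = sequence[:position - 1] + residue + sequence[position:]
    return library


parent = "MKTAYIAKQR"  # hypothetical parent sequence
variants = saturation_library(parent, 4)
print(len(variants), "variants, for example", sorted(variants)[:3])
```

Mutation names follow the usual wild-type, position, new-residue convention (for example A4G).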
Recombination and DNA-shuffling
A natural approach to making multiple mutations is recombination
Circular permutation to alter protein topology
DNA-shuffling to perform functional domain or motif shuffling in vitro
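At the sequence level, both operations can be mimicked with simple string manipulation. This is a toy sketch under strong assumptions: real circular permutation usually joins the original termini through a linker, and real DNA shuffling reassembles random fragments of homologous genes by PCR. Both function names and all sequences are hypothetical.

```python
def circular_permutation(sequence, new_start):
    """Rotate the sequence so that 0-based index new_start becomes the new first residue."""
    new_start %= len(sequence)
    return sequence[new_start:] + sequence[:new_start]


def crossover(parent_a, parent_b, points):
    """Build a chimera from two aligned, equal-length parents.

    The template switches to the other parent at each 0-based index in points.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    parents = (parent_a, parent_b)
    cuts = set(points)
    current = 0
    chimera = []
    for i in range(len(parent_a)):
        if i in cuts:
            current = 1 - current  # switch template
        chimera.append(parents[current][i])
    return "".join(chimera)


print(crossover("AAAAAAAAAA", "BBBBBBBBBB", [3, 7]))
print(circular_permutation("MKTAYIAKQR", 4))
```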
Recombinant Protein Folding
E. coli is typically the first choice for expressing a heterologous protein
However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli
Some misfolding-related issues:
  - Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases
  - The environment (crowding, pH, osmolarity, etc.)
  - Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded protein:
  - Insoluble aggregation into inclusion bodies
  - Degradation (proteolysis)
Figure: E. coli expressing human leptin as an inclusion body
Directed Evolution
A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities
An iterative process:
  - Identifying a good starting sequence, usually one with some level of latent promiscuity
  - Creating a library of variants
  - Selecting variants with improved function (mutation and screening)
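The iterative loop can be caricatured in a few lines of Python. This is a toy sketch, not a model of any real screen: the fitness function simply counts matches to a hypothetical optimal sequence, each library member carries one random substitution, and the best variant is kept only if it improves on the current parent. All names and parameter values are assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fitness(sequence, target):
    """Toy stand-in for a screen: number of positions matching a hypothetical optimum."""
    return sum(a == b for a, b in zip(sequence, target))


def mutate(sequence, rng):
    """Introduce one random point substitution."""
    pos = rng.randrange(len(sequence))
    choices = [aa for aa in AMINO_ACIDS if aa != sequence[pos]]
    return sequence[:pos] + rng.choice(choices) + sequence[pos + 1:]


def directed_evolution(parent, target, rounds=30, library_size=50, seed=0):
    """Run rounds of library creation and selection; return (best sequence, fitness history)."""
    rng = random.Random(seed)
    best = parent
    history = [fitness(best, target)]
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        champion = max(library, key=lambda s: fitness(s, target))
        if fitness(champion, target) > fitness(best, target):
            best = champion  # carry the improved variant into the next round
        history.append(fitness(best, target))
    return best, history


target = "MKTAYIAKQR"  # hypothetical optimum
start = "MKTGGGAKQR"   # hypothetical starting sequence
best, history = directed_evolution(start, target)
print(history[0], "->", history[-1])
```

Because a variant is accepted only when it scores higher, the fitness history never decreases; how quickly it climbs depends on the library size and the random seed.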
From Natural Enzymes to Protein Engineering to Computational Protein Design
Goal 1: Increasing the Thermostability
Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing
Thermostability is typically measured experimentally by T50, the temperature at which 50% of the proteins are inactivated in 10 minutes
Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions
Main design techniques:
  - Sequence-based design: comparison through multiple alignments
  - Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
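Given residual activities measured after a 10-minute incubation at several temperatures, T50 can be estimated by interpolation. The sketch below uses simple linear interpolation between the two measurements that bracket 50%. This is one possible estimator chosen for illustration (a sigmoidal fit is another), and the data are hypothetical.

```python
def estimate_t50(temperatures, residual_activity):
    """Linearly interpolate the temperature at which residual activity (%) crosses 50."""
    points = sorted(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 50.0) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("activity never crosses 50% in the measured range")


# Hypothetical residual activities (%) after a 10-minute incubation at each temperature (deg C).
temps = [40, 45, 50, 55, 60, 65]
activity = [100, 98, 85, 60, 30, 5]

t50 = estimate_t50(temps, activity)
print(f"T50 ~ {t50:.1f} C")
```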
Goal 2: Increasing the Catalytic Activity
How to quantify enzyme activity? Michaelis-Menten model of kinetics:
$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \tag{1}$$
$$\frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2}$$
$$\frac{d[P]}{dt} = k_2 [ES] \tag{3}$$
k_2 is also known as k_cat or turnover rate (in more complex cases, k_cat is a function of several rates)
k_cat alone is not enough: we need to quantify the affinity of the enzyme for the substrate
Enzyme Kinetics
Assumptions
First assumption (steady state): the concentration of the substrate-bound enzyme [ES] changes much more slowly than the concentrations of substrate [S] and product [P], so it can be treated as approximately constant
$$\frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4}$$
Second assumption: the total concentration of enzyme [E]_0 does not change with time
$$[E]_0 = [E] + [ES] \approx \text{const} \tag{5}$$
The Michaelis constant K_M
$$0 = k_1 [S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6}$$
$$k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7}$$
$$[S][E]_0 = [S][ES] + [ES]\,\frac{k_{-1} + k_2}{k_1} \tag{8}$$
Solving for [ES]:
$$[ES] = \frac{[E]_0 [S]}{[S] + \frac{k_{-1} + k_2}{k_1}} \tag{9}$$
K_M: Michaelis constant
$$K_M = \frac{k_{-1} + k_2}{k_1} \tag{10}$$
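The steady-state result can be checked numerically. Setting equation (6) to zero and solving for [ES] gives [ES] = [E]_0 [S] / (K_M + [S]). The sketch below integrates the full mass-action equations with a forward-Euler scheme and compares the simulated [ES] with that expression. The rate constants, concentrations and step size are hypothetical, and the equation used for d[S]/dt (not shown on the slides) is the standard mass-action form.

```python
def simulate(k1, k_minus1, k2, e0, s0, t_end, dt=1e-3):
    """Forward-Euler integration of equations (2)-(3), with [E] = [E]0 - [ES]."""
    s, es, p = s0, 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        e = e0 - es                 # free enzyme from conservation, equation (5)
        bind = k1 * e * s           # E + S -> ES
        unbind = k_minus1 * es      # ES -> E + S
        cat = k2 * es               # ES -> E + P
        s += (unbind - bind) * dt   # assumed mass-action equation for d[S]/dt
        es += (bind - unbind - cat) * dt
        p += cat * dt
    return s, es, p


# Hypothetical parameters with substrate in large excess over enzyme.
K1, K_MINUS1, K2 = 1.0, 1.0, 0.5
E0, S0 = 0.1, 10.0

s, es, p = simulate(K1, K_MINUS1, K2, E0, S0, t_end=2.0)
km = (K_MINUS1 + K2) / K1
es_qss = E0 * s / (km + s)
print(f"simulated [ES] = {es:.5f}, steady-state estimate = {es_qss:.5f}")
```

With substrate in large excess the two values agree closely; the approximation degrades as [E]_0 becomes comparable to [S] + K_M.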
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Goal 2 Increasing the Catalytic Activity
How to quantify enzyme activity Michaelis-Menten model of kinetics
E + Sk1
kminus1
ES k2
E + P (1)
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) (2)
d [P]
dt= k2[ES] (3)
k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)
kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Enzyme Kinetics
AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]
d [ES]
dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)
Second assumption the total concentration of enzyme [E ]0 does not changewith time
[E ]0 = [E ] + [ES] asymp const (5)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
The Michaelis constant KM
0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)
k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)
[S][E ]0 = [S][ES] + [ES]kminus1 + k2
k1(8)
(9)
KM Michaelis constant
KM =kminus1 + k2
k1(10)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux)
d [P]
dt= v = k2[ES] = k2[E ]0
[S]
KM + [S](11)
v =vmax [S]
KM + [S]=
11 + KM
[S]
vmax (12)
KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum
v =vmax
2(13)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40
Determining KM from the concentration curve
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E. coli is typically the first choice for expressing a heterologous protein
However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli
Some misfolding-related issues:
  Multidomain proteins usually require the assistance of folding modulators such as chaperones and/or foldases
  The environment (crowding, pH, osmolarity, etc.)
  Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded protein:
  Insoluble aggregation into inclusion bodies
  Degradation by proteolysis
[Figure: E. coli expressing human leptin as inclusion bodies]
Directed Evolution
A remarkable property of proteins is their evolvability: they can adapt under pressure of selection by changing their behavior, function, or even fold
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities
An iterative process:
  Identifying a good starting sequence, usually containing some level of latent promiscuity
  Creation of a library of variants
  Selecting variants with improved function (mutation and screening)
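The iterative process can be illustrated with a toy simulation. This is only a sketch under strong assumptions: the "screen" is a hypothetical stand-in that counts matches to an invented target sequence (a real screen measures activity experimentally), and the parent sequence, target sequence, mutation rate, and library size are all made up for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(sequence, rate, rng):
    """Random mutagenesis: each position is redrawn with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa
        for aa in sequence
    )

def screen(sequence, target):
    """Hypothetical screen: number of positions matching an assumed target."""
    return sum(a == b for a, b in zip(sequence, target))

def directed_evolution(parent, target, rounds=20, library_size=200,
                       rate=0.05, seed=0):
    """Iterate library creation and selection, keeping the best variant."""
    rng = random.Random(seed)
    best = parent
    history = [screen(best, target)]
    for _ in range(rounds):
        # Create a library of variants from the current best sequence
        library = [mutate(best, rate, rng) for _ in range(library_size)]
        library.append(best)  # keep the parent so the score never decreases
        # Select the variant with the highest screening score
        best = max(library, key=lambda s: screen(s, target))
        history.append(screen(best, target))
    return best, history

# Invented sequences, for illustration only
PARENT = "MKTAYIAKQR"
TARGET = "MKSAYLAKHR"
best_variant, score_history = directed_evolution(PARENT, TARGET)
print(best_variant, score_history)
```

Because the parent is retained in every library, the score recorded per round cannot decrease; how fast it rises depends on the assumed mutation rate and library size.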
From Natural Enzymes to Protein Engineering to Computational Protein Design
The Michaelis constant KM
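\[ 0 = k_1 [S] \left([E]_0 - [ES]\right) - [ES]\,(k_{-1} + k_2) \tag{6} \]
\[ k_1 [S] [E]_0 = k_1 [S] [ES] + [ES]\,(k_{-1} + k_2) \tag{7} \]
\[ [S] [E]_0 = [S] [ES] + [ES]\,\frac{k_{-1} + k_2}{k_1} \tag{8} \]
\[ [ES] = \frac{[E]_0 [S]}{\frac{k_{-1} + k_2}{k_1} + [S]} \tag{9} \]
K_M: Michaelis constant
\[ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} \]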
The Michaelis Constant KM and the steady-state flux
Rate of product formation (flux):
\[ \frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11} \]
\[ v = \frac{v_{max} [S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\, v_{max} \tag{12} \]
K_M can be measured as the substrate concentration [S] at which the rate of product formation is half of the maximum:
\[ v = \frac{v_{max}}{2} \tag{13} \]
Determining KM from the concentration curve
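One common way to read K_M off a measured concentration curve is the double-reciprocal (Lineweaver-Burk) linearization, 1/v = (K_M/v_max)(1/[S]) + 1/v_max. The slide does not say which method is intended, so the sketch below is an assumption; the data are synthetic and noise-free, and the function names and the values of v_max and K_M are invented for illustration. With noisy data a direct nonlinear fit is usually preferable.

```python
def michaelis_menten(s, vmax, km):
    """Steady-state rate v = vmax*[S] / (KM + [S])."""
    return vmax * s / (km + s)

def fit_lineweaver_burk(s_values, v_values):
    """Least-squares line through (1/[S], 1/v); returns (vmax, km)."""
    xs = [1.0 / s for s in s_values]
    ys = [1.0 / v for v in v_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # equals KM / vmax
    intercept = mean_y - slope * mean_x   # equals 1 / vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km

# Synthetic measurements generated from assumed parameters
TRUE_VMAX = 10.0
TRUE_KM = 2.5
substrate = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
rates = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in substrate]

vmax_est, km_est = fit_lineweaver_burk(substrate, rates)
half_max_rate = michaelis_menten(km_est, vmax_est, km_est)
print(vmax_est, km_est, half_max_rate)
```

The last line checks the half-maximum property: evaluating the rate at [S] = K_M gives v_max/2.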
Evaluating Enzyme Efficiency
kcat/KM is often used as a specificity constant to compare the relative rates of reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates S_A, S_B at rates v_A, v_B:
\[ \frac{v_A}{v_B} = \frac{\left(k_{cat}^A / K_M^A\right) [S_A]}{\left(k_{cat}^B / K_M^B\right) [S_B]} \tag{14} \]
At [S_A] = [S_B], kcat/KM provides a measure of substrate promiscuity efficiency
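As a worked instance of Eq. (14), the sketch below compares two substrates at equal concentration. All kinetic values are hypothetical numbers chosen for easy arithmetic, not measurements, and the function names are illustrative.

```python
def specificity_constant(kcat, km):
    """Specificity constant kcat/KM."""
    return kcat / km

def rate_ratio(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """Relative rate vA/vB for two substrates competing for one enzyme."""
    numerator = specificity_constant(kcat_a, km_a) * conc_a
    denominator = specificity_constant(kcat_b, km_b) * conc_b
    return numerator / denominator

# Hypothetical parameters: kcat in 1/s, KM and concentrations in mM
ratio = rate_ratio(kcat_a=100.0, km_a=0.5, conc_a=1.0,
                   kcat_b=10.0, km_b=2.0, conc_b=1.0)
print(ratio)
```

With these assumed numbers the specificity constants are 200 and 5, so at equal concentrations substrate A is converted 40 times faster than substrate B.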
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
Protein-protein interaction
  Permanent or obligate: in multi-unit proteins; it could have a structural or functional role
  Transient: in signaling, transport, and regulation
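3.1 Protein Binding Affinity
Dissociation constant
\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]
\[ \frac{d[AB]}{dt} = k_1 [A][B] - k_{-1} [AB] \tag{16} \]
In equilibrium
\[ 0 = k_1 [A][B] - k_{-1} [AB] \tag{17} \]
\[ k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]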
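3.1 Protein Binding Affinity
Affinity constant
\[ k_a = \frac{1}{k_d} \tag{19} \]
In antibodies
\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]
Binding free energy
\[ \Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21} \]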
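Eq. (21) can be evaluated numerically as follows. This is a minimal sketch that assumes k_d is expressed in mol/L relative to a 1 M standard state and a temperature of 298.15 K; neither assumption is stated on the slide, and the function name and example values are illustrative.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)

def binding_free_energy(kd_molar, temperature_kelvin=298.15):
    """Binding free energy in J/mol from a dissociation constant.

    Assumes kd is given in mol/L relative to a 1 M standard state.
    """
    ka = 1.0 / kd_molar  # affinity constant, Eq. (19)
    return -GAS_CONSTANT * temperature_kelvin * math.log(ka)

# Hypothetical binders: nanomolar versus micromolar dissociation constants
dg_nanomolar = binding_free_energy(1e-9)
dg_micromolar = binding_free_energy(1e-6)
print(dg_nanomolar / 1000.0, dg_micromolar / 1000.0)  # kJ/mol
```

A smaller k_d gives a more negative free energy: under these assumptions each tenfold decrease in k_d lowers the binding free energy by RT ln 10, a little under 6 kJ/mol at room temperature.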
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM)
Transition-state binding (Ktx)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation
Binding specificity to some partner is determined by comparing either kcat/KM, ka, or kd for all partners
KI: inhibition constant, used when an inhibitor competes with a ligand
Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: structurally similar motifs rather than sequence motifs
Promiscuity: the ability to participate in a function other than the native one
Allostery: regulation of a protein by binding of some ligand (the effector)
Evaluating Enzyme Efficiency
kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme
For an enzyme acting simultaneously on two substrates SA SB at rates vA vB
vA
vB=
kAcatK A
M [SA]
kBcatK B
M [SB](14)
At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40
Goal 3 Protein Binding Affinity and Specificity
Proteins can bind to different partners
Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate
Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc
Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40
31 Protein Binding Affinity
Dissociation constant
A + Bk1
kminus1
AB (15)
d [AB]
dt= k1[A][B]minus kminus1[AB] (16)
In equilibrium
0 = k1[A][B]minus kminus1[AB] (17)
kd =kminus1
k1=
[A][B]
[AB](18)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40
31 Protein Binding Affinity
Affinity constant
ka =1kd
(19)
In antibodies
Ab + Agkforward
kback
AbAg (20)
Binding free energy
∆G = minusRT ln ka = minusRT ln1kd
(21)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (KM )
Transition-state binding (Ktx )
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiple mutations is recombination
Circular permutation to alter protein topology
DNA shuffling to perform functional domain or motif shuffling in vitro
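Both operations can be pictured as simple string manipulations. The sketch below is only a conceptual illustration: circular_permutation joins the original termini (optionally through a linker) and opens the chain at a new position, and shuffle_parents builds a chimera by switching between pre-aligned, equal-length parents at given crossover points. Real DNA shuffling relies on fragmentation and reassembly PCR, which is not modeled here, and both function names are hypothetical.

```python
def circular_permutation(sequence, new_start, linker=""):
    """Return the circular permutant whose new N-terminus is at new_start.

    The original C- and N-termini are joined through the optional linker.
    """
    if not 0 <= new_start < len(sequence):
        raise ValueError("new_start must be a valid index in the sequence")
    return sequence[new_start:] + linker + sequence[:new_start]


def shuffle_parents(parents, crossovers):
    """Build a chimera by alternating between aligned parents.

    parents: sequences of equal length, assumed to be already aligned.
    crossovers: positions at which the chimera switches to the next parent.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must have equal length")
    bounds = [0] + sorted(crossovers) + [length]
    pieces = []
    for i, (start, end) in enumerate(zip(bounds, bounds[1:])):
        pieces.append(parents[i % len(parents)][start:end])
    return "".join(pieces)


print(circular_permutation("ABCDEF", 2))              # CDEFAB
print(circular_permutation("ABCDEF", 2, linker="GG"))  # CDEFGGAB
print(shuffle_parents(["AAAAAA", "BBBBBB"], [2, 4]))   # AABBAA
```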
Recombinant Protein Folding
E. coli is typically the first choice for expressing a heterologous protein
However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli
Some misfolding-related issues:
  Multidomain proteins usually require the assistance of folding modulators such as chaperones and/or foldases
  The environment (crowding, pH, osmolarity, etc.)
  Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded protein:
  Insoluble aggregation into inclusion bodies
  Degradation (proteolysis)
[Figure: E. coli expressing human leptin as an inclusion body]
Directed Evolution
A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities
An iterative process:
  Identifying a good starting sequence, usually containing some level of latent promiscuity
  Creating a library of variants
  Selecting variants with improved function (mutation and screening)
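The iterative process above can be sketched as a simple loop. This is a toy model under strong assumptions: the fitness function scores similarity to a hypothetical target sequence, whereas a real campaign measures activity experimentally, and each variant carries a single random substitution. All names, sequences, and parameters are illustrative.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fitness(variant, target):
    """Toy screen: number of positions matching a hypothetical target."""
    return sum(a == b for a, b in zip(variant, target))


def mutate(sequence, rng):
    """Introduce one random substitution at a random position."""
    pos = rng.randrange(len(sequence))
    return sequence[:pos] + rng.choice(AMINO_ACIDS) + sequence[pos + 1:]


def directed_evolution(parent, target, rounds, library_size, seed=0):
    """Run rounds of library creation followed by selection of the best variant."""
    rng = random.Random(seed)
    best = parent
    history = [fitness(best, target)]
    for _ in range(rounds):
        # Library creation: random single mutants of the current best variant.
        library = [mutate(best, rng) for _ in range(library_size)]
        # Screening: keep the top variant if it is at least as good as its parent.
        candidate = max(library, key=lambda v: fitness(v, target))
        if fitness(candidate, target) >= fitness(best, target):
            best = candidate
        history.append(fitness(best, target))
    return best, history


# Hypothetical starting and target sequences, for illustration only.
best, history = directed_evolution("MKTAYIAK", "MKVLYLAR", rounds=20, library_size=50)
print(best, history)
```

Because a variant is accepted only when it scores at least as well as its parent, the recorded fitness never decreases from one round to the next.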
From Natural Enzymes to Protein Engineering to Computational Protein Design
Bibliography I
David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature chemical biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232
Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013
Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174
D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179
Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359
James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 09692126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007
Goal 3: Protein Binding Affinity and Specificity
Proteins can bind to different partners:
Protein-ligand binding: interaction with a small molecule, as in drug-target or enzyme-substrate binding
Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
Protein-protein interaction
  Permanent or obligate: in multi-subunit proteins; it can have a structural or functional role
  Transient: in signaling, transport, and regulation
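3.1 Protein Binding Affinity
Dissociation constant
\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]
\[ \frac{d[AB]}{dt} = k_1 [A][B] - k_{-1} [AB] \tag{16} \]
In equilibrium
\[ 0 = k_1 [A][B] - k_{-1} [AB] \tag{17} \]
\[ k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]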
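The step from the rate equation (16) to the equilibrium expression (18) can be checked numerically. The sketch below is a minimal illustration, assuming a simple forward-Euler integration and made-up rate constants and starting concentrations: it integrates equation (16) until the concentrations stop changing and then compares [A][B]/[AB] with k_{-1}/k_1. The function name simulate_binding and all numeric values are hypothetical.

```python
def simulate_binding(k1, k_minus1, a0, b0, dt=1e-3, steps=10_000):
    """Integrate d[AB]/dt = k1[A][B] - k_-1[AB] with forward Euler.

    Concentrations are in M, k1 in M^-1 s^-1, k_minus1 in s^-1, dt in s.
    Returns the final concentrations of free A, free B, and the complex AB.
    """
    a, b, ab = a0, b0, 0.0
    for _ in range(steps):
        d_ab = (k1 * a * b - k_minus1 * ab) * dt
        a -= d_ab   # each complex formed consumes one A ...
        b -= d_ab   # ... and one B
        ab += d_ab
    return a, b, ab


# Hypothetical rate constants and starting concentrations.
k1 = 1e6        # association rate constant, M^-1 s^-1
k_minus1 = 1.0  # dissociation rate constant, s^-1
a, b, ab = simulate_binding(k1, k_minus1, a0=2e-6, b0=2e-6)

kd_from_rates = k_minus1 / k1
kd_from_concentrations = a * b / ab
print(kd_from_rates, kd_from_concentrations)
```

With these values the system relaxes to 1 µM each of free A, free B, and complex, so both estimates of k_d agree at about 1 µM.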
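3.1 Protein Binding Affinity
Affinity constant
\[ k_a = \frac{1}{k_d} \tag{19} \]
In antibodies
\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]
Binding free energy
\[ \Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21} \]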
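Equations (18), (19), and (21) can be chained into a small calculation. The sketch below assumes that k_d is expressed in M relative to a 1 M standard state, so that the logarithm is dimensionless, and uses hypothetical rate constants of the order often quoted for tight binders; none of the numbers come from the slides.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def dissociation_constant(k_on, k_off):
    """k_d = k_-1 / k_1, in M (k_on in M^-1 s^-1, k_off in s^-1)."""
    return k_off / k_on


def affinity_constant(kd):
    """k_a = 1 / k_d, in M^-1."""
    return 1.0 / kd


def binding_free_energy(kd, temperature=298.15):
    """Delta G = -RT ln(1/k_d), in kJ/mol, with k_d relative to 1 M."""
    return -R * temperature * math.log(1.0 / kd) / 1000.0


# Hypothetical rate constants, for illustration only.
kd = dissociation_constant(k_on=1e6, k_off=1e-3)
ka = affinity_constant(kd)
dg = binding_free_energy(kd)
print(f"k_d = {kd:.1e} M, k_a = {ka:.1e} M^-1, dG = {dg:.1f} kJ/mol")
```

A tighter binder has a smaller k_d and therefore a more negative ΔG; each tenfold decrease in k_d changes ΔG by about −5.7 kJ/mol at 25 °C.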
Simplified Thermodynamics of an Enzymatic Reaction
[Jonas and Hollfelder in Protein Engineering Handbook (2009)]
Ground-state binding (K_M)
Transition-state binding (K_tx)
3.2 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation
32 Protein Binding Specificity
These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation
Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners
KI inhibition constant When an inhibitor competes with a ligand
Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands
Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs
Promiscuity the ability to participate n a function other than the native one
Allostery regulation of a protein by binding of some ligand (the effector)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Introducing the Substitutions
Site-directed (saturation) mutagenesis
1. The DNA of interest is cloned into a plasmid vector.
2. The plasmid DNA is denatured to produce single strands.
3. A synthetic oligonucleotide carrying the desired mutation (point mutation, deletion, or insertion) is annealed to the target region.
4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template.
5. The heteroduplex is propagated by transformation in E. coli.
Error-prone PCR
Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase.
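For saturation mutagenesis, the oligonucleotide in step 3 often carries a degenerate codon rather than a single fixed one. The sketch below enumerates the NNK codon (N = any base, K = G or T), a common choice that the slides do not name and that is assumed here for illustration. It uses the standard genetic code to show which amino acids such a library can encode.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate base symbols used in this example.
DEGENERATE = {"N": "ACGT", "K": "GT"}


def expand_degenerate_codon(pattern):
    """List every concrete codon matched by a degenerate pattern such as 'NNK'."""
    choices = (DEGENERATE.get(base, base) for base in pattern)
    return ["".join(codon) for codon in product(*choices)]


nnk_codons = expand_degenerate_codon("NNK")
encoded = [CODON_TABLE[codon] for codon in nnk_codons]
amino_acids = sorted(set(encoded) - {"*"})
stop_count = encoded.count("*")
print(len(nnk_codons), "codons,", len(amino_acids), "amino acids,",
      stop_count, "stop codon(s)")
```

Restricting the third position to G or T halves the library relative to NNN while still covering every amino acid, which reduces the screening effort per randomized position.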
Recombination and DNA-shuffling
A natural approach to making multiple mutations is recombination.
Circular permutation to alter protein topology.
DNA shuffling to perform functional domain or motif shuffling in vitro.
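Both operations above can be sketched at the sequence level. The code below is a deliberately simplified model. Circular permutation is treated as joining the original termini (optionally through a linker) and opening the chain at a new position. Shuffling is treated as alternating between equal-length parents at fixed crossover positions, whereas real DNA shuffling relies on random fragmentation and homology-driven reassembly. All function names and sequences are hypothetical.

```python
def circular_permutation(sequence, new_start, linker=""):
    """Reopen the chain at new_start after joining the old termini with a linker."""
    if not 0 <= new_start < len(sequence):
        raise ValueError("new_start must be a valid index into sequence")
    return sequence[new_start:] + linker + sequence[:new_start]


def shuffle_parents(parents, crossovers):
    """Build a chimera by switching to the next parent at each crossover position."""
    length = len(parents[0])
    if any(len(parent) != length for parent in parents):
        raise ValueError("this sketch assumes aligned parents of equal length")
    boundaries = [0, *sorted(crossovers), length]
    blocks = []
    for i, (start, end) in enumerate(zip(boundaries, boundaries[1:])):
        # Take each block from the parents in round-robin order.
        blocks.append(parents[i % len(parents)][start:end])
    return "".join(blocks)


permuted = circular_permutation("ABCDEFGH", 3, linker="gg")
chimera = shuffle_parents(["AAAAAAAA", "BBBBBBBB"], crossovers=[2, 5])
print(permuted)  # DEFGHggABC
print(chimera)   # AABBBAAA
```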
Recombinant Protein Folding
E. coli is typically the first choice for expressing a heterologous protein.
However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.
Some misfolding-related issues:
Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases.
The environment (crowding, pH, osmolarity, etc.).
Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments).
Two possible outcomes for a misfolded protein:
Insoluble aggregation into inclusion bodies.
Degradation by proteolysis.
[Figure: E. coli expressing human leptin as inclusion bodies]
Directed Evolution
A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold.
Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.
An iterative process:
Identifying a good starting sequence, usually one with some level of latent promiscuity.
Creating a library of variants.
Selecting variants with improved function (mutation and screening).
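The iterative process above can be sketched as a small simulation. This is a toy model under stated assumptions: the "screen" is a made-up fitness function that counts matches to an arbitrary target sequence, standing in for an experimental assay, and random mutagenesis is modeled as independent per-residue substitutions. The names, sequences, and parameters are all hypothetical.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def mutate(sequence, rate, rng):
    """Random mutagenesis: each residue is redrawn with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else residue
        for residue in sequence
    )


def toy_fitness(sequence, target):
    """Stand-in for a screening assay: number of positions matching the target."""
    return sum(a == b for a, b in zip(sequence, target))


def directed_evolution(parent, target, rounds=10, library_size=200,
                       rate=0.05, seed=0):
    """Repeat library creation and screening, keeping the best variant each round."""
    rng = random.Random(seed)
    best = parent
    history = [toy_fitness(best, target)]
    for _ in range(rounds):
        library = [mutate(best, rate, rng) for _ in range(library_size)]
        library.append(best)  # retain the parent so fitness never decreases
        best = max(library, key=lambda seq: toy_fitness(seq, target))
        history.append(toy_fitness(best, target))
    return best, history


best, history = directed_evolution("MKTAYIAKQR", "MKSAYLAKQW")
print(best, history)
```

Keeping the parent in each library is a design choice that makes the recorded fitness non-decreasing; in a real campaign, screening noise and library bias would complicate this picture.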
From Natural Enzymes to Protein Engineering to Computational Protein Design
Bibliography I
David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232
Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013
Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174
D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179
Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359
James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 09692126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Thermodynamics of a Reaction with 2 Competing Substrates
[Desari and Miller in Protein Engineering Handbook (2009)]
Specificity reflects differences in the absolute heights of the transition states
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Introducing the Substitutions
Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point
mutation deletion or insertion) is annealed to the targetregion
4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template
5 The heteroduplex is propagated by transformation in Ecoli
Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Outline
1 The Protein Design Cycle
2 Locating the Substitutions
3 Types of Protein Interactions
4 Engineering Protein Activity
5 Introducing the Substitutions
6 Screening and Library Creation
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Recombination and DNA-shuffling
A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology
DNA-shuffling to perform functionaldomain or motif shuffling in vitro
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Recombinant Protein Folding
E coli is a typically first choice for expressing a heterologous protein
However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli
Some misfolding-related issues
Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases
The environment (crowding pH osmolarity etc)
Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)
Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis
E coli expressing human leptin as
inclusion body
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40
From Natural Enzymes to Protein Engineeringto Computational Protein Design
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40
Computational Protein Design1 Challenges in Protein Engineering
Pablo Carbonellpablocarbonellissbgenopolefr
iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France
mSSB December 2010
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40
Bibliography I
David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232
Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013
Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174
D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]
Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359
James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007
Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40
Directed Evolution
A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold
Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities
An iterative process
Identifying a good starting sequence usually containing some level of latentpromiscuity
Creation of a library of variants
Selecting variants with improved function (mutation and screening)
From Natural Enzymes to Protein Engineering to Computational Protein Design