Symmetry in Protein Design - Hampton Research · solution to the protein crystallization problem...
Symmetry Ideas in Protein Assembly
natural designed
1000 Å 100 Å
accidental
RAMC 2013
Giant Biological Protein Assemblies –
Bacterial Microcompartments
Virus-sized protein capsids inside many
bacteria, encapsulating series of
enzymes and functioning as simple
metabolic organelles
Structural studies illuminate key
assembly and molecular transport
mechanisms
Kerfeld, et al. (2005) Science 309, 936-8; Tanaka, et al. (2008) Science 319, 1083-6; Tanaka, et al. (2010) Science 327, 81-4;
Yeates, Thompson, & Bobik (2011) Curr. Opin. Struct. Biol. 21, 223.
One of the most fascinating (but
mostly overlooked) puzzles in
structural biology:
Proteins show a striking
preference to self-assemble in
certain particular symmetries.
Of the 65 possible 3D space
groups, only a handful are
commonly obtained; one is
dominant.
Top 1: ~33%
Bottom 55: 20%
‘Accidental’ assemblies:
The space group preference problem
The differences in probability span more than 2 orders of
magnitude, yet there are no obvious energetic explanations.
Yeates and Kent (2012). Annu. Rev. Biophys.
• Is there a statistical rather than energetic explanation?
Are some space groups simply easier to achieve?
• How many different ‘ways’ can a protein molecule form
crystals in a given space group?
• Given the continuous range of possible molecular
orientations and positions, the number of distinct
crystalline arrangements is evidently infinite.
…
• There are different kinds of
infinities.
• Suppose each possible
configuration (i.e. orientations
and positions) of a set of
molecules could be described
as a point within some (high
dimensional) space.
• What would the ‘solution space’
look like for each space group
within this high dimensional
space?
6N dimensional space to
describe all possible
orientations and
positions of N molecules
Some points in the space
must correspond to space
group P1.
Some points will represent
space group P2, etc.
Most points will not represent
crystalline arrangements
A hypothetical 1-D solution space
An infinite number of solutions falling on a 1-D curve,
differing from each other by a change (forwards or back) in a single direction
A hypothetical 2-D solution space
An infinite number of solutions falling on a 2-D subspace,
differing from each other by a change (forwards or back) in a combination of 2 directions
Protein Crystal Space Group Preferences: Degrees of Freedom Theory for Characterizing the
Dimensionality of Different Space Groups
Wukovitz and Yeates (1995). Why protein crystals favour some space-groups
over others. Nat Struct Biol. 2, 1062-7.
Examining a case (p2mm) where the answer is (sort of) obvious
Free to change orientation of molecule.
But the rest of the crystal is fully defined thereafter.
Therefore, D=1 for p2mm
Degrees of freedom for p2mm
Counting free variables and constraints
Rot & Trans (S)
rotation yes
trans x yes
trans y yes
Unit Cell (L)
a axis yes
b axis yes
gamma -- (fixed at 90°)
Contacts (req.) 4
D = S + L – C
= 3 + 2 – 4 = 1
Degrees of freedom for p1
Rot & Trans (S)
rotation yes
trans x no
trans y no
Unit Cell (L)
a axis yes
b axis yes
gamma yes
Contacts (req.) 2
D = S + L – C
= 1 + 3 – 2 = 2
Degrees of freedom for p2
Rot & Trans (S)
rotation yes
trans x yes
trans y yes
Unit Cell (L)
a axis yes
b axis yes
gamma yes
Contacts (req.) 3
D = S + L – C
= 3 + 3 – 3 = 3
The Minimum Contact Number, C
A mathematical description of the constraints implied by molecular connectivity
C is a property of the mathematical group, not the molecule
Wukovitz & Yeates, Nat. Struct. Biol. 2, 1062 (1995)
The minimum contact number, C,
dictates (in part) the number of
degrees of freedom, D, available for
constructing any given symmetrical
arrangement.
D can be calculated for the possible
65 space group symmetries.
Symmetries with more degrees of
freedom are expected to occur
more frequently for statistical
(rather than energetic) reasons.
Wukovitz & Yeates, Nat. Struct. Biol. 2, 1062 (1995)
D = S + L – C
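The counting rule D = S + L – C can be checked with a few lines of Python. This is a minimal sketch: the function name and dictionary layout are illustrative assumptions, and the (S, L, C) values are simply the ones quoted on the slides for p2mm, p1, p2 and P1(bar).

```python
def degrees_of_freedom(s: int, l: int, c: int) -> int:
    """D = S + L - C: free rigid-body variables plus free unit cell
    variables, minus the minimum number of required contacts."""
    return s + l - c


# (S, L, C) as quoted on the slides; names are illustrative only.
slide_values = {
    "p2mm": (3, 2, 4),
    "p1": (1, 3, 2),
    "p2": (3, 3, 3),
    "P1(bar)": (6, 6, 4),
}

results = {name: degrees_of_freedom(*slc) for name, slc in slide_values.items()}

for name, d in results.items():
    print(f"{name}: D = {d}")
```

The plane-group examples come out as D = 1, 2 and 3, and the racemic space group P1(bar) as D = 8, matching the values worked out by hand on the slides.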
The Minimum Contact Number, C
Implications for the space group preference problem
Agreement between the dimensionality for forming different space groups and their observed frequencies
• The 65 possible space group symmetries fall into 4 categories of increasing likelihood: D = 4, 5, 6, 7 (factor of ~8 for each increment in D)
• Only one space group, P2₁2₁2₁, which dominates in macromolecular crystals, has D=7 !!
• A dimensionality analysis explains most of the observed phenomenon.
[Figure panel labels: D=7, D=6, D=5, D=4]
Various ideas emerging from the
minimum contact number work
1. A surprise in the achiral space
groups!! – revolution or curiosity?
2. Ability to form intermolecular
contacts consistent with
crystallographic symmetry limits
crystallization. Leads to new
approaches for crystallizing proteins.
3. An astonishing range of
architectures and symmetries can be
generated using only two contact
types (e.g. symmetric points of
protein-protein interaction). Leads to
a general approach for designing
protein assemblies.
So what?
1. A Surprising Discovery
• 65 ‘biological’ space groups
D=7 (1 space group, P2₁2₁2₁)
D=6 (13 space groups)
D=5 (42 space groups)
D=4 (9 space groups)
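As a rough numerical illustration of the dimensionality argument, the sketch below assumes that every space group is weighted by 8^D (one reading of the "factor of ~8 for each increment in D" statement) and uses the category counts listed above. The weighting model and variable names are assumptions made for illustration, not the published analysis.

```python
# Number of chiral ("biological") space groups in each D category, from the slide.
groups_per_d = {7: 1, 6: 13, 5: 42, 4: 9}

# Assumed relative likelihood gained per extra degree of freedom (~8 per the slides).
FACTOR_PER_D = 8

# Weight of a whole category = (number of groups) x (per-group weight).
category_weight = {d: n * FACTOR_PER_D ** (d - 4) for d, n in groups_per_d.items()}
total_weight = sum(category_weight.values())
category_share = {d: w / total_weight for d, w in category_weight.items()}

for d in sorted(category_share, reverse=True):
    print(f"D={d}: {groups_per_d[d]:2d} space groups, predicted share {category_share[d]:.1%}")
```

Under these assumptions the single D=7 space group, P2₁2₁2₁, takes roughly 30% of the total, which is in the same range as the ~33% quoted for the top space group.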
International
Tables for
Crystallography
• 165 ‘non-biological’ space groups
D=8 !!! (1 space group, P1(bar))
D=7 (2 space groups: P2₁/c, C2/c)
D = 6, 5, 4, … (162 space groups)
• ‘Non-biological’ space groups require racemic protein mixtures
(i.e. mirror image proteins synthesized from D-amino acids).
• This opens up a powerful approach for protein crystallization.
Why P1(bar) is the super-winner
• Inversion centers give a
well-defined origin, so
S=6 (3 rotations and 3
translations)
• Unit cell is triclinic, so 6
variable choices, L=6
• 4 unique contacts
required for connectivity;
one to connect the L and
D molecules, and three
for translational
connectivity along three
directions, C=4
D = S + L – C = 6 + 6 - 4 = 8
Mirror image proteins provide a potentially powerful
solution to the protein crystallization problem
Predictions from theory
• Proteins will crystallize much
more easily if they can be
prepared as a racemic mixture;
this requires chemical synthesis
of the mirror image protein (i.e.
from D-amino acids)
• P1(bar) will dominate for
racemic crystallization of
proteins; this highly specific
prediction provides a powerful
test of the theoretical ideas
Yeates and Kent (2012). Annu. Rev. Biophys.
Racemic ‘macromolecule’ crystal data available by the late ’90s
‘macromolecule’              space group
rubredoxin                   P1(bar)
leu-enkephalin               P1(bar)
d(CGCGCG)                    P1(bar)
trichogin A                  P1(bar)
α-1 (designed peptide)       P1(bar)
monellin (sweet protein)     P1(bar)*
Racemic Protein
Crystals based on
new synthetic
methods:
Native chemical
ligation - Stephen
Kent and
Collaborators
Dawson PE, Kent SB.
Annu. Rev. Biochem.
2000;69:923-60.
• Banigan JR, Mandal K, Sawaya MR, Thammavongsa V, Hendrickx AP, et al. 2010. Protein Sci 19: 1840-49.
• Pentelute BL, Mandal K, Gates ZP, Sawaya MR, Yeates TO, Kent SB. Chem Commun (Camb.) 46: 8174-6.
• Sawaya, et al. Single Wavelength Phasing Strategy for Quasi-Racemic Protein Crystal Diffraction Data. Acta Cryst D.
• Yeates and Kent (2012). Annu. Rev. Biophys.
Current statistics for racemic protein crystals
• Amazingly good agreement
with predictions
• A few cases where the
racemate crystallized easily
and the single enantiomer
did not, but data still
somewhat anecdotal
• No obvious trend in
resolution improvement
• Methods have not been
tested systematically in any
situation where the single
enantiomer gave only limited
resolution
N.B. Essentially no tendency observed so far
to resolve into chiral space groups
• Yeates and Kent (2012). Annu. Rev. Biophys.
Phasing Considerations for Racemic
Crystallography
Centrosymmetric Space Groups
Acentric vs. Centric Phases
N.B. The space groups predicted to be preferred by
racemic protein crystals (P1(bar), P2₁/c, and C2/c) are all
centrosymmetric.
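Why centrosymmetry matters for phasing can be seen in a small numerical sketch. It assumes a made-up "molecule" of 20 point atoms with arbitrary scattering weights, plus an inverted copy at -x to mimic the L and D molecules related by an inversion center at the origin. All names and values are hypothetical; only the standard structure factor sum is used.

```python
import cmath
import random


def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j * exp(2*pi*i * (h*x_j + k*y_j + l*z_j)),
    with atoms given as (f, (x, y, z)) in fractional coordinates."""
    h, k, l = hkl
    return sum(
        f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
        for f, (x, y, z) in atoms
    )


random.seed(0)
# Hypothetical L molecule: random positions and scattering weights.
l_molecule = [
    (random.uniform(1.0, 8.0), (random.random(), random.random(), random.random()))
    for _ in range(20)
]
# Its mirror-image partner, generated by inversion through the origin.
d_molecule = [(f, (-x, -y, -z)) for f, (x, y, z) in l_molecule]

reflections = [(1, 0, 0), (2, -1, 3), (0, 4, 1), (5, 2, -2)]
chiral_f = [structure_factor(hkl, l_molecule) for hkl in reflections]
racemic_f = [structure_factor(hkl, l_molecule + d_molecule) for hkl in reflections]

for hkl, fc, fr in zip(reflections, chiral_f, racemic_f):
    print(hkl, f"chiral phase {cmath.phase(fc):+.2f} rad,",
          f"racemic imaginary part {fr.imag:.1e}")
```

For the chiral cell the phases take arbitrary values, while for the centrosymmetric pair the imaginary parts cancel, so each phase is restricted to 0 or π. The phase problem then reduces to choosing a sign for each reflection.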
Implications of Racemic Protein
Crystallography
• Crystallization of racemic proteins
might be 3 or 4 times easier than
crystallizing ordinary (chiral)
proteins
• A potentially powerful solution to
the major bottleneck in
macromolecular crystallography,
especially if size and cost barriers
to synthesis can be lowered
• Potentially interesting phasing
strategies: e.g., single wavelength,
in-house, iodo-tyrosine quasi-
racemic.
2. Making Proteins (and Nucleic Acids) More
Amenable to Crystallization: ‘Synthetic
Symmetrization’
Symmetry can be built into an
otherwise asymmetric molecule
(e.g. engineered cysteines).
Two key features:
• Symmetry improves
crystallization chances
somewhat (~50%) based on
database analysis
• A given macromolecule can be
dimerized in multiple different
ways, giving rise to multiple
entirely distinct chances for
crystallization
Successful results using
disulfide-based synthetic
symmetrization in a
model protein system (T4
lysozyme)
Banatao, et al. (2006). PNAS 103, 16230-5.
6 new crystal forms of lysozyme
Successful results using disulfide-based synthetic
symmetrization to crystallize a new protein
Forse, et al. (2011) Prot. Sci. 20, 168-78
Cel A endoglucanase from T.
maritima
Successful results
using metal-based
synthetic
symmetrization in a
model protein system
(T4 lysozyme)
Laganowsky, Zhao, Soriaga, et al.
(2011) Prot. Sci. 20, 1876-90.
Successful results
using metal-based
synthetic
symmetrization in a
model protein system
(MBP)
Laganowsky, Zhao, Soriaga, et al.
(2011) Prot. Sci. 20, 1876-90.
[Figure labels: Target (N and C termini) attached to split GFP; GFP 1-9; strands S10 and S11; permissive loop]
A Facile Strategy for Applying Synthetic
Symmetrization:
Factoring out the protein engineering component
Part 1:
method of
attachment to a
carrier protein,
split GFP
Terminal fusion, or
loop insertion,
which gives two
chain crossings!
Hau, et al. (in press)
Part 2. Synthetic symmetrization of the carrier protein, GFP
[Figure labels: 10-11 hairpin; sites K26, D102, D117, Q157, D173, D190; K26C and D190C, monomer and dimer]
Sites for single cysteine insertion (into cysteine free background)
Surface exposed charged residues selected.
- opposite face to the 10-11 hairpin
- ends of the β-strands or in loops to allow the disulfide to form
- mutations made in a ‘Cys-less’ GFP backbone
Purification summary:
- Ni2+ IMAC in non-reducing conditions
- disulfide formation with CuSO4 at pH 9.0
- ion-exchange to separate species
Crystallizability of Synthetically Dimerized GFPs
Mutant  Cloned  Dimers  Crystals  Xtal Conditions#  Unique Structures  Space Groups                          Resolution
K26C    ✔       ✔       ✔         ~5                2                  P3₂21, P2₁2₁2₁                        1.9-3.2 Å
D102C   ✔       ✔       ✔         30+               4                  P1, P2₁2₁2₁                           3.1-3.6 Å
D117C   ✔       ✔       ✔         10+               5                  P6₃, P6₄22, P3₁21, P4₁22, I4₁22       1.7-2.9 Å
D173C   ✗       ✗       ✗         ✗                 ✗                  ✗
Q157C   ✔       ✔       ✔         2                 ✗                  ✗
D190C   ✔       ✔       ✔         ~10               2                  P2₁2₁2₁, P6₁                          2.7-3.1 Å
Total                             ~60+              13
# Non-duplicate conditions with crystals from Mosquito trays
All dimers were screened for crystals in 5 commercial screens each (PACT, JCSG+, SaltRX, Wizard, CS 1+2); each mutant tends to crystallize in different conditions.
D102C readily forms plate crystals in conditions containing PEGs; the solved structures are in P1 with unique arrangements of dimers in the asymmetric unit.
Diverse Arrangements of Dimerized GFPs
K26C, 1.9Å, P2₁2₁2₁
K26C, 3.2Å, P3₂21
D102C, 3.1-3.6Å, P1
D102C, 3.1Å, P2₁2₁2₁
D117C, 1.7-2.9Å, all crystals
D190C, 2.65-3.1Å, both crystals
Six distinct arrangements of the GFP dimer have been demonstrated
(refined) so far, and several more are in process. These will constitute a
suite of independent partners for crystallization.
Combining just two contact types can give rise to
complex architectures
4-fold contact
2-fold contact
3. Extension of the Symmetric Contact Idea to
a Strategy for Designing Self-Assembling
Protein Materials
• Natural oligomeric (e.g.
dimeric and trimeric)
proteins can serve as the
building blocks
• Fusing two such proteins
together (e.g. by genetic
engineering) provides the
two interactions needed for a
rich variety of designs
Natural protein dimers
and trimers – building
blocks for designed self-
assembly
A General Method
for Designing
Self-Assembling
Protein Materials
Padilla, Colovos,
& Yeates, PNAS,
98, 2217 (2001)
• Fusion of two simple
oligomers (e.g. dimer
+ trimer)
• Outcome dictated by
geometry of axes
• Use of a continuous
α-helix to dictate
geometry
Design Rules (dimers and trimers)
Symmetry          Construction        Geometry of symmetry elements
Cages and shells
T                 Dimer-Trimer        54.7°, Intersecting
O                 Dimer-Trimer        35.3°, Intersecting
I                 Dimer-Trimer        20.9°, Intersecting
Double-layer rings
Dn                Dimer-Dimer         180°/n, Intersecting
Two-dimensional layers
p6                Dimer-Trimer        0°, Non-intersecting
p321              Dimer-Trimer        90°, Non-intersecting
p3                Trimer-Trimer       0°, Non-intersecting
Three-dimensional crystals
I2₁3              Dimer-Trimer        54.7°, Non-intersecting
P4₁32 or P4₃32    Dimer-Trimer        35.3°, Non-intersecting
P23               Trimer-Trimer       70.5°, Non-intersecting
Helical filaments
Helical           Dimer-Dimer         any angle, Non-intersecting
Tubes of indefinite length
Tubular           Dimer-Dimer-Dimer   N, N, N, each intersecting the cylinder axis perpendicularly
[Figure panels: cages; 2-D layers; 3-D crystals; filaments and rods]
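The angles in the design rules follow from the directions of the symmetry axes in the cubic and icosahedral point groups. The sketch below reproduces them, assuming the conventional axis settings (2-folds along [001] or [110], 3-folds along [111], and an icosahedral 3-fold along [0, 1/φ, φ] with φ the golden ratio). The function and dictionary names are illustrative.

```python
import math


def axis_angle_deg(u, v):
    """Acute angle in degrees between two symmetry axes (treated as undirected lines)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(abs(dot) / norm))


PHI = (1 + math.sqrt(5)) / 2  # golden ratio, used for the icosahedral 3-fold axis

angles = {
    "T: 2-fold [001] and 3-fold [111]": axis_angle_deg((0, 0, 1), (1, 1, 1)),
    "O: 2-fold [110] and 3-fold [111]": axis_angle_deg((1, 1, 0), (1, 1, 1)),
    "I: 2-fold [001] and 3-fold [0, 1/phi, phi]": axis_angle_deg((0, 0, 1), (0, 1 / PHI, PHI)),
    "two 3-folds, [111] and [11-1]": axis_angle_deg((1, 1, 1), (1, 1, -1)),
}

for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

The four values round to 54.7°, 35.3°, 20.9° and 70.5°, the angles listed in the table for the T, O and I cages and for the trimer-trimer P23 crystal.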
Tetrahedral, T 2-fold & 3-fold
Intersecting
at 54.7°
A first, partially successful, experiment
2
3
Discrete assemblies formed, but too polymorphic
to characterize in detail (e.g. by crystallization).
Padilla, Colovos, & Yeates, PNAS, 98, 2217 (2001)
A model of the intended
assembly
• 12 subunits
• Tetrahedral symmetry
• 160 Å diameter
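The subunit count of a cage equals the number of rotations generated by its two contact symmetries. A minimal sketch, assuming exact integer rotation matrices for a 2-fold and a 3-fold axis in the standard cubic settings (names are illustrative):

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as tuples of tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def generate_group(generators):
    """All rotations reachable by repeatedly multiplying the generators."""
    identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = matmul(g, h)
            if gh not in elements:
                elements.add(gh)
                frontier.append(gh)
    return elements


TWO_FOLD_001 = ((-1, 0, 0), (0, -1, 0), (0, 0, 1))   # 180 degrees about [001]
TWO_FOLD_110 = ((0, 1, 0), (1, 0, 0), (0, 0, -1))    # 180 degrees about [110]
THREE_FOLD_111 = ((0, 0, 1), (1, 0, 0), (0, 1, 0))   # 120 degrees about [111]

tetrahedral = generate_group([TWO_FOLD_001, THREE_FOLD_111])  # axes 54.7 degrees apart
octahedral = generate_group([TWO_FOLD_110, THREE_FOLD_111])   # axes 35.3 degrees apart

print(len(tetrahedral), len(octahedral))
```

A 2-fold and a 3-fold intersecting at 54.7° close into 12 rotations, one per subunit of the tetrahedral cage. The same two contact types at 35.3° close into 24 rotations, the octahedral case.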
Designed fusion: based on database search for a dimer-trimer pair (ending in
helices) that could be fused to give the required target geometry
Trimer: bromoperoxidase
Dimer: influenza matrix protein M1
9-residue linker: KALEAQKQK
Geometric design requirement: symmetry axes intersecting at 54.7°.
Closer inspection (11 years later) suggests that two amino acid
changes could promote the desired geometry:
Lys118 was mutated to alanine to avoid clash with linker
Gln24 was mutated to valine to attract the leucine on the linker
A first atomic structure of a designed
protein cage
• 12 subunits
• Pseudo-tetrahedral symmetry
• Partially flattened (crystal packing
and weak helical linker)
Lai, Y.-T., Cascio, D. and Yeates, T.O. (2012). Science 336, 1129.
3 Å resolution
Three independent cages in two crystal forms
70 Å diameter (hypothetical) inner sphere
Lai, Y.-T., Cascio, D. and Yeates, T.O. (2012). Science 336, 1129.
Crystal structure: matches design to within 1 Å !
24 subunit protein cage (500 kDa ): a natural trimer with a dimeric
interface designed computationally according to geometric requirements
for intersection of symmetry axes.
Designed
model
King, N.P. et al. (2012). Science 336, 1171-4.
Summary
• Symmetry ideas are useful for understanding what
limits the formation of protein assemblies
• With regard to space groups, one appreciates the
origins of key patterns, and also predicts that
racemic crystallography could provide an important
long-term strategy
• Symmetry, when engineered in variational forms,
could be a powerful strategy for increasing
favorable crystallization space
• Explicitly engineering various symmetric
architectures becomes possible, with exciting
biomedical applications
Yeates Lab
Mike Thompson
Julien Jorda
Nicole Wheatley
Yen-Ting Lai
Dan McNamara
Sunny Chun
Danny Gidaniyan
David Leibly
Allan Pang
Yuxi Liu
Inna Pashkov
Neil King (former)
Duilio Cascio
Michael Sawaya
Collaborators
Stephen Kent (U. Chicago)
Thomas Bobik (Iowa State)
David Baker (Univ. Wash)
Tom Terwilliger (Los Alamos)
Geoff Waldo (Los Alamos)
Funding: NIH, NSF, DOE