Modeling the Structures of Macromolecular Assemblies
Depts. of Biopharmaceutical Sciences and Pharmaceutical Chemistry
California Institute for Quantitative Biomedical Research
University of California at San Francisco
Frank Alber
Maya Topf
Andrej Šali
http://salilab.org/
UCSF
Contents
Modeling of assemblies by optimization (5’, Andrej)
Representation for modeling, illustrated by NPC (15’, Frank)
Comparative modeling (10’, Andrej)
Combined comparative modeling and density fitting (10’, Maya)
Acknowledgments
http://salilab.org
Yeast NPC: Tari Suprapto, Julia Kipper, Wenzhu Zhang, Liesbeth Veenhoff, Sveta Dokudovskaya, Mike Rout, Brian Chait
Our group/assemblies: Frank Alber, Maya Topf, Fred Davis, Dmitry Korkin, M.S. Madhusudhan, Damien Devos, Marc A. Martí-Renom, Mike Kim, Bino John, Narayanan Eswar, Ursula Pieper, Min-yi Shen
E. coli ribosome: H. Gao, J. Sengupta, M. Valle, A. Korostelev, S. Stagg, P. Van Roey, R. Agrawal, S. Harvey, M. Chapman, J. Frank
EM: Wah Chiu, Matt Baker, Wen Jiang, Chandrajit Bajaj
Structural Genomics: Stephen Burley (SGX), John Kuriyan (UCB), NY-SGRC
NIH, NSF, Sinsheimer Foundation, A. P. Sloan Foundation, Burroughs-Wellcome Fund, Merck Genome Res. Inst., Mathers Foundation, I.T. Hirschl Foundation, The Sandler Family Foundation, SUN, IBM, Intel, Structural Genomix
Protein and assembly structure by experiment and computation
Sali, Earnest, Glaeser, Baumeister. From words to literature in structural proteomics. Nature 422, 216-225, 2003.
Determining the structures of proteins and assemblies
Use structural information from any
• source: measurement, first principles, rules,
• resolution: low or high,
to obtain the set of all models that are consistent with it.
Modeling proteins and macromolecular assemblies by satisfaction of spatial restraints
1) Representation of a system.
2) Scoring function (spatial restraints).
3) Optimization.
Sali, Earnest, Glaeser, Baumeister. Nature 422, 216-225, 2003.
There is nothing but points and restraints on them.
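The three ingredients above (representation, scoring function, optimization) can be sketched in a few lines. This is only an illustration, not the Assembler or MODELLER implementation: the harmonic form of the restraint, the plain gradient descent, and the names `score`, `optimize`, `start`, and `restraints` are all assumptions made for the example.

```python
import math

def score(points, restraints):
    """Sum of weighted squared violations of the distance restraints."""
    total = 0.0
    for i, j, target, weight in restraints:
        total += weight * (math.dist(points[i], points[j]) - target) ** 2
    return total

def optimize(points, restraints, steps=2000, rate=0.01):
    """Minimize the score by plain gradient descent; returns new coordinates."""
    pts = [list(p) for p in points]
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in pts]
        for i, j, target, weight in restraints:
            d = math.dist(pts[i], pts[j])
            if d == 0.0:
                continue  # gradient is undefined for coincident points
            coeff = 2.0 * weight * (d - target) / d
            for k in range(3):
                g = coeff * (pts[i][k] - pts[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for p, g in zip(pts, grads):
            for k in range(3):
                p[k] -= rate * g[k]
    return pts

# Representation: three points. Scoring: three distance restraints
# (i, j, target distance, weight) that describe a 3-4-5 triangle.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]
restraints = [(0, 1, 3.0, 1.0), (1, 2, 4.0, 1.0), (0, 2, 5.0, 1.0)]

final = optimize(start, restraints)
print(score(start, restraints), score(final, restraints))
```

Any rigid-body motion of the optimized points satisfies the restraints equally well, which is one reason a set of models, rather than a single model, is the natural output.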
Assembly representation for modeling (visualization)
Multi-scale representation: Resolution of subunit structures can vary largely.
Hierarchical representation: Spatial information about subunit-subunit interactions may be available at different resolutions.
Flexibility of subunits: Subunit structures may be subject to conformational changes upon assembly formation.
Points and restraints between them (distance, angle, dihedral restraints).
Software implementation
Assembler, extension of MODELLER
Provides a framework for assembly modelling:
• multi-scale hierarchical representations,
• integration of experimental input data,
• calculates models that are consistent with input data.
Resolution of restraints
[Figure: sources of spatial restraints placed along a resolution axis running from 1 Å to 300 Å: site-directed mutagenesis, cross-linking, FRET, cryo-EM/tomography, immuno-EM, yeast two-hybrid, tandem affinity purification, binding site predictions, correlated mutations, computational docking.]
Binding patches
Properties of points
Interaction centers:
• Define interaction centers at each structural level.
• Interaction centers transmit interactions between assembly subunits.
Subunit flexibility:
• Allow or restrict conformational degrees of freedom within assembly subunits.
• Design alternative internal restraint sets for different conformational states.
Modeling the Yeast Nuclear Pore Complex by satisfaction of spatial restraints
Depts. of Biopharmaceutical Sciences and Pharmaceutical Chemistry
California Institute for Quantitative Biomedical Research
University of California at San Francisco
Andrej Sali, Frank Alber, Damien Devos
Mike Rout, T. Suprapto, J. Kipper, L. Veenhoff, S. Dokudovskaya
Brian Chait, W. Zhang
The Rockefeller University, 1230 York Avenue, New York
http://salilab.org/
Integrate spatial information
• NUP localization
• NUP stoichiometry
• NUP-NUP interactions
• NUP shape
• Symmetry
• Global shape
Successful optimization
Analysis of models
Assessing the well-scoring models:
• How similar are the models to each other?
• Do the models make sense given other data?
Search for conservation of:
• Protein-protein contacts
• Structural features
Visualization of the results:
• Probability distribution of subunit positions (density maps)
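Two of the analyses listed above can be sketched directly: pairwise similarity of the well-scoring models, and the probability distribution of a subunit's position over the ensemble. The sketch assumes the models are already superposed in a common frame and that each subunit is a single point; the function names and the 10 Å voxel size are illustrative choices, not values from the talk.

```python
import math
from collections import Counter

def rmsd(model_a, model_b):
    """Root-mean-square deviation between two models with matching subunit order."""
    n = len(model_a)
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(model_a, model_b)) / n)

def pairwise_rmsd(models):
    """Symmetric matrix answering 'how similar are the models to each other?'."""
    return [[rmsd(a, b) for b in models] for a in models]

def localization_density(models, subunit, voxel=10.0):
    """Fraction of models that place the given subunit in each grid cell."""
    counts = Counter()
    for model in models:
        x, y, z = model[subunit]
        cell = (math.floor(x / voxel), math.floor(y / voxel), math.floor(z / voxel))
        counts[cell] += 1
    return {cell: c / len(models) for cell, c in counts.items()}

# Toy ensemble: three models of a two-subunit assembly (coordinates in Å).
models = [
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.0, 0.0, 3.0), (25.0, 4.0, 0.0)],
]
matrix = pairwise_rmsd(models)
density = localization_density(models, subunit=1)
print(matrix)
print(density)
```

A tight cluster in the RMSD matrix and a sharply peaked density both indicate that the restraints define the subunit positions well.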
Comparative modeling
Steps in Comparative Protein Structure Modeling
START
TARGET sequence: ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPERASFQWMNDK
1. Template Search
2. Target-Template Alignment
MSVIPKRLYGNCEQTSEEAIRIEDSPIV---TADLVCLKIDEIPERLVGE
ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPE
3. Model Building
4. Model Evaluation
OK? No: repeat the earlier steps. Yes: END
A. Šali, Curr. Opin. Biotech. 6, 437, 1995.
R. Sánchez & A. Šali, Curr. Opin. Str. Biol. 7, 206, 1997.
M. Marti et al. Ann. Rev. Biophys. Biomolec. Struct., 29, 291, 2000.
http://salilab.org/
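Target-template sequence identity is the quantity most often quoted for an alignment like the one on this slide. A minimal sketch of computing it follows; the convention used here (identical residues divided by columns in which neither sequence has a gap) is one of several in use and is an assumption, as is the function name.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    identical = sum(1 for a, b in aligned if a == b)
    return 100.0 * identical / len(aligned)

# The two alignment rows shown on the slide.
row_1 = "MSVIPKRLYGNCEQTSEEAIRIEDSPIV---TADLVCLKIDEIPERLVGE"
row_2 = "ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPE"
slide_identity = percent_identity(row_1, row_2)
print(f"{slide_identity:.1f}% identity")
```

Dividing by the length of the shorter sequence or by the full alignment length gives different numbers, so the convention should be stated whenever an identity is reported.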
Comparative modeling by satisfaction of spatial restraints: MODELLER
3D  GKITFYERGFQGHCYESDC-NLQP…
SEQ GKITFYERG---RCYESDCPNLQP…
1. Extract spatial restraints
$F(R) = \prod_i p_i(f_i / I)$
2. Satisfy spatial restraints
A. Šali & T. Blundell. J. Mol. Biol. 234, 779, 1993.
J.P. Overington & A. Šali. Prot. Sci. 3, 1582, 1994.
A. Fiser, R. Do & A. Šali, Prot. Sci., 9, 1753, 2000.
http://salilab.org/
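The slide expresses the restraints as a product of probability density functions, one per restrained feature. A product of pdfs is usually optimized through its negative logarithm, which turns it into a sum of per-restraint terms. The sketch below assumes Gaussian pdfs on scalar features such as distances; that functional form, the function names, and the numbers are assumptions chosen for illustration and are not taken from MODELLER.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Gaussian probability density for one restrained feature."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def molecular_pdf(features, restraints):
    """Product of the individual restraint pdfs evaluated at the model's features."""
    value = 1.0
    for x, (mean, sigma) in zip(features, restraints):
        value *= gaussian_pdf(x, mean, sigma)
    return value

def objective(features, restraints):
    """Negative log of the product, written as a sum to avoid numerical underflow."""
    total = 0.0
    for x, (mean, sigma) in zip(features, restraints):
        total += 0.5 * ((x - mean) / sigma) ** 2 + math.log(sigma * math.sqrt(2.0 * math.pi))
    return total

# Two hypothetical distance restraints as (mean in Å, standard deviation in Å).
example_restraints = [(4.0, 0.5), (6.0, 1.0)]
model_features = [3.9, 6.2]  # the same distances measured in a candidate model
print(molecular_pdf(model_features, example_restraints))
print(objective(model_features, example_restraints))
```

Minimizing the objective is equivalent to maximizing the product, and a model whose features sit at the restraint means attains the lowest objective value.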
Comparative modeling of the TrEMBL database
Unique sequences processed: 1,182,126
Sequences with fold assignments or models: 659,495 (56%)
9/15/03 ~4 weeks on 660 Pentium CPUs
70% of models based on <30% sequence identity to template.
On average, only a domain per protein is modeled (an “average” protein has 2.5 domains of 175 aa).
http://salilab.org/modbase
Pieper et al., Nucl. Acids Res. 2002.
Structural Genomics
Characterize most protein sequences based on related known structures.
There are ~16,000 30% seq. id. families (90%) (Vitkup et al. Nat. Struct. Biol. 8, 559, 2001).
The number of “families” is much smaller than the number of proteins.
Any one of the members of a family is fine.
Sali. Nat. Struct. Biol. 5, 1029, 1998.
Sali et al. Nat. Struct. Biol., 7, 986, 2000.
Sali. Nat. Struct. Biol. 7, 484, 2001.
Baker & Sali. Science 294, 93, 2001.
Comparative modeling and EM density fitting
E. coli ribosome
H. Gao, J. Sengupta, M. Valle, A. Korostelev, N. Eswar, S. Stagg, P. Van Roey, R. Agrawal, S. Harvey, A. Sali, M. Chapman, and J. Frank. Cell 113, 789-801, 2003.
1. Cryo-EM density map of the 70S ribosome of E. coli was determined at ~12Å.
2. Calculate comparative models of the proteins and rRNA based on T. thermophilus, H. marismortui, and D. radiodurans ribosome structures.
3. For each protein, select the best model among several alternatives based on model assessment by statistical potentials, model compactness, target-template sequence identity, quality of template structure, and fit into the EM map.
4. Obtain an assembly model by multi-rigid body real-space refinement of the fit into the EM map, by RSRef (M. Chapman).
5. Repeat 3 and 4 iteratively to get an optimized model.
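One of the selection criteria in step 3 is the fit of a model into the EM map. The sketch below scores that fit with a normalized cross-correlation between the experimental map and a density simulated from each candidate model, then picks the best candidate. The talk does not state which fit measure was used, so the cross-correlation, the flat-list map representation, and all names here are assumptions; the other criteria in step 3 are not covered.

```python
import math

def cross_correlation(map_a, map_b):
    """Normalized cross-correlation of two maps sampled on the same voxel grid."""
    if len(map_a) != len(map_b):
        raise ValueError("maps must be sampled on the same grid")
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant map carries no shape information
    return cov / math.sqrt(var_a * var_b)

def select_best_fit(em_map, candidate_maps):
    """Return the name of the candidate whose simulated density fits the map best."""
    return max(candidate_maps, key=lambda name: cross_correlation(em_map, candidate_maps[name]))

# Toy one-dimensional "maps"; real maps are 3D grids flattened in the same order.
em_map = [0.0, 1.0, 2.0, 1.0, 0.0]
candidate_maps = {
    "model_a": [0.0, 2.0, 4.0, 2.0, 0.0],
    "model_b": [2.0, 1.0, 0.0, 1.0, 2.0],
}
best = select_best_fit(em_map, candidate_maps)
print(best, cross_correlation(em_map, candidate_maps[best]))
```

Because the correlation is normalized, it is insensitive to the overall scale and offset of the density, which differ between an experimental map and one simulated from a model.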
E. coli ribosome
Fitting of comparative models into 12Å cryo-electron density map.
All 29 proteins in 30S and 29 of 36 proteins in the 50S subunit could be modeled on 22-73% seq.id. to a known structure.
The models generally cover whole folds, except for the tails.
Errors in comparative models vs. resolution
[Figure: typical errors in comparative models (sidechain packing, distortion/shifts in aligned regions, region without a template, misalignment, incorrect template, rigid body movement) shown against EM maps at 20 Å, 10 Å, and 2 Å resolution.]
Maps courtesy of Wah Chiu.
Moulding: iterative alignment, model building, model assessment
[Figure: cycle of alignment, model building, and model assessment. A plot of models per alignment against number of alignments places comparative modeling, threading, and moulding; recovered tick labels are 1, 10^4, 10^5, and 10^30.]
B. John, A. Sali. Nucl. Acids Res. 31, 3982, 2003.
Start
1. Sequence profiles of target & template
2. Generate 25 initial target-template alignments
3. Rank alignments by alignment score (top 15 parents)
4. Build all-atom models for 15 parent alignments
5. Rank models by the GA341 score
6. Best GA341 score < 0.6?
7. Apply genetic algorithm operators to parents
8. Number of child alignments > 300?
9. Select representative alignments
10. Build all-atom models for all representative alignments
11. Select best 10 models by statistical potential score (parents for next iteration)
12. Number of iterations > 25?
13. Rank models by composite score
14. Select best model
Stop
[Flowchart: each of the three decision boxes (steps 6, 8, and 12) has a branch labeled "No".]
Application to a difficult modeling case: 1BOV-1LTS (4.4% sequence identity)
[Figure: 1lts structure compared with the 1lts model.]
Initial: Cα RMSD 10.1 Å
Final: Cα RMSD 3.6 Å
Given a sequence-structure pair and an EM map: Exploring alignment (CM) and rigid body (EM) degrees of freedom
1. Generate a set of comparative models by a standard procedure.
2. Select the model that fits the EM map best.
More generally:
Use the quality of the best fit into the EM map as one of the model fitness criteria in MOULDER.
With Wah Chiu et al.
Moulding protocol
[Figure: moulding cycle extended with EM fitting: alignment, model building, model fitting, model assessment. Labels on the slide: fold assignment; initial alignment (helixhunter, sheehunter, PROF, PSIPRED); model building (MODELLER); model fitting (foldhunter, EM docking algorithm); model assessment (GA341 score, PROSA score, EM fitting score).]
Application: Rice Dwarf Virus (6.8Å EM)
Fold assignment: HELIXHUNTER and DEJAVU - 2 templates:
Upper domain (beta sheet) - bluetongue virus VP7 (1bvp)
Lower domain (helical) - rotavirus VP6 (1qhd)
Generate initial alignments: HELIXHUNTER and secondary structure prediction methods
MOULDING:
Iterative alignment - restrained by EM data
Model building with MODELLER, loop optimization
Model assessment (GA341 score and PROSA score)
Model fitting (FOLDHUNTER)
With Wah Chiu et al.
Loop optimization: EM-derived restraints + MM forcefield
Upper domain (good fit): MODELLER-derived restraints
Lower domain (to improve fit): EM-derived restraints
[Figure: best fitted structure after model fitting.]
END
Acknowledgments
http://salilab.org
Comparative Modeling: Bino John, Narayanan Eswar, Ursula Pieper, Marc A. Martí-Renom
LSD: Patsy Babbitt, Fred Cohen, Ken Dill, Tom Ferrin, John Irwin, Matt Jacobson, Tanja Kortemme, Tack Kuntz, Brian Shoichet, Chris Voigt
Analysis
Assessing the well-scoring models:
• How similar are the models to each other?
• Do the models make sense given other data?
• Using “toy” models as benchmarks.
[Figure: score plot.]
Analysis
Search for conservation of:
• Protein-protein contacts
• Structural features
[Figure: frequency plot with both axes labeled NUP type.]
E. coli ribosome
H. Gao, J. Sengupta, M. Valle, A. Korostelev, N. Eswar, S. Stagg, P. Van Roey, R. Agrawal, S. Harvey, A. Sali, M. Chapman, and J. Frank. Cell 113, 789-801, 2003.
1. …
2. Calculate comparative models of the proteins and rRNA based on T. thermophilus, H. marismortui, and D. radiodurans ribosome structures.
3. For each protein, select the best model among several alternatives based on model assessment by statistical potentials, model compactness, target-template sequence identity, quality of template structure, and fit into the EM map.
4. Obtain the final assembly model by rigid body real-space refinement of the fit into the EM map, by RSRef (M. Chapman).
5. Repeat 3 and 4.
Combining Modeller and Docking of Atomic Models into EM Density Maps (ModEM?)
Improving resolution and accuracy of molecular models determined by cryo-EM and comparative protein structure prediction
Map courtesy of Wah Chiu.
Typical errors in comparative models
[Figure: superposed MODEL, X-RAY, and TEMPLATE structures illustrating sidechain packing errors, distortion/shifts in aligned regions, regions without a template, misalignment, and an incorrect template.]
Marti-Renom et al. Annu. Rev. Biophys. Biomol. Struct. 29, 291-325, 2000.