CS790 – Bioinformatics: Protein Structure and Function

Disulfide Bonds
• Two cysteines in close proximity can form a covalent bond between their sulfur atoms.
• Also called a disulfide bridge or dicysteine bond.
• Significantly stabilizes tertiary structure.
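
One way to look for candidate disulfide bonds in a set of coordinates is to flag pairs of cysteine sulfur (SG) atoms that sit about one S-S bond length (roughly 2.05 Å) apart. The sketch below is only an illustration: the `find_disulfides` helper, the 2.5 Å cutoff, and the toy coordinates are assumptions, not part of the course material.

```python
import math
from itertools import combinations

def find_disulfides(sg_atoms, cutoff=2.5):
    """Return pairs of cysteine residue numbers whose SG atoms lie within cutoff (in Å).

    sg_atoms maps residue number -> (x, y, z) of that cysteine's SG atom.
    The cutoff is an assumed threshold a little above the ~2.05 Å S-S bond length.
    """
    pairs = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(sg_atoms.items()), 2):
        if math.dist(xyz_a, xyz_b) <= cutoff:
            pairs.append((res_a, res_b))
    return pairs

# Toy coordinates (made up): Cys 26 and Cys 84 are close enough to bond, Cys 40 is free.
toy_sg = {26: (10.0, 5.0, 3.0), 84: (11.2, 6.5, 3.6), 40: (25.0, 1.0, 9.0)}
print(find_disulfides(toy_sg))
```

A distance test alone cannot confirm a bond, so pairs found this way are best treated as candidates to inspect.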

Determining Protein Structure
• There are O(100,000) distinct proteins in the human proteome.
• 3D structures have been determined for 14,000 proteins, from all organisms.
  • Includes duplicates with different ligands bound, etc.
• Coordinates are determined by X-ray crystallography.

X-Ray Crystallography
[Figure: protein crystal, ~0.5 mm]
• The crystal is a mosaic of millions of copies of the protein.
• As much as 70% of the crystal is solvent (water)!
• May take months (and a “green thumb”) to grow.

X-Ray Diffraction
• The image is averaged over:
  • Space (many copies)
  • Time (of the diffraction experiment)

Electron Density Maps
• Resolution depends on the quality/regularity of the crystal.
• The R-factor is a measure of “leftover” electron density, i.e. how poorly the model accounts for the observed diffraction data (lower is better).
• Solvent fitting
• Refinement
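
The conventional crystallographic R-factor compares observed structure-factor amplitudes with those calculated from the model: R = Σ | |F_obs| − |F_calc| | / Σ |F_obs|, summed over reflections. A small sketch of that arithmetic follows; the function name and the four amplitudes are invented for illustration.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over all reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("need one calculated amplitude per observed amplitude")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Made-up amplitudes for four reflections.
observed = [100.0, 50.0, 25.0, 25.0]
calculated = [90.0, 55.0, 20.0, 30.0]
print(r_factor(observed, calculated))  # 25 / 200 = 0.125
```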

The Protein Data Bank
ATOM      1  N   ALA E   1      22.382  47.782 112.975  1.00 24.09      3APR 213
ATOM      2  CA  ALA E   1      22.957  47.648 111.613  1.00 22.40      3APR 214
ATOM      3  C   ALA E   1      23.572  46.251 111.545  1.00 21.32      3APR 215
ATOM      4  O   ALA E   1      23.948  45.688 112.603  1.00 21.54      3APR 216
ATOM      5  CB  ALA E   1      23.932  48.787 111.380  1.00 22.79      3APR 217
ATOM      6  N   GLY E   2      23.656  45.723 110.336  1.00 19.17      3APR 218
ATOM      7  CA  GLY E   2      24.216  44.393 110.087  1.00 17.35      3APR 219
ATOM      8  C   GLY E   2      25.653  44.308 110.579  1.00 16.49      3APR 220
ATOM      9  O   GLY E   2      26.258  45.296 110.994  1.00 15.35      3APR 221
ATOM     10  N   VAL E   3      26.213  43.110 110.521  1.00 16.21      3APR 222
ATOM     11  CA  VAL E   3      27.594  42.879 110.975  1.00 16.02      3APR 223
ATOM     12  C   VAL E   3      28.569  43.613 110.055  1.00 15.69      3APR 224
ATOM     13  O   VAL E   3      28.429  43.444 108.822  1.00 16.43      3APR 225
ATOM     14  CB  VAL E   3      27.834  41.363 110.979  1.00 16.66      3APR 226
ATOM     15  CG1 VAL E   3      29.259  41.013 111.404  1.00 17.35      3APR 227
ATOM     16  CG2 VAL E   3      26.811  40.649 111.850  1.00 17.03      3APR 228
http://www.rcsb.org/pdb/
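
ATOM records in PDB files are fixed-width, so fields are usually read by column position rather than by splitting on whitespace. The sketch below slices the standard ATOM columns (serial, atom name, residue name, chain, residue number, x, y, z, occupancy, temperature factor). The `Atom` tuple and `parse_atom_record` are illustrative names, and the sample line is the first record above rebuilt field by field so the widths are explicit.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float

def parse_atom_record(line):
    """Slice one fixed-width ATOM record (PDB columns are 1-based; slices are 0-based)."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return Atom(
        serial=int(line[6:11]),         # columns 7-11
        name=line[12:16].strip(),       # columns 13-16
        res_name=line[17:20].strip(),   # columns 18-20
        chain=line[21],                 # column 22
        res_seq=int(line[22:26]),       # columns 23-26
        x=float(line[30:38]),           # columns 31-38
        y=float(line[38:46]),           # columns 39-46
        z=float(line[46:54]),           # columns 47-54
        occupancy=float(line[54:60]),   # columns 55-60
        b_factor=float(line[60:66]),    # columns 61-66
    )

# First record from the slide, assembled from padded fields.
sample = (
    "ATOM  " + "1".rjust(5) + " " + " N  " + " " + "ALA" + " " + "E"
    + "1".rjust(4) + "    " + "22.382".rjust(8) + "47.782".rjust(8)
    + "112.975".rjust(8) + "1.00".rjust(6) + "24.09".rjust(6)
)
atom = parse_atom_record(sample)
print(atom.name, atom.res_name, atom.chain, atom.res_seq, (atom.x, atom.y, atom.z))
```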

Practical Assignment #1
• Get entry 2APR from the PDB. This is an aspartic protease structure.
• Download RasMol or RasWin and load 2APR.
• Render the molecule as sticks with CPK coloring and print the image.
• Render the molecule as either a ribbons or cartoon image, showing secondary structure.
• Rotate the molecule to show at least one beta sheet and one alpha helix. Print this image and turn it in as well.

The Protein Folding Problem
• Central question of molecular biology: “Given a particular sequence of amino acid residues (primary structure), what will the tertiary/quaternary structure of the resulting protein be?”
• Input: AAVIKYGCAL…
• Output: (φ1, ψ1), (φ2, ψ2), … = backbone conformation (no side chains yet)
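
Each backbone angle in that output is a torsion (dihedral) angle defined by four consecutive backbone atoms: φ uses C(i−1), N, CA, C and ψ uses N, CA, C, N(i+1). The sketch below computes a torsion angle from any four 3-D points in plain Python; the helper names and test points are illustrative only.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond for four points in 3-D."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Keep only the components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

# Three test geometries sharing the same first three points.
a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
print(dihedral(a, b, c, (1.0, 0.0, 1.0)))   # eclipsed (cis)
print(dihedral(a, b, c, (0.0, 1.0, 1.0)))   # quarter turn
print(dihedral(a, b, c, (-1.0, 0.0, 1.0)))  # anti (trans)
```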

Protein Folding – Biological Perspective
• Central dogma: sequence specifies structure.
• Denature – to “unfold” a protein back to a random coil configuration
  • β-mercaptoethanol – breaks disulfide bonds
  • Urea or guanidine hydrochloride – denaturant
• Anfinsen’s experiments
  • Denatured ribonuclease
  • Spontaneously refolded into enzymatically active form
• Verified for numerous proteins

Folding Intermediates
• Levinthal’s paradox – Consider a 100-residue protein. If each residue can take only 3 positions, there are 3^100 ≈ 5 × 10^47 possible conformations.
  • If it takes 10^-13 s to convert from one structure to another, exhaustive search would take 1.6 × 10^27 years!
• Folding must proceed by progressive stabilization of intermediates.
  • Molten globules – most secondary structure formed, but much less compact than the “native” conformation.
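
The Levinthal estimate is simple arithmetic and can be checked directly. The snippet below reproduces it with the slide's numbers (3 states per residue, 100 residues, 10^-13 s per conformation); the only added assumption is a 365.25-day year.

```python
conformations = 3 ** 100                 # 3 positions per residue, 100 residues
seconds = conformations * 1e-13          # 10^-13 s spent on each conformation
seconds_per_year = 365.25 * 24 * 3600    # assumed year length
years = seconds / seconds_per_year

print(f"{conformations:.2e} conformations")
print(f"{years:.2e} years")
```

Both results come out at the orders of magnitude quoted above: about 5 × 10^47 conformations and about 1.6 × 10^27 years.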

Ideas on Protein Folding
• It is believed that hydrophobic collapse is a key driving force for protein folding.
  • Hydrophobic core!
• Proteins are, in fact, only marginally stable.
  • Native state is typically only 5 to 10 kcal/mol more stable than the unfolded form.
• Many proteins help in folding.
  • Protein disulfide isomerase – catalyzes shuffling of disulfide bonds
  • Chaperones – break up aggregates and (in theory) unfold misfolded proteins

The Hydrophobic Core
• Hemoglobin A is the protein in red blood cells (erythrocytes) responsible for binding oxygen.
• The mutation E6V in the β chain places a hydrophobic Val on the surface of hemoglobin.
• The resulting “sticky patch” causes hemoglobin S to agglutinate (stick together) and form fibers, which deform the red blood cell and do not carry oxygen efficiently.
• Sickle cell anemia was the first identified molecular disease.

Sickle Cell Anemia
• Sequestering hydrophobic residues in the protein core protects proteins from hydrophobic agglutination.

Computational Protein Folding
Two key questions:
• Evaluation – how can we tell a correctly folded protein from an incorrectly folded protein?
  • H-bonds
  • Electrostatics
  • Hydrophobic exposure
  • Etc.
• Optimization – once we get an evaluation function, can we optimize it?
  • Simulated annealing
  • EC (evolutionary computation)
  • Etc.
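
To make the optimization half concrete, here is a minimal sketch of simulated annealing with the Metropolis acceptance rule and geometric cooling. It is generic rather than protein-specific: `toy_energy` is a made-up bumpy 1-D function standing in for a real evaluation function, and every parameter value (temperatures, cooling rate, step size) is an arbitrary assumption.

```python
import math
import random

def simulated_annealing(energy, neighbor, start, t_start=2.0, t_end=0.01,
                        cooling=0.95, steps_per_t=50, rng=None):
    """Minimize energy(state); return the best state seen and its energy."""
    rng = rng or random.Random()
    current, current_e = start, energy(start)
    best, best_e = current, current_e
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            candidate = neighbor(current, rng)
            candidate_e = energy(candidate)
            delta = candidate_e - current_e
            # Always accept improvements; accept worse moves with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_e = candidate, candidate_e
                if current_e < best_e:
                    best, best_e = current, current_e
        t *= cooling  # geometric cooling schedule
    return best, best_e

def toy_energy(x):
    """Stand-in evaluation function with many local minima."""
    return 0.1 * x * x + math.sin(5 * x)

def toy_neighbor(x, rng):
    """Propose a small random move."""
    return x + rng.uniform(-0.5, 0.5)

best_x, best_e = simulated_annealing(toy_energy, toy_neighbor, start=4.0,
                                     rng=random.Random(0))
print(best_x, best_e)
```

Accepting some uphill moves at high temperature is what lets the search escape local minima; as the temperature falls it behaves more and more like plain descent.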

Evaluation of Protein Folds
• Empirical potential functions
  • Residue-based: spatial relationships among residues
  • Stereochemistry-based: molecular interactions (covalent, electrostatic, etc.) with coefficients
• Ab initio potential functions
• Procheck, etc.
• Full molecular dynamics
  • Very computationally expensive

Threading: Fold Recognition
• Given:
  • Sequence: IVACIVSTEYDVMKAAR…
  • A database of molecular coordinates
• Map the sequence onto each fold
• Evaluate
  • Objective 1: improve scoring function
  • Objective 2: folding
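
The map-then-evaluate loop can be sketched with a deliberately crude toy. Everything here is hypothetical: each template is reduced to a string of buried (`B`) and exposed (`E`) positions instead of real coordinates, the scoring rule just rewards hydrophobic residues in buried positions, the hydrophobic set is an assumption, and the placement is gapless. Real threading methods use far richer potentials and alignments.

```python
HYDROPHOBIC = set("AVILMFWC")  # assumed hydrophobic residues for this toy

def thread_score(sequence, environments):
    """+1 when a residue suits its position (hydrophobic in 'B', polar in 'E'), else -1."""
    score = 0
    for residue, env in zip(sequence, environments):
        fits = (residue in HYDROPHOBIC) == (env == "B")
        score += 1 if fits else -1
    return score

def best_fold(sequence, fold_library):
    """Try every gapless placement on every template; return (score, fold name, offset)."""
    best = None
    for name, envs in fold_library.items():
        for offset in range(len(envs) - len(sequence) + 1):
            window = envs[offset:offset + len(sequence)]
            score = thread_score(sequence, window)
            if best is None or score > best[0]:
                best = (score, name, offset)
    return best

# Hypothetical templates: 'B' = buried position, 'E' = exposed position.
library = {
    "fold_1": "EEBBEEBBEEBBEEBBEEBB",
    "fold_2": "EEBBBBBBEEEEEBBBBBEEE",
}
print(best_fold("IVACIVSTEYDVMKAAR", library))
```

Here `fold_2` was constructed to fit the example sequence at offset 2 with a single mismatch, so it wins with a score of 15.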

Fold Optimization
• Simple lattice models (HP-models)
  • Two types of residues: hydrophobic and polar
  • 2-D or 3-D lattice
  • The only force is hydrophobic collapse
  • Score = number of HH contacts
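
The HP score is easy to compute for a conformation on a 2-D square lattice. The sketch below assumes the common convention that a contact is a pair of H residues on neighbouring lattice points that are not consecutive in the chain; the function name and the example conformations are illustrative.

```python
def hh_contacts(sequence, coords):
    """Count H-H pairs that are lattice neighbours but not adjacent along the chain."""
    if len(sequence) != len(coords):
        raise ValueError("one lattice point per residue is required")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (xa, ya), (xb, yb) in zip(coords, coords[1:]):
        if abs(xa - xb) + abs(ya - yb) != 1:
            raise ValueError("consecutive residues must occupy neighbouring lattice points")
    contacts = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # j = i + 1 is a chain neighbour, so skip it
            if sequence[i] == "H" and sequence[j] == "H":
                (xi, yi), (xj, yj) = coords[i], coords[j]
                if abs(xi - xj) + abs(yi - yj) == 1:
                    contacts += 1
    return contacts

folded = [(0, 0), (1, 0), (1, 1), (0, 1)]    # chain bent into a square
extended = [(0, 0), (1, 0), (2, 0), (3, 0)]  # straight chain
print(hh_contacts("HPPH", folded), hh_contacts("HPPH", extended))
```

Bending the chain brings the two terminal H residues together, so the folded conformation scores 1 and the extended one scores 0.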

Learning from Lattice Models
• The “hydrophobic zipper” effect
• Ken Dill, ~1997

Secondary Structure Prediction
• Easier than folding
  • Current algorithms can predict secondary structure with 70-80% accuracy
• Chou, P.Y. & Fasman, G.D. (1974). Biochemistry, 13, 211-222.
  • Based on frequencies of occurrence of residues in helices and sheets
• PHD – neural network based
  • Uses a multiple sequence alignment
  • Rost & Sander, Proteins, 1994, 19, 55-72
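
The frequency idea behind Chou-Fasman can be illustrated by computing a propensity for each residue type: the fraction of that residue found in a given state, divided by the overall fraction of residues in that state. The sketch below is not the published parameter set. It derives toy propensities from one short labeled sequence, treating `h`/`H` as helix and `e`/`E` as strand, and the function name is an assumption.

```python
from collections import Counter

def propensities(sequences, structures, state):
    """Propensity of each residue type for `state`: f(state | residue) / f(state)."""
    residue_total = Counter()
    residue_in_state = Counter()
    for seq, ss in zip(sequences, structures):
        for aa, label in zip(seq, ss):
            residue_total[aa] += 1
            if label.upper() == state:
                residue_in_state[aa] += 1
    in_state = sum(residue_in_state.values())
    if in_state == 0:
        raise ValueError("no residues are labeled with the requested state")
    overall = in_state / sum(residue_total.values())
    return {aa: (residue_in_state[aa] / n) / overall for aa, n in residue_total.items()}

# Toy training data: one sequence with a per-residue structure string.
seqs = ["AGVGTVPMTAYGNDIQYYGQVT"]
labels = ["----hhhHHHHHHhhh--eeEE"]
helix = propensities(seqs, labels, "H")
print(sorted(helix.items(), key=lambda item: -item[1])[:3])
```

Values above 1 mean a residue occurs in the state more often than average. With a single 22-residue example the numbers are meaningless statistically; a real table needs many known structures.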

Secondary Structure Prediction
Multiple sequence alignment:
AGVGTVPMTAYGNDIQYYGQVT…
A-VGIVPM-AYGQDIQY-GQVT…
AG-GIIP--AYGNELQ--GQVT…
AGVCTVPMTA---ELQYYG--T…

Predicted secondary structure:
AGVGTVPMTAYGNDIQYYGQVT…
----hhhHHHHHHhhh--eeEE…

A Peek at Protein Function
• Serine proteases – cleave other proteins
  • Catalytic triad: Asp, His, Ser

Three Serine Proteases
• Chymotrypsin – cleaves the peptide bond on the carboxyl side of aromatic (ring) residues: Trp, Phe, Tyr; and large hydrophobic residues: Met.
• Trypsin – cleaves after Lys (K) or Arg (R)
  • Positive charge
• Elastase – cleaves after small residues: Gly, Ala, Ser, Cys
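
These specificity rules translate directly into a simple in-silico digest. The sketch below uses only the residue lists given above and treats them as absolute, ignoring the context effects and exceptions real enzymes show; the `digest` helper is an illustrative name, and the peptide is the example sequence from the threading slide.

```python
CLEAVES_AFTER = {
    "chymotrypsin": set("WFYM"),  # Trp, Phe, Tyr, Met
    "trypsin": set("KR"),         # Lys, Arg
    "elastase": set("GASC"),      # Gly, Ala, Ser, Cys
}

def digest(sequence, protease):
    """Split a one-letter sequence after every residue the protease recognizes."""
    sites = CLEAVES_AFTER[protease]
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in sites:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # trailing piece with no cleavage site
    return fragments

peptide = "IVACIVSTEYDVMKAAR"
for enzyme in CLEAVES_AFTER:
    print(enzyme, digest(peptide, enzyme))
```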

Specificity Binding Pocket

Onward
• Apo-proteins and prosthetic groups
• Lab techniques for proteins
  • Gels
  • Crystallography (xtal)
  • Digests
• Some computational areas of interest
  • Folding
  • Docking, screening