Protein Structure in 10 Points · 2005. 10. 18. · Proline isomerization Cyclophilin catalyzes Pro...


Protein Structure in 10 Points

• Globular proteins are compact and densely packed, with only a few empty spaces (cavities)
• Protein conformation and dynamics are coded in the amino acid sequence
• A protein in its native conformation is at an energy minimum, which results in spontaneous folding
• In soluble globular proteins, hydrophobic groups are predominantly on the inside, hydrophilic on the outside of the globule
• Most backbone NH and C=O groups are involved in H-bonds to other protein atoms
• Homology of sequence (>25-30% identity) implies similarity of structure
• There are clear statistical preferences for some amino acids in some positions in some secondary structures (but secondary structure prediction from aa-sequence can be wrong…)
• Loops and turns tend to be on the surface of a globular protein
• Proteins are dynamic molecules capable of a wide range of motions
• Protein structure is dominated by secondary structure


Stability, Folding and Kinetics

Levinthal’s Paradox:

There are so many unfolded states that if the polypeptide chain had to search through them all in order to find the correct (minimum energy) folded state it would take longer than the age of the Universe
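A back-of-the-envelope version of this estimate can be sketched as follows. The chain length (100 residues), the number of conformations per residue (3) and the sampling rate (10^13 per second) are illustrative assumptions commonly used for this argument; none of them is given on the slide.

```python
# Back-of-the-envelope Levinthal estimate.
# All three inputs are illustrative assumptions, not values from the slide.
N_RESIDUES = 100            # assumed chain length
STATES_PER_RESIDUE = 3      # assumed number of conformations per residue
SAMPLES_PER_SECOND = 1e13   # assumed rate of conformational sampling

SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.4e10  # approximate


def exhaustive_search_years(n_residues, states_per_residue, samples_per_second):
    """Time in years needed to visit every conformation exactly once."""
    n_conformations = states_per_residue ** n_residues
    return n_conformations / samples_per_second / SECONDS_PER_YEAR


years = exhaustive_search_years(N_RESIDUES, STATES_PER_RESIDUE, SAMPLES_PER_SECOND)
print(f"conformations:            {STATES_PER_RESIDUE ** N_RESIDUES:.2e}")
print(f"exhaustive search time:   {years:.2e} years")
print(f"ratio to age of Universe: {years / AGE_OF_UNIVERSE_YEARS:.2e}")
```

Under these assumptions the search time exceeds the age of the Universe by many orders of magnitude, which is the point of the paradox: real proteins cannot be folding by exhaustive search.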


The Paradox Reference

Levinthal, C. (1968). Are There Pathways for Protein Folding? J. Chim. Phys. PCB 65, 44-45


Conformational Energy


Interactions:

Bonds, bond angles, torsions, electrostatic, van der Waals, (hydrogen bonds)

The free energy, including entropy, rules!!


Free Energy, Entropy and all that


Stability, Folding and Kinetics

Proteins are only marginally stable (∆G ~ 10 kcal/mol), and may denature if the temperature is raised above normal by a few ºC
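To put the ∆G value in context, the sketch below converts a free energy of unfolding into an equilibrium population. It assumes a simple two-state (folded/unfolded) equilibrium and a temperature of 298.15 K; both are assumptions made for illustration and are not stated on the slide.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(delta_g_unfold, temperature=298.15):
    """Fraction of molecules unfolded at equilibrium in a two-state model.

    delta_g_unfold: free energy of unfolding in kcal/mol (positive = folded
    state is favored). The two-state model itself is an assumption.
    """
    k_unfold = math.exp(-delta_g_unfold / (R_KCAL * temperature))
    return k_unfold / (1.0 + k_unfold)


for dg in (10.0, 5.0, 0.0):
    print(f"dG = {dg:4.1f} kcal/mol -> fraction unfolded = {fraction_unfolded(dg):.2e}")
```

The stability is "marginal" in the sense that 10 kcal/mol is comparable to the energy of a handful of non-covalent interactions, so a modest change in conditions can shift the balance.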


The Oildrop Model of Protein Folding

Hydrophobic sidechains thus tend to be buried inside soluble proteins – but what happens to polar groups in the backbone?


Stability, Folding and Kinetics

Barnase - one major pathway


Stability, Folding and Kinetics

Lysozyme - two different pathways.

There are two domains


Disulfide bridge formation

BPTI


Proline isomerization

Cyclophilin catalyzes Pro cis-trans isomerization

[Figure labels: 99.9% / 0.1% and 80% / 20%]


Conformational change - calmodulin


Understanding protein folding via free-energy surfaces from theory and experiment

Aaron R. Dinner, Andrej Sali, Lorna J. Smith, Christopher M. Dobson and Martin Karplus TIBS 25 – JULY 2000

RG (radius of gyration) measures the size of the protein

Ramachandran diagram for alanine as 'energy' contour map


Model of Protein Folding

The Model:

• a simple lattice on which the polymer is built
• favorable (native) contacts and unfavorable (non-native) contacts

The number of states can be enumerated, and the global free energy (F) minimum identified

Q0: number of native contacts

C: total number of contacts
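The enumeration idea can be illustrated with a toy version of such a model. The sketch below is not the model from the slide: it assumes a 2D square lattice, a chain of 8 beads, and it simply picks one conformation with the most contacts to play the role of the "native" state. It enumerates every self-avoiding conformation and tabulates how many states have a given Q0 and C.

```python
# Toy lattice polymer. The 2D square lattice, the chain length and the choice
# of "native" state are assumptions made for this sketch only.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding_walks(n_beads):
    """Yield every self-avoiding conformation of n_beads (n_beads >= 2),
    starting at the origin with the first bond fixed along +x."""
    def extend(path):
        if len(path) == n_beads:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in path:
                path.append(nxt)
                yield from extend(path)
                path.pop()

    yield from extend([(0, 0), (1, 0)])


def contacts(conf):
    """Set of bead pairs (i, j) that are lattice neighbours but not bonded."""
    index_at = {p: i for i, p in enumerate(conf)}
    found = set()
    for i, (x, y) in enumerate(conf):
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            if j is not None and j > i + 1:
                found.add((i, j))
    return found


conformations = list(self_avoiding_walks(8))
# Assumption: take a conformation with the most contacts as the "native" state.
native = max(conformations, key=lambda c: len(contacts(c)))
native_contacts = contacts(native)

histogram = {}
for conf in conformations:
    found = contacts(conf)
    key = (len(found & native_contacts), len(found))  # (Q0, C)
    histogram[key] = histogram.get(key, 0) + 1

print(f"{len(conformations)} conformations enumerated")
for (q0, c), count in sorted(histogram.items()):
    print(f"Q0={q0}  C={c}  states={count}")
```

The number of states at each (Q0, C) gives the entropic part of a free energy surface; an energy assigned per native and non-native contact would supply the energetic part.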


Model Properties – Energetic and Entropic Components

Same native structure in both cases, but in (b,d) the native contacts are not as strong


Fast and Slow Folding Pathways of Lysozyme

Hen lysozyme has 129 aa-residues in two domains

Sequence of events mapped out using NMR hydrogen exchange protection experiments


Fast and Slow Folding Pathways of Model

This 125-mer lattice model shows behavior very similar to that of lysozyme in experiments

Core and surface contacts are monitored


Modeling a 3D Structure

• It is sometimes very difficult to obtain an experimental structure.
• Can one construct theoretical 3D models?
• Today - Not really, if just based on aa sequence
• Homology modeling, or comparative modeling, works fairly well, but requires access to known structure of a similar protein


Bioinformatics - Concepts

• Identity - Homology (“of common origin”)
• Distance - Similarity
• Score/Scoring matrix/z-score
• Global vs Local Alignment
• Multiple alignment
• Dynamic Programming
• Artificial Neural Networks (ANN)


Sequence Databases

International Nucleotide Sequence Database Collaboration:

• GenBank at NIH
• EMBL
• DDBJ (DNA DataBank of Japan)

GenBank doubles in size every 14 months!!

3 000 000 000 bases from 47 000 species (late 1999)

NCBI (National Center for Biotechnology Information) at NIH:

http://www.ncbi.nlm.nih.gov/

Example protein sequence in FASTA format:

>4LZM:_ LYSOZYME (E.C.3.2.1.17) (HIGH SALT) - CHAIN _
MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAKSELDKAIGRNCNGVITK
DEAEKLFNQDVDAAVRGILRNAKLKPVYDSLDAVRRCALINMVFQMGETGVAGFTNSLRM
LQQKRWDEAAVNLAKSRWYNQTPNRAKRVITTFRTGTWDAYKNL
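A FASTA record is a header line starting with ">" followed by one or more sequence lines. A minimal reader might look like the sketch below; the function name `parse_fasta` is hypothetical, and the code handles only the basic layout shown in the example above.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


EXAMPLE = """>4LZM:_ LYSOZYME (E.C.3.2.1.17) (HIGH SALT) - CHAIN _
MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAKSELDKAIGRNCNGVITK
DEAEKLFNQDVDAAVRGILRNAKLKPVYDSLDAVRRCALINMVFQMGETGVAGFTNSLRM
LQQKRWDEAAVNLAKSRWYNQTPNRAKRVITTFRTGTWDAYKNL
"""

for header, sequence in parse_fasta(EXAMPLE):
    print(header)
    print(len(sequence), "residues")
```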


Identity? Similarity?

Two identical sequences are easy to recognize, but to spot a relationship when they begin to differ gets progressively more difficult. Sequences may also be of different length - we have gaps due to insertions or deletions (indels).

s: ACACACA
t: ACACACA

s: AGCACACA
t: ACACACTA

s: AGCACAC-A
t: A-CACACTA

or

s: AG-CACACA
t: ACACACT-A
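The two candidate alignments of s and t can be compared by simply counting columns. The sketch below counts matches, mismatches and gap columns; the helper name `alignment_stats` is hypothetical.

```python
def alignment_stats(s, t, gap="-"):
    """Count (matches, mismatches, gap columns) for two aligned strings."""
    if len(s) != len(t):
        raise ValueError("aligned sequences must have the same length")
    matches = mismatches = gaps = 0
    for a, b in zip(s, t):
        if a == gap or b == gap:
            gaps += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps


for s, t in [("AGCACAC-A", "A-CACACTA"), ("AG-CACACA", "ACACACT-A")]:
    m, x, g = alignment_stats(s, t)
    print(f"{s}\n{t}\n  matches={m} mismatches={x} gaps={g}\n")
```

The first alignment has 7 matching columns and no mismatches; the second has 5 matches and 2 mismatches. Both use two gap columns, so the first is the better alignment under any reasonable scoring.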


Homology

The biological approach:

During evolution DNA sequences (and proteins) diverge, due to point mutations and more sophisticated events. Two sequences which share a common evolutionary origin are said to be homologous, and our task then is to find these relationships even for distantly related sequences.

We may thus use our knowledge about evolution and mutations in our quest for homology.


Scoring

DNA             Protein
s: CUUCCGAAA    s: Leu-Pro-Lys
t: CUAGCGAGA    t: Leu-Ala-Arg

We need to consider the “degree of change” in a substitution; we need a scoring scheme:

Substitution (score) matrix - there are 210 (20x19/2+20) pairs of amino acids and we need a number, a score, indicating how similar we consider two amino acids to be. For a given alignment of two sequences we sum the pairwise scores and add gap penalties to obtain the total score for the alignment; sometimes the distance between two sequences is considered instead.
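The scoring scheme can be sketched in a few lines. The code below first confirms the count of 210 unordered amino-acid pairs, then scores the Leu-Pro-Lys / Leu-Ala-Arg example. The three pair scores and the gap penalty are toy values assumed for illustration; a real application would use a full 210-entry matrix.

```python
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Unordered pairs, including identical pairs: 20*19/2 + 20
n_pairs = len(list(combinations_with_replacement(AMINO_ACIDS, 2)))
print("number of amino-acid pairs:", n_pairs)

# Toy scores (assumed for illustration only), keyed by alphabetically
# sorted one-letter codes.
TOY_SCORES = {("L", "L"): 4, ("A", "P"): -1, ("K", "R"): 2}


def pair_score(a, b, matrix):
    """Look up the score of an unordered residue pair."""
    return matrix[tuple(sorted((a, b)))]


def alignment_score(s, t, matrix, gap_penalty=-4, gap="-"):
    """Sum of pairwise scores plus a fixed penalty per gap column."""
    total = 0
    for a, b in zip(s, t):
        if a == gap or b == gap:
            total += gap_penalty
        else:
            total += pair_score(a, b, matrix)
    return total


# Leu-Pro-Lys aligned with Leu-Ala-Arg, in one-letter code
print("score:", alignment_score("LPK", "LAR", TOY_SCORES))
```

A fixed penalty per gap column is the simplest choice; separate gap-opening and gap-extension penalties are a common refinement.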


Specific Algorithms

• Global Alignment - find optimal alignment of two complete sequences: Needleman-Wunsch

• Local Alignment - find optimal alignment of fragments of the two sequences: Smith-Waterman

• Heuristic (“less stringent”, but faster): FASTA and BLAST (Basic Local Alignment Search Tool) use local high-scoring regions to find initial alignments which can be extended
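A compact sketch of global alignment by dynamic programming, in the style of Needleman-Wunsch, is given below. The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration, and a linear gap penalty is used for simplicity.

```python
def needleman_wunsch(s, t, match=1, mismatch=-1, gap=-1):
    """Global alignment by dynamic programming.

    Returns (score, aligned_s, aligned_t). Scoring values are illustrative.
    """
    n, m = len(s), len(t)

    def sub(i, j):
        return match if s[i - 1] == t[j - 1] else mismatch

    # F[i][j] = best score for aligning s[:i] with t[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + sub(i, j),  # align s[i-1] with t[j-1]
                          F[i - 1][j] + gap,            # gap in t
                          F[i][j - 1] + gap)            # gap in s

    # Trace back from the bottom-right corner to recover one optimal alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + sub(i, j):
            a.append(s[i - 1]); b.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            a.append(s[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(t[j - 1]); j -= 1
    return F[n][m], "".join(reversed(a)), "".join(reversed(b))


score, aligned_s, aligned_t = needleman_wunsch("AGCACACA", "ACACACTA")
print(score)
print(aligned_s)
print(aligned_t)
```

Smith-Waterman uses the same recurrence but never lets a cell drop below zero and starts the traceback from the highest-scoring cell, which is what restricts the result to the best-matching fragments.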


Sequence Alignment

Overall SCP and SFCP are 80% similar

Multiple sequence alignment helps identify the real similarities


Structure Comparison

Measure of structural similarity:

• Root Mean Square Distance (RMSD) between equivalent, superposed atoms
• Caveats: alignment - use sequence and/or secondary structure, and superposition
• Indels also pose a problem
• Distances between Cα atom-pairs in one structure can be compared to the same distances in the second structure

$$\mathrm{RMSD} = \min_{\text{orientations}} \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathbf{r}_i^{(1)} - \mathbf{r}_i^{(2)}\right)^2}$$
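The last bullet, comparing intra-structure Cα distances, avoids the superposition step altogether. The sketch below computes such a distance-based RMSD; it is a different measure from the superposition-based RMSD, and the function name `distance_rmsd` and the coordinates are hypothetical.

```python
import math
from itertools import combinations


def distance_rmsd(coords_a, coords_b):
    """RMS difference between corresponding intra-structure pair distances.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples for
    equivalent atoms (e.g. C-alpha atoms). No superposition is required.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    squared = []
    for i, j in combinations(range(len(coords_a)), 2):
        d_a = math.dist(coords_a[i], coords_a[j])
        d_b = math.dist(coords_b[i], coords_b[j])
        squared.append((d_a - d_b) ** 2)
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical coordinates (in Angstrom)
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
# Same shape, rotated by 90 degrees about z and translated
moved = [(1.0, 2.0, 3.0), (1.0, 5.8, 3.0), (-2.8, 5.8, 3.0)]
# Different shape: fully extended
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]

print(distance_rmsd(bent, moved))
print(distance_rmsd(bent, extended))
```

Because only internal distances are used, the value is unchanged by rotating or translating either structure. One trade-off of this design is that a structure and its mirror image are indistinguishable.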

High sequence similarity also gives high structural similarity


Protein Folds

How many folds are there? (There are ~30000 human genes)

Current estimates: 1 000-10 000

We know about 100 different folds (1%-10%), almost exclusively of water-soluble proteins - only a handful of membrane protein structures are known, even though they may account for about 1/3 of all proteins.

There are no reliable methods today of predicting a 3D structure just from an amino-acid sequence. “Guessing” a fold based on the known structure of a homologous protein is the best we can do - homology modeling, which works at >25-30% similarity


Protein Data Bank

http://www.rcsb.org


Protein Data Bank


PDBSum http://www.biochem.ucl.ac.uk/bsm/pdbsum/


Folds in the PDB

New folds

Old folds


CATH http://www.biochem.ucl.ac.uk/bsm/cath/

Class
Architecture
Topology
Homologous Superfamily


SCOP http://scop.mrc-lmb.cam.ac.uk/scop/

Structural Classification Of Proteins


SCOP


DALI http://www.ebi.ac.uk/dali/

Structure comparison server


Homology Modeling

• Problem: Have sequence of protein and want 3D structure, but no experimental structure is available.
• Find homologous protein(s) with known 3D structure (>25% similarity recommended!)
• Align sequences (multiple alignment HELPS!); it also helps if you have multiple templates and can use them to identify structurally conserved regions
• Identify conserved and variable regions
• Generate core coordinates from template(s)
• Generate conformations for loops
• Build side-chain conformations
• Refine and evaluate the model


Loops - from databases

Restricted set of CDR3 main chain conformations


Success rate


Swiss-Model

http://swissmodel.expasy.org/


Submit your own sequence, and get a 3D model back (if there are templates available…)


Alphavirus Spike-NC Binding


SFCP Model vs. Xtal


Conserved Residues in Hydrophobic Binding Pocket


Structure Validation

• Biochemically reasonable
• Good stereochemistry, with main chain in acceptable Ramachandran regions
• Planar peptide bonds
• Hydrogen bonding of buried residues
• Apolar and polar residues properly accommodated


Exceptions do exist

The B1 fold has been changed to a Rop-like protein by changing 50% of the amino acids

1994: a $1000 prize was offered for changing a fold by changing no more than 50% of the aa

1997: Lynne Regan, Yale, won the prize

B1 domain of protein G

Rop dimer


“JANUS”

56 aa – 28 may change!

B1 and Rop have only three identical aa positions

Change key amino acids:

e.g. Rop Arg16 and Asp45 form a salt bridge

No structure for Janus yet, but CD and NMR spectra indicate clear similarities to Rop, including dimer formation


Designing Protein/Peptide

• May be easier to find an aa-sequence which adopts a specific fold than the opposite, i.e. to find the fold of a given sequence
• Zinc-finger peptide design: allow only certain types of aa in given regions
  core: Ala, Val, Leu, Ile, Phe, Tyr, Trp
  surface: Ala, Ser, Thr, His, Asp, Asn, Glu, Gln, Lys, Arg
  boundary: allow both core and surface sets
  no Pro, Cys, Met; Gly at special positions

Try combinations of these, and evaluate their energy in computer
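The size of the search implied by "try combinations" can be sketched as follows. The residue sets are taken from the slide (in one-letter code); the three-position pattern is a hypothetical example, and the energy evaluation is not shown because the slide does not specify the energy function.

```python
from itertools import product

# Allowed residues per region, from the slide, in one-letter code
CORE = set("AVLIFYW")          # Ala, Val, Leu, Ile, Phe, Tyr, Trp
SURFACE = set("ASTHDNEQKR")    # Ala, Ser, Thr, His, Asp, Asn, Glu, Gln, Lys, Arg
BOUNDARY = CORE | SURFACE      # both sets allowed

ALLOWED = {"core": CORE, "surface": SURFACE, "boundary": BOUNDARY}


def count_sequences(position_classes):
    """Number of candidate sequences for a list of per-position classes."""
    total = 1
    for cls in position_classes:
        total *= len(ALLOWED[cls])
    return total


def enumerate_sequences(position_classes):
    """Yield every candidate sequence (only practical for short patterns)."""
    pools = [sorted(ALLOWED[cls]) for cls in position_classes]
    for combo in product(*pools):
        yield "".join(combo)


# Hypothetical three-position pattern, for illustration only
pattern = ["core", "surface", "boundary"]
print(count_sequences(pattern), "candidate sequences")
print(next(enumerate_sequences(pattern)))
```

The count grows multiplicatively with each designed position, which is why restricting residue types by region matters: it shrinks the number of candidates that must be scored.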


Designed peptide

Can one design a peptide with a zinc-finger fold, without the zinc?

Real Zn-finger

Designed – hydrophobic stabilization


Profiles

A profile is a compilation of additional information about a sequence - it can even take into account 3D information if a 3D structure is known.

Such profiles can be used to evaluate relationships between proteins or protein families, even for distantly related proteins with little sequence similarity
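One common kind of profile is built from a multiple alignment by recording how often each residue occurs in each column. The sketch below shows that idea only; the slide does not say which kind of profile is meant, and the toy alignment and the simple frequency-sum score are assumptions.

```python
from collections import Counter


def column_frequencies(alignment, gap="-"):
    """Per-column residue frequencies for a list of equal-length aligned strings."""
    profile = []
    for column in zip(*alignment):
        counts = Counter(r for r in column if r != gap)
        total = sum(counts.values())
        profile.append({r: c / total for r, c in counts.items()} if total else {})
    return profile


def profile_score(sequence, profile):
    """Sum of the profile frequencies of each residue in an ungapped sequence
    of the same length as the profile (a deliberately simple score)."""
    return sum(col.get(r, 0.0) for r, col in zip(sequence, profile))


# Hypothetical toy alignment
toy_alignment = ["ACDA", "ACEA", "GC-A"]
profile = column_frequencies(toy_alignment)

for i, col in enumerate(profile, start=1):
    print(i, col)
print(profile_score("ACDA", profile))
print(profile_score("GWWG", profile))
```

A sequence that follows the conserved columns scores high even if it is not identical to any single family member, which is how a profile can pick up distant relatives.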


Threading

This asks the inverse question of protein folding:

Given a 3D structure, which amino acid sequences are compatible with the structure?

Thread your sequence through representative set of folds and see if there is a match.

Easier, but not easy… we cannot always tell if a sequence fits a structure - our scoring or energy functions are not accurate enough


Structural Genomics

Similar Sequence ⇒ Similar Structure

The Protein Universe with protein “families”

[Figure labels: RMSD; ?; Known structures]


Structural Genomics


Structural Genomics

Which proteins?


Lab 2 goals

• Measure conformational details (distances, angles) in a protein
• Ramachandran plots
• Extract information from coordinate file header
• Investigate disulfide bonds
• RasMol scripts
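The geometric measurements behind these goals can be sketched in plain Python. The helper names below are hypothetical, the coordinates are made up, and the 2.5 Å cutoff used to flag a possible disulfide is an assumed threshold rather than a value from the lab instructions.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def distance(a, b):
    """Distance between two points."""
    return math.sqrt(_dot(_sub(a, b), _sub(a, b)))


def angle(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1, v2 = _sub(a, b), _sub(c, b)
    cosine = _dot(v1, v2) / (math.sqrt(_dot(v1, v1)) * math.sqrt(_dot(v2, v2)))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, in the range -180..180.

    phi uses atoms C(i-1), N(i), CA(i), C(i); psi uses N(i), CA(i), C(i), N(i+1).
    """
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    # Components of b0 and b2 perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def looks_like_disulfide(sg_a, sg_b, max_distance=2.5):
    """True if two Cys SG atoms are within an assumed S-S bonding cutoff."""
    return distance(sg_a, sg_b) <= max_distance


# Made-up coordinates for illustration
print(distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
print(angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

A Ramachandran plot is then a scatter plot of the (phi, psi) pair computed this way for each residue.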