Post on 15-Jan-2016
Perspectives of Structure-Sequence Dependent Stability of Collagen and Interaction of
Polyphenol Molecules with Collagen
V. Subramanian
Chemical Laboratory
Central Leather Research Institute Chennai
Introduction
Collagen is an extremely important protein, which provides mechanical strength and structural integrity to connective tissues
Nineteen different collagen types have been identified to date
The identifying motif of collagen is the triple helix (Prof. Ramachandran and co-workers; Rich and Crick)
Gly-X-Y is the general repeating sequence of collagen (33% Gly)
Mutations in the collagen chain can render the fibrils unstable
Triple Helix
The collagen triple helix constitutes the major motif in fibril-forming collagens and also occurs as a domain in non-fibrillar collagens
Hydrogen bonding and the high content of imino acids provide stability to the three-dimensional structure of collagen
Water-mediated hydrogen bonding and hydration also play a crucial role in the stability of collagen
Recent experimental studies revealed that Arg in the Y position provides stability comparable to that of Gly-Pro-Hyp
The destabilizing nature of Asp in the Y position is also evident from the experimental studies
Therefore assessment of propensity of various amino acids to form collagen like peptides is an important area of research activity
Collagen Structure: An Indian Origin
Single vs. two hydrogen bond(s)
If the X and Y positions are imino acids, there is no possibility of forming two hydrogen bonds
The incorporation of other amino acids provides a possibility of readdressing this question
Stabilizing inter-strand H-bonds
Collagen Triple Helix
Propensity of Various Amino Acids to Form Collagen Like Motif: Guest Host Approach
The propensity of various amino acids to form alpha helix and beta sheet has been addressed
Host-Guest peptide approach has been used to estimate the propensity
The presence of various amino acids not only influences the three dimensional structure but also the stability of collagen
The amino acid propensity-stability-function is an important area of research in molecular biophysics
Several experimental studies have been carried out on model collagen-like peptides to establish the propensity of various amino acids to form collagen
Circular dichroism spectroscopy has been used to develop the triple helix propensity of various amino acids
The molar ellipticity was monitored at 225 nm while the sample temperature was increased from 0 to 80 °C
The melting curves were used to calculate the fraction of folded states
The fraction folded has been used to compute van't Hoff enthalpies and free energies
This information provides an experimental basis for predicting the relative stabilities of various amino acids in forming collagen-like structures
Propensity scales will be used to compare the results obtained from modeling and simulations
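Triple Helix Propensity Scale
$$K(T) = \frac{F(T)}{3 c_0^{2}\,[1 - F(T)]^{3}}$$
$$K(T) = \exp\left[\frac{\Delta H^{0}}{RT}\left(\frac{T}{T_m} - 1\right) - \ln\left(\frac{3}{4}\, c_0^{2}\right)\right]$$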
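The two expressions on this slide can be turned into a melting curve. The sketch below assumes the usual two-state monomer-to-trimer reading: F is the fraction folded, c0 the total chain concentration in mol/L, Tm the melting temperature in K, and ΔH0 the van't Hoff enthalpy of folding (negative). The function names and the numerical inputs used in the usage note are illustrative assumptions, not values from the talk.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def equilibrium_constant(temp_k, delta_h, t_m, c0):
    """K(T) = exp[dH/(RT) * (T/Tm - 1) - ln(0.75 * c0^2)].

    delta_h is the folding enthalpy in kJ/mol (negative), temperatures in K.
    """
    return math.exp(delta_h / (R * temp_k) * (temp_k / t_m - 1.0)
                    - math.log(0.75 * c0 ** 2))


def fraction_folded(temp_k, delta_h, t_m, c0):
    """Solve K = F / (3 c0^2 (1 - F)^3) for F by bisection on [0, 1]."""
    k = equilibrium_constant(temp_k, delta_h, t_m, c0)
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # g(F) = F - 3 c0^2 K (1 - F)^3 increases monotonically with F,
        # so the sign of g at the midpoint tells us which half holds the root.
        if mid - 3.0 * c0 ** 2 * k * (1.0 - mid) ** 3 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `fraction_folded(T, -400.0, 318.15, 1e-3)` over a range of temperatures gives a sigmoidal curve that passes through F = 0.5 at T = Tm, which is the consistency condition linking the two expressions.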
Unraveling the Stability of Collagen: Experimental Studies by Brodsky and Co-workers
The host-guest approach has been used to introduce new sequences into the general repeating Gly-Pro-Hyp sequence
Parameters such as the melting temperature, the thermodynamic parameters from melting studies and the ΔG of stabilization of the host-guest collagen-like peptides have been studied to identify the influence of amino acids on the stability of collagen
Thermodynamic parameters for the Guest Host peptides
Biochemistry, 1996, 35, 10262.
Propensity data from Brodsky work
Biochemistry, 2000, 39, 14960.
Melting Temperature & Thermodynamic parameters for the Guest Host peptides in Y position
Melting Temperature & Thermodynamic parameters for the Guest Host peptides in X position
Collagen in Diseases
Mutations in the collagen genes COL1A1 and COL1A2 lead to Osteogenesis Imperfecta (OI), a brittle bone disease
A point mutation in one of the type I collagen genes can cause disease
One of the main causes of OI is the Gly→Ala mutation
Substitution of glycine by another amino acid is more severe than mutation of X or Y in the Gly-X-Y triplet
Understanding the stability of collagen upon mutation becomes necessary
Since collagen is a large protein, it is difficult to study the influence of individual amino acids
Various attempts have been made to probe the effect of mutation in model collagen-like peptide sequences
Mutations in collagen lead to Osteogenesis Imperfecta (type I), Chondrodysplasia (type II), Ehlers-Danlos syndrome (type III), Alport syndrome (type IV), Bethlem myopathy (type VI), etc.
Collagen mimics and Biomaterial applications
Various physical and chemical properties make collagen a versatile material for biomaterial applications
Studies on collagen mimetics have been made to understand the strength of the triple helix and for their application in biomaterials
In collagen mimetics, a variety of unnatural amino acids are incorporated in the X and Y positions
K. N. Ganesh* and his coworkers have used 4-aminoproline containing collagen-like sequences
Murray Goodman$ and his group made an attempt to template the assembly of collagen-like peptides using a conformationally constrained organic molecule
*JACS, 1996, 118, 5156
$JACS, 2001, 123, 2079
Frequency of Occurrence of Selected Triplets in Collagen
Issues addressed
To determine the stability of collagen upon substitution of Gly-Pro-Hyp by other collagen-like triplets
To develop the propensity scale for various amino acids to form collagen-like peptides based on free energy of mutation
To probe the interaction between model collagen like peptides with polyphenols
Methodology
Ab initio and DFT calculations have been performed on collagen-like triplets in both collagen-like and extended conformations
The free energy of solvation for these triplets has been computed for both conformations using the Polarizable Continuum Model (PCM)
The free energy of solvation has been used to compute the stability and amino acid propensity
The stability of these peptides has also been analysed by calculation of hardness
Free energies of various triplets have also been computed using classical molecular dynamics simulations
Using these values, the free energy change has been quantified
Model Collagen Triplets for Ab initio and DFT calculations
Gly-Pro-Hyp collagen-like conformation
Gly-Pro-Hyp Extended conformation
Superimposed structures of Gly-Pro-Hyp in both conformations
Relative Energy of Proline Containing Triplets
Relative Energy of Hyp Containing Triplets
Relative Energy of Triplets without Imino acids
Important Observations
The triplets containing proline or hydroxy proline are more stable in collagen-like conformation
Proline sterically restricts rotation about the N-Cα bond, so φ is limited to -63 ± 15 degrees
Hence, proline cannot be found in the other known major protein motifs
The dihedral angles corresponding to the conformational energy minimum for proline have been found to be -75° and 145° (φ, ψ)
It can stabilize protein secondary structure only when the allowed values for all other amino acids coincide with those of proline
Important Observations (contd.)
It is evident from the relative energy that Gly-Gly-Hyp is not stable in the collagen-like conformation
Recent experimental evidence confirms that glycine in the second position destabilizes the collagen triple helix
Solvation drastically alters the relative energy
Proper ordering of the stability of various triplets needs geometry optimization in solvent media
Free Energy of Solvation
Important Observations
Solvation free energies of collagen-like sequences indicate that triplets in the collagen-like conformation are hydrated better than their extended counterparts
The presence of polar and non-polar residues in the sequence drastically influences the solvation
Specifically Arg either in second or third position influences the solvation
Arg stabilizes the collagen-like sequence similar to stability provided by Hyp in the Y position
Free Energy Cycle
[Figure: free energy cycle with steps labelled 1 to 4]
Assessment of Stability Using ΔG
The propensity to form a collagen-like sequence has been calculated using Gly-Pro-Hyp as reference
The calculated ΔG ranges from 0.0 to 15.8 kcal/mol
The change in the free energy of Gly-Pro-Pro and Gly-Pro-Flp is close to that of Gly-Pro-Hyp
The most stable sequence is Gly-Pro-Hyp
The general trend correlates well with the experimental values derived from melting temperature studies on model systems
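Referencing every triplet to Gly-Pro-Hyp, as described on this slide, amounts to subtracting the reference free energy and sorting. The following is a minimal sketch of that bookkeeping only; the function names are hypothetical and the numbers in the example dictionary are placeholders chosen for illustration, not results from the talk.

```python
def relative_free_energies(free_energies, reference="Gly-Pro-Hyp"):
    """Return each triplet's free energy relative to the reference triplet."""
    ref = free_energies[reference]
    return {name: value - ref for name, value in free_energies.items()}


def rank_by_stability(relative):
    """Most stable first: the smallest relative free energy leads the list."""
    return sorted(relative, key=relative.get)


# Placeholder free energies in kcal/mol (illustrative only, NOT from the talk).
example = {
    "Gly-Pro-Hyp": -10.0,
    "Gly-Pro-Pro": -9.5,
    "Gly-Pro-Asp": -2.0,
}
relative = relative_free_energies(example)
ranking = rank_by_stability(relative)
```

With the reference defined this way, Gly-Pro-Hyp sits at 0.0 by construction and less stable triplets take positive values, matching the 0.0 to 15.8 kcal/mol scale quoted above.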
Triplets Involved in the Stability of Collagen: A Propensity Scale
[Figure: propensity scales for α-helix, β-turn, β-sheet and collagen]
Chemical Hardness
Global hardness of various triplets in collagen-like and extended conformation has been calculated
It is interesting to note that the chemical hardness values are higher for the triplets in the collagen-like conformation than in the extended conformation
Experimentally, Asp in the triplet does not favor collagen folding
The chemical hardness of Gly-Pro-Asp is observed to be lower than that of the other sequences
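The talk does not state how the global hardness was evaluated. A common working definition is η = (IP - EA)/2, which under Koopmans' approximation becomes half the HOMO-LUMO gap. The sketch below assumes that definition; the function name and the orbital energies in the usage note are hypothetical.

```python
HARTREE_TO_EV = 27.211386  # conversion factor, 1 hartree in eV


def global_hardness(e_homo, e_lumo):
    """Return eta = (e_lumo - e_homo) / 2 in the units of the inputs.

    Assumes Koopmans' approximation (IP = -e_homo, EA = -e_lumo); this is
    one standard choice and may differ from the procedure used in the talk.
    """
    if e_lumo < e_homo:
        raise ValueError("LUMO energy must not lie below the HOMO energy")
    return 0.5 * (e_lumo - e_homo)
```

For example, `global_hardness(-0.25, 0.05) * HARTREE_TO_EV` converts a hardness computed from orbital energies in hartree into eV. A larger value indicates a wider gap and, by the maximum hardness principle, a more stable species.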
Important Observations
The B3LYP/6-31G* level of theory predicts that the collagen-like conformation of Gly-Pro-Hyp is more stable than the extended conformation by 0.46 kJ/mol
Hardness of triplets of sequences Gly-X-Y (without Hyp and Pro) is lower than the triplets containing Pro and Hyp residues
Emerging Roles of Computational Techniques in Tanning Theory
Computer model of bovine type I collagen has been simulated
Early report of molecular modeling of tanning processes has been made
Model peptides for collagen have been selected and their interactions with various tanning materials simulated using force field as well as Density Functional Theory methods
Binding energies for various interactions of collagen like peptide with tannin molecules have been estimated
Computer Simulation of Collagen-like Peptide-Tannin Interaction
Collagen-Catechin
Collagen-Epicatechin
Collagen-Gallic Acid
Collagen-Quercetin
Interaction of gallic acid with collagen-like peptides
Gallic acid is a good anti-oxidant present in many plant sources
Gallic acid has been shown to selectively induce cell death in cancerous cell lines by binding to specific receptors or enzymes
Gallic acid plays a major role in the stabilization of collagen in the tanning process of leather making
Collagen is an important and abundant connective tissue protein in animal kingdom
Gallic Acid
Collagen assemblies are stabilized by covalent and non-covalent interactions
A fundamental understanding of the interaction of gallic acid with collagen is important to unravel the nature of the interactions required for the stabilization of the collagen matrix
Theoretical calculations can be used for the determination and quantification of such interactions
In this view, the present investigation focuses on determining the interactions of different dipeptides with gallic acid
Such a study can be correlated with the stabilization processes involved in collagen and will also advance the knowledge of peptide-ligand interactions
Computational details
Three classical dipeptides of the amino acids glutamic acid, lysine and serine were chosen for the interaction studies with gallic acid
The dipeptides were constrained to the φ and ψ angles of collagen
Dipeptides and gallic acid built and energy minimized using modules available in Insight II(MSI, USA)
Four functional sites namely 3 OH groups and one COOH group present in the gallic acid identified to have the potential to interact with the side chain groups of the dipeptide
The geometry of the complexes optimized by a semi-empirical PM3 method using Gaussian98 suite of programs
Energy of the complex calculated using both Hartree Fock (HF) and DFT based B3LYP methods using 3-21G* & 6-31G* basis sets employing Gaussian 98w suite of programs
The interaction energy (VINT) calculated using supermolecule approach
VINT = TEcomplex – [ TEdipeptide + TEgallic acid]
Binding Energy (VBE) is, VBE = - VINT
Molecular electrostatic potentials (MESP) are useful in understanding weak and non-covalent interactions. The electrostatic potential V(r) is defined as
$$V(\mathbf{r}) = \sum_{A} \frac{Z_A}{|\mathbf{r} - \mathbf{R}_A|} - \int \frac{\rho(\mathbf{r}')\, d\mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|}$$
where Z_A is the charge on nucleus A located at R_A, and ρ(r') is the electron density at point r'
MESP features of the peptide-gallic acid complexes have been studied by BLYP/DN using DMOL implemented in Cerius2
Molecular Dynamics (MD) calculations have been done for one of the complexes to follow the time evolution of the hydrogen-bonded complex. A time step of 1 fs has been chosen and the MD simulations have been performed for 600 ps, including an equilibration period of 100 ps
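The supermolecule expressions VINT = TEcomplex - [TEdipeptide + TEgallic acid] and VBE = -VINT translate directly into a few lines of code. This is a minimal sketch assuming the total energies are in hartree, as Gaussian reports them; the function names are hypothetical, and no counterpoise (basis set superposition) correction is applied because the slides do not mention one.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # conversion factor


def interaction_energy(te_complex, te_dipeptide, te_gallic_acid):
    """V_INT = TE_complex - (TE_dipeptide + TE_gallic_acid), same units as inputs."""
    return te_complex - (te_dipeptide + te_gallic_acid)


def binding_energy_kcal(te_complex, te_dipeptide, te_gallic_acid):
    """V_BE = -V_INT, converted from hartree to kcal/mol."""
    v_int = interaction_energy(te_complex, te_dipeptide, te_gallic_acid)
    return -v_int * HARTREE_TO_KCAL_PER_MOL
```

A complex that is 0.020 hartree lower in energy than its separated fragments would give a binding energy of about 12.6 kcal/mol, which falls inside the 4 to 18 kcal/mol range reported in the talk. The 0.020 hartree figure is an illustrative input, not a computed result.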
Discussion
The functional groups para-OH, two meta-OH and COOH of gallic acid have been assumed to act as hydrogen bond donors/acceptors for the different side chain groups of the amino acids in the dipeptides
Most of the dipeptide-gallic acid complexes exhibited hydrogen bonding
Complexes have exhibited binding energies in the range of 4 – 18 kcal/mol
Complexes of glutamic acid dipeptide with gallic acid have all exhibited hydrogen bonding with high binding energies
Some of the complexes of gallic acid with serine and lysine dipeptide have also exhibited hydrogen bonding
The interaction with COOH group of gallic and side chain COOH group of glutamic acid exhibited the maximum binding energy
All complexes calculated by HF methods predicted lower binding energies when compared to the binding energies predicted from DFT methods
Molecular electrostatic potential estimation for the various complexes provided clues on the role of electrostatics in the interaction process
Binding energies of gallic acid-dipeptide complexes (kcal/mol)
Dipeptide       m1-OH (C1)        p-OH (C2)         m2-OH (C3)        COOH (C4)
                3-21G*  6-31G*    3-21G*  6-31G*    3-21G*  6-31G*    3-21G*  6-31G*
Glutamic Acid   10.29   8.08      7.06    5.06      11.19   8.98      18.18   15.1
Lysine          5.51    3.75      11.65   9.51      8.97    6.53      14.51   12.18
Serine          7.7     6.32      10.96   10        6.23    5.1       12.92   10.19
Interaction energies of different sites of gallic acid with different dipeptides calculated using Density Functional Theory (B3LYP) with basis sets 3-21G* and 6-31G*
Binding energies of gallic acid-dipeptide complexes (kcal/mol)
Dipeptide       m1-OH (C1)        p-OH (C2)         m2-OH (C3)        COOH (C4)
                3-21G*  6-31G*    3-21G*  6-31G*    3-21G*  6-31G*    3-21G*  6-31G*
Glutamic Acid   8.67    6.0       6.12    3.85      9.3     6.7       16.86   13.39
Lysine          3.55    –3.25     11.57   9.15      7.95    5.22      12.6    10.33
Serine          7.28    5.91      10.65   9.34      6.02    4.39      12.59   9.46
Interaction energies of different sites of gallic acid with different dipeptides calculated using the Hartree Fock (HF) method with basis sets 3-21G* and 6-31G*
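The two binding-energy tables can be checked against the statements made in the Discussion. The sketch below transcribes the tabulated values (kcal/mol) and tests two claims: that HF gives a lower binding energy than B3LYP for every complex, and that the COOH site of gallic acid with the glutamic acid dipeptide binds most strongly. The data layout and helper name are assumptions made for illustration.

```python
SITES = ("m1-OH", "p-OH", "m2-OH", "COOH")
BASIS = ("3-21G*", "6-31G*")

# Each row lists, site by site, the 3-21G* value followed by the 6-31G* value.
B3LYP = {
    "Glutamic acid": [10.29, 8.08, 7.06, 5.06, 11.19, 8.98, 18.18, 15.1],
    "Lysine": [5.51, 3.75, 11.65, 9.51, 8.97, 6.53, 14.51, 12.18],
    "Serine": [7.7, 6.32, 10.96, 10.0, 6.23, 5.1, 12.92, 10.19],
}
HF = {
    "Glutamic acid": [8.67, 6.0, 6.12, 3.85, 9.3, 6.7, 16.86, 13.39],
    # The second lysine entry is printed as "–3.25" in the source; the leading
    # dash may be typographic, so it is kept here as a negative number.
    "Lysine": [3.55, -3.25, 11.57, 9.15, 7.95, 5.22, 12.6, 10.33],
    "Serine": [7.28, 5.91, 10.65, 9.34, 6.02, 4.39, 12.59, 9.46],
}


def labelled(table):
    """Flatten a table into {(dipeptide, site, basis): binding energy}."""
    out = {}
    for dipeptide, row in table.items():
        for i, value in enumerate(row):
            out[(dipeptide, SITES[i // 2], BASIS[i % 2])] = value
    return out


dft = labelled(B3LYP)
hf = labelled(HF)
hf_always_lower = all(hf[key] < dft[key] for key in dft)
strongest = max(dft, key=dft.get)
```

Both checks hold for the tabulated numbers regardless of how the sign of the ambiguous lysine entry is read.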
Hydrogen Bonded Complexes of Glutamic acid Dipeptide and Gallic acid
C4 C3 C2
C4 C2
Hydrogen Bonded Complex of Serine Dipeptide and Gallic acid
C2
Hydrogen Bonded Complexes of Lysine Dipeptide and Gallic acid
Negative MESP of the serine-gallic acid complex (C1)
Negative MESP of the glutamic acid-gallic acid complex (C4)
Negative MESP of the lysine-gallic acid complex (C2)
Collagen-like Peptide Sequence
Large systems such as the collagen molecule are difficult to handle in molecular simulation calculations
Interaction studies can be carried out by building a collagen-like peptide sequence that maintains the uniqueness of collagen
A 9-mer sequence Ace-Gly-Pro-Hyp-Gly-Ala-Ser-Gly-Glu-Arg-Nme is built based on the repeatability of the sequences and on the presence of amino acids in the actual collagen molecule, imposing φ and ψ constraints based on the G. N. Ramachandran plot
The peptide sequence and polyphenolic molecules were minimized using CVFF (Consistent Valence Force Field)
The polyphenolic molecules were placed near the different sites of the peptide sequence and minimized
The binding energy of the molecules with the peptide sequence was calculated based on the equation,
$$E_B = E_{\text{polyphenolics}} + E_{\text{sequence}} - E_{\text{complex}}$$
EB = Binding energy (kcal/mol)
Epolyphenolics = Total energy of the minimized structure of polyphenolic molecules (kcal/mol)
Esequence = Total energy of the minimized structure of collagen-like peptide sequence (kcal/mol)
Ecomplex = Total energy of the minimized structure of the complex (kcal/mol)
Interaction of Polyphenolics with Collagen-like Peptide Sequence
Hydrogen bonded complex of Gallic acid with Serine residue of collagen like peptide (Binding Energy = 8 kcal/mol )
Hydrogen bonded Complex of Catechin with Peptide (Binding Energy = 18 kcal/mol )
Complex of Quercetin with collagen like peptide (Binding Energy = 12 kcal/mol )
Molecular Electrostatic Potential Surface (MESP) of Gallic acid–Collagen-like Peptide Complex
Positive electrostatic potential (0.7) surface of the complex of gallic acid with Glutamic acid residue of the peptide sequence
Negative electrostatic potential (-0.01) surface of the complex of gallic acid with Glutamic acid residue of the peptide sequence
Binding Energies of Polyphenol-Collagen-like Peptide Complexes
Binding Energies (kcal/mol)
Complex   Catechin      Quercetin     Gallic acid
1         19.1 ± 0.2    13.7 ± 0.2    8.2 ± 0.1
2         16.4 ± 0.1    16.2 ± 0.2    7.1 ± 0.1
3         15.6 ± 0.2    12.2 ± 0.1    6.1 ± 0.2
[1] Polyphenolic molecule interacted around the serine and glutamic acid residues of the collagen-like peptide sequence.
[2] Polyphenolic molecule interacted around the arginine residue of the model collagen-like peptide sequence.
[3] Polyphenolic molecule interacted around the hydroxyproline residue of the model collagen-like peptide sequence.
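The table above can be summarised per polyphenol to see which ligand binds most strongly and at which site. This is a small sketch using the tabulated values (kcal/mol, with their quoted uncertainties); the variable names are illustrative assumptions.

```python
# (binding energy, quoted uncertainty) for interaction sites 1, 2 and 3.
BINDING = {
    "Catechin": [(19.1, 0.2), (16.4, 0.1), (15.6, 0.2)],
    "Quercetin": [(13.7, 0.2), (16.2, 0.2), (12.2, 0.1)],
    "Gallic acid": [(8.2, 0.1), (7.1, 0.1), (6.1, 0.2)],
}

# Mean binding energy over the three sites for each polyphenol.
means = {
    name: sum(value for value, _ in rows) / len(rows)
    for name, rows in BINDING.items()
}

# Polyphenols ordered from strongest to weakest mean binding.
ranking = sorted(means, key=means.get, reverse=True)

# Site number (1-based) with the largest binding energy for each polyphenol.
best_site = {
    name: max(range(len(rows)), key=lambda i: rows[i][0]) + 1
    for name, rows in BINDING.items()
}
```

On these numbers catechin has the largest mean binding energy. Quercetin is the only ligand whose preferred site is the arginine residue (site 2) rather than the serine and glutamic acid region (site 1).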
Lessons from Molecular Modeling Studies
Molecular modeling studies have provided a basis to identify the interaction process involved in tanning
Catechin exhibited stronger binding, as compared to other polyphenolics chosen for the study
Many of the complexes exhibited hydrogen bonding and some exhibited electrostatic and weak interactions
MESP has revealed a lock and key type of electrostatic interactions involved in the stabilization of gallic acid and collagen-like peptide complex
Geometrical Issues in Binding of Small Molecules by Collagen: A Prospective Analysis
Computational Details
Four representative polyphenol molecules, viz. gallic acid, catechin, epigallocatechin gallate and pentagalloylglucose, were chosen for interaction studies
A 24-mer collagen triple helix corresponding to residues 193 to 216 (two α1 chains and one α2 chain) of native Type I collagen is constructed using the GENCOLLAGEN package
The amino acid sequences of the triple helix are as follows,
[Gly-Glu-Hyp-Gly-Pro-Hyp-Gly-Pro-Ala-Gly-Ala-Lys-Gly-Pro-Ala-Gly-Asn-Hyp-Gly-Ala-Asp-Gly-Gln-Hyp] α1
[Gly-Glu-Val-Gly-Leu-Hyp-Gly-Leu-Ser-Gly-Pro-Val-Gly-Pro-Hyp-Gly-Asn-Ala-Gly-Pro-Asn-Gly-Leu-Hyp] α2
The 24-mer triple helix and polyphenols are minimized using CVFF with a dielectric constant of 4.0
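The two chain sequences listed above can be checked programmatically for the Gly-X-Y repeat and for the residue positions that the talk later uses as binding sites (for example the 6th, 12th and 21st residues of the α1 chains and the 9th and 17th residues of the α2 chain). A minimal sketch, with hypothetical helper names:

```python
ALPHA1 = ("Gly-Glu-Hyp-Gly-Pro-Hyp-Gly-Pro-Ala-Gly-Ala-Lys-"
          "Gly-Pro-Ala-Gly-Asn-Hyp-Gly-Ala-Asp-Gly-Gln-Hyp").split("-")
ALPHA2 = ("Gly-Glu-Val-Gly-Leu-Hyp-Gly-Leu-Ser-Gly-Pro-Val-"
          "Gly-Pro-Hyp-Gly-Asn-Ala-Gly-Pro-Asn-Gly-Leu-Hyp").split("-")


def is_gly_x_y(residues):
    """True if the chain is a whole number of triplets, each starting with Gly."""
    return len(residues) % 3 == 0 and all(r == "Gly" for r in residues[0::3])


def residue_at(residues, position):
    """Return the residue at a 1-based position, as numbered in the talk."""
    return residues[position - 1]


def imino_fraction(residues):
    """Fraction of residues that are proline or hydroxyproline."""
    return sum(r in ("Pro", "Hyp") for r in residues) / len(residues)
```

Both chains are 24 residues long with glycine at every third position, so one third of each chain is Gly, in line with the 33% figure quoted in the Introduction.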
Collagen - an inside out protein
The side chain hydroxyl groups of serine and hydroxyproline, the carboxyl group of aspartic acid, the amino group of lysine and the amide group of asparagine are potential interacting sites for formation of hydrogen bonds with polyphenols
Energy minimized structures of polyphenols
Vegetable Tannins
Catechin
Gallic acid
Epigallocatechin gallate
Pentagalloylglucose
Energy minimized structure of 24-mer collagen triple helix
Complex between asparagine of the triple helix and gallic acid
Complex between aspartic acid of the triple helix and catechin
Complex between lysine of the triple helix and epigallocatechin gallate
Complex between asparagine of the triple helix and pentagalloylglucose
Binding energy (kcal/mol)
Binding site in triple helix        Gallic acid (Gal)   Catechin (Cat)   Epigallocatechin gallate (EGCG)   Pentagalloylglucose (PGG)
9th residue Ser of C-chain (α2)     16.5                22.5             35.2                              56.6
6th residue Hyp of A-chain (α1)     14.5                20.8             34.5                              48.4
12th residue Lys of B-chain (α1)    19.2                23.8             37.9                              41.1
21st residue Asp of A-chain (α1)    18.4                20.0             38.2                              59.8
17th residue Asn of C-chain (α2)    14.1                23.7             34.3                              52.8
Binding energies of different complexes between polyphenols and the triple helix
Interaction site                    Ligand   H-bond                          Bond dist (Å)   Bond angle (°)
9th residue Ser of C-chain (α2)     Gal      SerC9-(Cα)-C-O…H(3)O-Gal        3.02            141
                                    PGG      HypC15-(Cα)-O-H…O(19)-PGG       2.84            177
                                    PGG      AsnA17-N-H…O(19)-PGG            3.17            156
                                    PGG      AlaA15C=O…H(20)-PGG             2.76            156
                                    PGG      SerC9-(Cα)-C-O…H(10)O-PGG       2.93            121
6th residue Hyp of A-chain (α1)     Gal      AspB21C=O…H(3)O-Gal             2.97            138
                                    PGG      GluB2-(Cα)-C-O-H…O(15)-PGG      3.04            163
                                    PGG      GluB2-(Cα)-C-O…H(24)-PGG        2.88            139
                                    PGG      HypB3C=O…H(23)-PGG              2.96            131
                                    PGG      HypB3C=O…H(18)-PGG              3.08            157
                                    PGG      LeuC5-N-H…O(12)-PGG             3.17            149
12th residue Lys of B-chain (α1)    Gal      LysB12-(Cα)N-H…O(2)-Gal         3.28            126
                                    PGG      HypA18-(Cα)-O-H…O(3)-PGG        3.09            122
                                    PGG      AsnB17-(Cα)-N-H…O(2)O-PGG       3.12            141
21st residue Asp of A-chain (α1)    Gal      AspB21-(Cα)-O-H…O(3)-Gal        2.89            128
                                    Gal      HypB18-C=O…H(6)O-Gal            2.91            147
                                    PGG      AspA21-N-H…O(9)-PGG             2.96            164
                                    PGG      GlyA19C=O…H(13)O-PGG            2.83            174
17th residue Asn of C-chain (α2)    Gal      AsnA17-(Cα)-C=O…H(3)O-Gal       2.84            151
                                    PGG      AsnB17-(Cα)-C=O…H(8)O-PGG       3.27            159
                                    PGG      HypA18C=O…H(4)O-PGG             2.9             140
                                    PGG      GlyA16N…H(3)O-PGG               3.42            161
Hydrogen bonds of complexes; their length and angle
Interaction site                    Ligand   H-bond                          Bond dist (Å)   Bond angle (°)
9th residue Ser of C-chain (α2)     Cat      SerC9-(Cα)-O-H…O(2)-Cat         3.04            161
                                    EGCG     LysA12C=O…H(9)O-EGCG            2.82            148
                                    EGCG     SerC9-C=O…H(3)O-EGCG            2.79            132
6th residue Hyp of A-chain (α1)     Cat      HypA6-(Cα)-O-H…O(1)-Cat         3.02            127
                                    Cat      AlaB9-N…H(12)O-Cat              3.18            137
                                    EGCG     ProB8-N…H(13)O-EGCG             3.25            142
12th residue Lys of B-chain (α1)    Cat      LysB12C=O…H(11)O-Cat            3.1             126
                                    EGCG     LysB12-(Cα)-N-H…O(2)-EGCG       3.41            150
21st residue Asp of A-chain (α1)    Cat      AspA21-(Cα)-C-O-H…O(4)-Cat      3.08            150
                                    Cat      AlaB20-NH…O(2)-Cat              3.22            133
                                    Cat      GlnA23-(Cα)-N-H…O(6)-Cat        3.24            164
                                    EGCG     AspA21-N-H…O(2)-EGCG            3.3             147
                                    EGCG     GlnA23-(Cα)-N-H…O(6)-EGCG       3.26            146
17th residue Asn of C-chain (α2)    Cat      GlyA16C=O…H(14)O-Cat            3.00            146
                                    Cat      AsnA17-(Cα)-C=O…H(11)O-Cat      2.92            151
                                    EGCG     GlyA16C=O…H(12)O-EGCG           2.99            140
                                    EGCG     HypA18C=O…H(9)O-EGCG            2.82            162
                                    EGCG     HypA18C=O…H(3)O-EGCG            2.9             156
                                    EGCG     AlaA20-N-H…O(2)-EGCG            3.13            143
Total and contact surface areas of the collagen-like triple helix and polyphenols in Å²
       Collagen (24-mer)   Cat    EGCG   PGG    Gal
CSA    1164                120    163    275    84
TSA    3825                268    382    688    160
CSA = Contact Surface Area
TSA = Total Surface Area
       Gal         Cat         EGCG        PGG
       AT    BT    AT    BT    AT    BT    AT    BT
Ser    92    61    110   78    219   124   462   205
Hyp    85    65    120   75    248   115   421   197
Lys    151   82    176   94    279   135   357   189
Asp    102   71    112   76    186   115   514   238
Asn    84    69    124   86    214   125   368   196
AT = Solvent inaccessible Total Surface Area
BT = Solvent inaccessible Contact Surface Area
TSA of the complexes are in the range of 3840 to 4160
CSA of the complexes are in the range of 1160 to 1250
Solvent inaccessible surface areas of the complexes in Å²
Plot of interfacial interacting volume Vs Binding energy of the complex
Interacting Interfacial Volume (Å3)
Plot of effective solvent inaccessible contact volume Vs Binding energy of the complex (inset): Plot of effective solvent inaccessible contact surface area Vs Binding energy of the complex
Plot of inverse of interacting interfacial volume (1/Int.Vol.) Vs inverse of binding energy(1/B.E) of the complexes
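The interfacial volumes behind these plots are not tabulated in the talk, but the solvent inaccessible contact surface areas (BT) and the binding energies are. As a rough check of the trend the plots describe, the sketch below computes the Pearson correlation between BT and binding energy over all twenty complexes. The choice of BT as the geometric descriptor and the helper function are assumptions made for illustration.

```python
import math

# Columns in each row: Gal, Cat, EGCG, PGG.
BINDING_ENERGY = {  # kcal/mol
    "Ser": [16.5, 22.5, 35.2, 56.6],
    "Hyp": [14.5, 20.8, 34.5, 48.4],
    "Lys": [19.2, 23.8, 37.9, 41.1],
    "Asp": [18.4, 20.0, 38.2, 59.8],
    "Asn": [14.1, 23.7, 34.3, 52.8],
}
CONTACT_AREA_BT = {  # solvent inaccessible contact surface area, square angstrom
    "Ser": [61, 78, 124, 205],
    "Hyp": [65, 75, 115, 197],
    "Lys": [82, 94, 135, 189],
    "Asp": [71, 76, 115, 238],
    "Asn": [69, 86, 125, 196],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


areas = [a for site in BINDING_ENERGY for a in CONTACT_AREA_BT[site]]
energies = [e for site in BINDING_ENERGY for e in BINDING_ENERGY[site]]
r = pearson(areas, energies)
```

A coefficient close to 1 would support the reading that larger buried contact areas go with stronger binding, which is the geometric argument developed in the closing slide.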
Ligation phenomena in collagen are influenced by geometric parameters
For collagen complexation with small polyphenolic molecules, there may exist minimum geometrical sizes and binding energies needed to influence the long range ordering processes
The ability of polyphenols bearing a flavonoid structure to act in the management of arthritis and in tanning may well result from their ability to reduce the accessibility of solvent (water) to the molecular surfaces of collagen
The present investigation offers the possibility of further understanding the recognition phenomena associated with protein-protein and DNA-protein interactions in general, based on interfacial volume and surface areas