Roadmap
The topics:
  basic concepts of molecular biology
  more on Perl
  overview of the field
  biological databases and database searching
  sequence alignments
  phylogenetics
  structure prediction
  microarray data analysis
Protein Synthesis
(Figure: The National Health Museum)
Proteins
Proteins perform a vast array of biological functions, including:
  Transport: hemoglobin (carries O2 from the lungs to the tissues)
  Mechanical support: collagen
  Storage: ferritin (stores iron)
  Regulation: repressor proteins (gene expression)
  Antibodies: immunoglobulin
  Catalysis: SOD (superoxide dismutase)
  …
Misfolding: mad cow disease, Alzheimer's disease, …
Amino acid composition
Basic amino acid structure: the side chain, R, varies for each of the 20 amino acids.
[Figure: a central carbon bonded to an amino group, a carboxyl group, a hydrogen, and the side chain R]
The Peptide Bond
  Dehydration synthesis
  Polypeptide with repeating backbone: N–C–C–N–C–C
Side chain properties
What gives amino acids their different properties?
  Carbon does not readily form hydrogen bonds with water: hydrophobic
  O and N are generally more likely than C to hydrogen-bond to water: hydrophilic
The amino acids form three general groups:
  Hydrophobic
  Polar
  Charged (positive/basic and negative/acidic)
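The three-group scheme can be sketched as a lookup over one-letter codes. This is a minimal illustration, not the grouping from the slides' figures: the membership of each set below is an assumption (textbooks place glycine, cysteine, tyrosine, and histidine differently), and the names `GROUPS` and `classify_residues` are hypothetical.

```python
# Assumed grouping for illustration only; textbooks differ on G, C, Y, and H.
GROUPS = {
    "hydrophobic": set("AVLIMFWPG"),
    "polar": set("STCYNQ"),
    "positive": set("KRH"),   # charged, basic
    "negative": set("DE"),    # charged, acidic
}

def classify_residues(sequence):
    """Count residues per side-chain group for a one-letter-code sequence."""
    counts = {name: 0 for name in GROUPS}
    counts["unknown"] = 0
    for residue in sequence.upper():
        for name, members in GROUPS.items():
            if residue in members:
                counts[name] += 1
                break
        else:
            # Letter is not one of the 20 standard amino acids.
            counts["unknown"] += 1
    return counts

print(classify_residues("KDSL"))
```

Swapping in a different textbook's grouping only requires editing the sets in `GROUPS`.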
The Hydrophobic Amino Acids
  Proline severely limits allowable conformations!
The Charged Amino Acids
  Krane & Raymer
The Polar Amino Acids
  Krane & Raymer
More Polar Amino Acids
Peptidyl polymers
  A few amino acids in a chain are called a polypeptide.
  A protein is usually composed of 50 to 400+ amino acids.
Primary & Secondary Structure
Primary structure = the linear sequence of amino acids comprising a protein:
  AGVGTVPMTAYGNDIQYYGQVT…
Secondary structure
  Regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every known protein structure: the α-helix and the β-sheet
  The location and direction of these periodic, repeating structures is known as the secondary structure of the protein
Levels of Protein Structure
  Secondary structure elements combine to form tertiary structure
  Quaternary structure occurs in multi-enzyme complexes
    Many proteins are active only as homodimers, homotetramers, etc.
Dihedral angles
α-Helix
  Most abundant secondary structure
  3.6 amino acids per turn
  Hydrogen bond formed between every fourth residue
  Average length: 10 amino acids, or 3 turns
  Varies from 5 to 40 amino acids
α-Helix (continued)
  Normally found on the surface of protein cores
  Interact with aqueous environment
    Inner-facing side has hydrophobic amino acids
    Outer-facing side has hydrophilic amino acids
  Every third to fourth amino acid tends to be hydrophobic
  Pattern can be detected computationally
  Rich in alanine (A), glutamic acid (E), leucine (L), and methionine (M)
  Poor in proline (P), glycine (G), tyrosine (Y), and serine (S)
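One way the hydrophobic periodicity of a helix might be detected is a hydrophobic-moment style score. Each residue is placed 100 degrees further around the helix axis than the previous one (360 degrees / 3.6 residues per turn), and the hydrophobic and hydrophilic residues are summed as vectors. The sketch below is only an illustration under stated assumptions: the `HYDROPHOBIC` set and the +1/-1 scoring are simplifications chosen here, not a published scale, and `helical_moment` is a hypothetical name.

```python
import math

# Assumed hydrophobic set (illustrative; not taken from the slides).
HYDROPHOBIC = set("AVLIMFWPG")

def helical_moment(sequence, step_deg=100.0):
    """Length-normalized vector sum of +1 (hydrophobic) / -1 (other) scores,
    with residue k placed at angle k * step_deg around the helix axis."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    x = y = 0.0
    for k, residue in enumerate(sequence.upper()):
        score = 1.0 if residue in HYDROPHOBIC else -1.0
        angle = math.radians(k * step_deg)
        x += score * math.cos(angle)
        y += score * math.sin(angle)
    return math.hypot(x, y) / len(sequence)

# A uniformly hydrophobic stretch has no preferred face, so its moment is near 0;
# a stretch with one hydrophobic face scores much higher.
print(helical_moment("L" * 18))
print(helical_moment("LSSLLSSLLSLLSSLLSS"))
```

A high score along a sliding window suggests one hydrophobic face and one hydrophilic face, as expected for a surface helix; a real predictor would use a graded hydrophobicity scale rather than +1/-1.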
β-Sheet
  Hydrogen bonds between 5-10 consecutive amino acids in one portion of the chain and another 5-10 farther down the chain
  Interacting regions may be adjacent with a short loop, or far apart with other structures in between
  Directions:
    Same: parallel sheet
    Opposite: anti-parallel sheet
    Mixed: mixed sheet
  Alpha carbons (and R side groups) alternate above and below the sheet
  Prediction difficult, due to wide range of φ and ψ angles
Ramachandran Plot (alpha)
Ramachandran Plot (beta)
Ramachandran Plot
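A Ramachandran plot places each residue by its two backbone dihedral angles, φ (atoms C, N, Cα, C) and ψ (atoms N, Cα, C, N). As a rough sketch of how one such angle could be computed from four atom positions, the function below uses the standard atan2 formulation of a dihedral; the name `dihedral` and the toy coordinates are illustrative and do not come from any real structure.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)          # normal of the plane through p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Toy coordinates: the outer atoms are a quarter turn apart about the z axis.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For φ, the four points would be C of the previous residue, then N, Cα, and C of the current residue; for ψ, they would be N, Cα, and C of the current residue, then N of the next.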
Helices and Sheets
Loop
  Regions between α-helices and β-sheets
  Various lengths and three-dimensional configurations
  Located on the surface of the structure
  Hairpin loops: complete turn in the polypeptide chain (anti-parallel β-sheets)
  More variable sequence structure
  Tend to have charged and polar amino acids
Coil
  Region of secondary structure that is not a helix, sheet, or loop
Determining Protein Structure
There are on the order of 100,000 distinct proteins in the human proteome.
Two methods for revealing positions of atoms in 3-D:
  X-Ray Crystallography
    X-ray diffraction pattern + mathematical construction
    Good protein crystal needed, good resolution of diffraction needed
  Nuclear Magnetic Resonance
    Small proteins only (< 250 residues)
    Inter-proton distances + geometric constraints
Bovine Ribonuclease
  Christian Anfinsen, 1957.
Disulfide Bonds
  Two cysteines in close proximity can form a covalent bond
  Disulfide bond, disulfide bridge, or dicysteine bond.
  Significantly stabilizes tertiary structure.
"Principles that govern the folding of protein chains", Christian Anfinsen, Science 1973
Ribonuclease
Disulfide Bonds
# of cysteines   # of S-S bonds   # of combinations
 4               2                3
 6               3                15
 8               4                105
10               5                945
12               6                10395
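The combination counts in the table match the number of ways to split 2n cysteines into n pairs, which is the product (2n-1) × (2n-3) × … × 3 × 1. The sketch below reproduces the table under that assumption, namely that every cysteine is paired and no cysteine is left free; `disulfide_pairings` is a hypothetical name.

```python
def disulfide_pairings(n_cysteines):
    """Number of ways to pair up all cysteines into disulfide bonds.

    Assumes every cysteine takes part in exactly one S-S bond.
    """
    if n_cysteines < 0 or n_cysteines % 2 != 0:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    # The first cysteine can choose among (n - 1) partners, the next
    # unpaired cysteine among (n - 3), and so on.
    for choices in range(n_cysteines - 1, 0, -2):
        count *= choices
    return count

for n in (4, 6, 8, 10, 12):
    print(n, n // 2, disulfide_pairings(n))
```

With 8 cysteines, only one of the 105 possible pairings is the native one, which is what makes spontaneous refolding to the correct disulfide pattern striking.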
Levinthal's paradox
How do proteins find the right conformation out of the seemingly endless number of potential three-dimensional forms that they could randomly fold into?
Consider a 100-residue protein.
  If each residue can take only 3 positions, there are 3^100 ≈ 5 × 10^47 possible conformations.
  If it takes 10^-13 s to convert from one structure to another, an exhaustive search would take about 1.6 × 10^27 years!
Current Opinion in Structural Biology, 2004, 14, 70-75
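The Levinthal estimate is short enough to check directly. The sketch below uses the figures from the slide (100 residues, 3 positions each, 10^-13 s per conversion); the 365.25-day year and the variable names are assumptions made here.

```python
RESIDUES = 100
POSITIONS_PER_RESIDUE = 3
SECONDS_PER_STEP = 1e-13
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed length of a year

conformations = POSITIONS_PER_RESIDUE ** RESIDUES       # exact integer
search_seconds = conformations * SECONDS_PER_STEP
search_years = search_seconds / SECONDS_PER_YEAR

print(f"conformations: {conformations:.1e}")
print(f"exhaustive search: {search_years:.1e} years")
```

Both values round to the numbers on the slide: about 5 × 10^47 conformations and about 1.6 × 10^27 years.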
What determines fold?
Anfinsen's experiments in 1957 demonstrated that proteins can fold spontaneously into their native conformations under physiological conditions. This implies that primary structure does indeed determine folding, or 3-D structure.
Exceptions exist
  Chaperone proteins assist folding
  Abnormally folded prion proteins can catalyze misfolding of normal prion proteins, which then aggregate
Other factors
Physical properties of a protein that influence stability and therefore determine its fold:
  Rigidity of backbone
  Amino acid interaction with water
    Hydropathy index for side chains
  Interactions among amino acids
    Electrostatic interactions
    Hydrogen, disulphide bonds
    Volume constraints
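A hydropathy index assigns each side chain a number, positive for hydrophobic and negative for hydrophilic. The slides do not say which scale they have in mind; the sketch below assumes the widely used Kyte-Doolittle scale and shows a whole-sequence average and a sliding-window profile. The function names and the default window size are illustrative choices.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the slides name none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(sequence):
    """Average hydropathy over the whole sequence (raises KeyError on
    letters outside the 20 standard amino acids)."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return sum(scores) / len(scores)

def windowed_hydropathy(sequence, window=7):
    """Average hydropathy of each consecutive window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

fragment = "AGVGTVPMTAYGNDIQYYGQVT"
print(round(mean_hydropathy(fragment), 2))
print([round(v, 2) for v in windowed_hydropathy(fragment)])
```

Stretches where the windowed value stays high are candidates for buried or membrane-spanning segments; stretches where it stays low are more likely to be exposed to water.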
Understand protein folding
  Structure: Given a sequence, what tertiary structure does it adopt?
    Global optimization, Monte Carlo, molecular dynamics, coarse-grained dynamics, etc.
  Thermodynamics: Under mutation, does the free energy of the native state change relative to the native sequence?
    MC, MD, free energy methods, etc.
  Kinetics: How fast does the protein fold? Does a different sequence fold faster, and why?
    Lattice Monte Carlo, molecular dynamics, coarse-grained dynamics
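Of the methods listed, Monte Carlo is the simplest to sketch. The core is the Metropolis rule: a trial move that lowers the energy is always accepted, and one that raises it by ΔE is accepted with probability exp(-ΔE / kT). The example below applies that rule to a toy one-dimensional double-well energy. This stands in for a real protein energy function, which the slides do not specify; the `energy` function, the step size, and all names are hypothetical.

```python
import math
import random

def metropolis_accept(delta_e, kT, u):
    """Metropolis criterion; u is a uniform random number in [0, 1)."""
    if delta_e <= 0:
        return True                      # downhill moves are always accepted
    return u < math.exp(-delta_e / kT)   # uphill moves are accepted sometimes

def energy(x):
    """Toy double-well landscape with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2

def monte_carlo_search(steps, kT, start=2.0, step_size=0.5, seed=0):
    """Random-walk Metropolis search; returns the lowest-energy x visited."""
    rng = random.Random(seed)
    x = best = start
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)
        if metropolis_accept(energy(trial) - energy(x), kT, rng.random()):
            x = trial
            if energy(x) < energy(best):
                best = x
    return best

best_x = monte_carlo_search(steps=2000, kT=0.1)
print(best_x, energy(best_x))
```

In a folding study, `x` would be replaced by a full conformation, `energy` by a force field or lattice contact energy, and the trial move by a small change to the chain; the acceptance rule stays the same.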
CASP changed the landscape
Critical Assessment of Structure Prediction competition. Even-numbered years since 1994
  Solved but unpublished structures are posted in May, predictions due in September
  Various categories
    Relation to existing structures, ab initio, homology, fold, etc.
    Partial vs. fully automated approaches
  Produces lots of information about what aspects of the problems are hard, and ends arguments about test sets.
Results show steady improvement, and the value of integrative approaches.