Biochemistry, part 2
1 Introduction
2 Theoretical background: Biochemistry/molecular biology
3 Theoretical background: computer science
4 History of the field
5 Splicing systems
6 P systems
7 Hairpins
8 Micro technology introductions: Microreactors / Chips
9 Microchips and fluidics
10 Self assembly
11 Regulatory networks
12 Molecular motors
13 DNA nanowires
14 Protein computers
15 DNA computing - summary
Course outline
DNA folding
Many DNA molecules are circular (e.g.,
bacterial chromosomes, all plasmid DNA).
Circular DNA can form supercoils. The human
genome contains about 3 × 10^9 base pairs;
chromosomal DNA is wrapped around proteins to
form nucleosomes. Nucleosomes are packed
tightly to form a helical filament, a
structure called chromatin.
RNA molecules are much shorter but more
diverse. They can form various
three-dimensional structures.
DNA folding
Supercoils refer to the DNA structure in which
double-stranded circular DNA twists around
itself. Supercoiled DNA contrasts with relaxed
DNA;
In DNA replication, the two strands of DNA
have to be separated, which leads either to
overwinding of surrounding regions of DNA or
to supercoiling;
A specialized set of enzymes (gyrase,
topoisomerases) is present to introduce
supercoils that favor strand separation;
The degree of supercoils can be quantitatively
described.
Tertiary structure in DNA
Varieties of supercoiled DNA
The linking number L of DNA, a topological
property, determines the degree of
supercoiling;
The linking number defines the number of times
a strand of DNA winds in the right-handed
direction around the helix axis when the axis
is constrained to lie in a plane;
If both strands are covalently intact, the
linking number cannot change;
For instance, in a relaxed circular DNA of 5400
base pairs, the linking number is 5400/10 = 540,
where 10 is the number of base pairs per turn
for B-form DNA.
Linking number
Twist T is a measure of the helical winding of the
DNA strands around each other. Given that DNA prefers
to form a B-type helix, the preferred twist = number of
base pairs / 10;
Writhe W is a measure of the coiling of the axis of
the double helix. A right-handed coil is assigned a
negative number (negative supercoiling) and a
left-handed coil is assigned a positive number
(positive supercoiling).
Topology theory tells us that the sum of T and W
equals the linking number: L = T + W
For example, in the circular DNA of 5400 basepairs,
the linking number is 5400/10=540
With no supercoiling, W = 0 and T = L = 540;
With positive supercoiling of W = +20, T = L - W = 520;
The twist and writhe
The relation between L, T and W
Positive supercoiling
The relation between L, T and W
Negative supercoiling
A relaxed circular, double-stranded DNA (1600
bp) is in a solution where conditions favor
10 bp per turn. What are L, T, and W?
During replication, part of this DNA unwinds
(200 bp) while the rest of the DNA still
favors 10 bp per turn. What are the new L, T,
and W?
Relaxed DNA (1600 bp): L = 1600/10 = 160; W = 0 (relaxed); T = L - W = 160
After unwinding (1400 bp wound, 200 bp unwound): L = 160; T = (1600 - 200)/10 = 140; W = L - T = +20
L, T and W calculation
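The calculation above can be sketched in a few lines of Python. This is only an illustration of the relation L = T + W under the slide's assumptions (closed circular DNA, 10 bp per turn, L unchanged while both strands stay intact); the function name `supercoil_state` and its parameters are hypothetical and not part of the course material.

```python
def supercoil_state(total_bp, unwound_bp=0, bp_per_turn=10, linking_number=None):
    """Return (L, T, W) for a closed circular DNA.

    Assumptions (illustrative only): the wound part of the DNA keeps
    bp_per_turn base pairs per helical turn, the unwound part contributes
    no twist, and L = T + W.
    """
    if linking_number is None:
        # Relaxed reference state: W = 0, so L equals the preferred twist.
        linking_number = total_bp / bp_per_turn
    twist = (total_bp - unwound_bp) / bp_per_turn
    writhe = linking_number - twist
    return linking_number, twist, writhe


# Relaxed 1600 bp circle.
relaxed = supercoil_state(1600)
# Same circle with 200 bp unwound; L is fixed because the strands are intact.
replicating = supercoil_state(1600, unwound_bp=200, linking_number=relaxed[0])

print(relaxed)      # (160.0, 160.0, 0.0)
print(replicating)  # (160.0, 140.0, 20.0)
```

Passing the relaxed linking number into the second call expresses the constraint that L cannot change while both strands are covalently intact, so the lost twist must reappear as writhe.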
Nucleosomes look like “beads on a string”
under the microscope. Each bead contains two
copies of each of four histone proteins, H2A,
H2B, H3, and H4 (an octamer). The string is
double-stranded DNA;
The surface of the octamer contains features
that guide the course of the DNA such that it
wraps 1.65 turns around the octamer in a
left-handed conformation. H1 proteins seal the
ends of the DNA and connect consecutive
nucleosomes.
Nucleosomes
Organisation of chromosomes
DNA double helix: diameter 2 nm; 10 base pairs per turn; packing ratio 1
‘Beads on a string’ chromatin form: diameter 11 nm; 80 base pairs per turn; packing ratio 6-7
Solenoid (6 nucleosomes per turn): diameter 30 nm; 1200 base pairs per turn; packing ratio ~40
Loops (50 turns per loop): diameter 0.25 μm; 60,000 base pairs per turn; packing ratio 680
Miniband (18 loops): diameter 0.84 μm; 1.1 × 10^6 base pairs per turn; packing ratio 1.2 × 10^4
Chromosome (stacked minibands): diameter 0.84 μm
proteins
4 possible bases (A, C, G, U)
3 bases in the codon
4 x 4 x 4 = 64 possible codon sequences
Start codon: AUG
Stop codons: UAA, UAG, UGA
61 codons code for amino acids (including AUG)
20 amino acids: redundancy in the genetic code
Genetic code
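The codon arithmetic can be checked by enumerating every triplet directly. The sketch below is a small illustration rather than part of the course material; the variable names are chosen here for readability.

```python
from itertools import product

BASES = "ACGU"
START_CODON = "AUG"
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Every ordered triplet of the four RNA bases.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Codons that specify an amino acid (everything except the stop codons).
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons))                  # 64
print(len(sense_codons))            # 61
print(START_CODON in sense_codons)  # True
```

Since 61 codons map onto only 20 amino acids, most amino acids must be encoded by more than one codon, which is the redundancy the slide refers to.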
Building blocks for proteins (20 different); they vary by side chain groups
Hydrophilic amino acids are water soluble; hydrophobic ones are not
Linked via a single chemical bond (peptide bond)
Peptide: short linear chain of amino acids (< 30)
Polypeptide: long chain of amino acids (which can be upwards of 4000 residues long)
Amino acids
Glycine (G, GLY)
Alanine (A, ALA)
Valine (V, VAL)
Leucine (L, LEU)
Isoleucine (I, ILE)
Phenylalanine (F, PHE)
Proline (P, PRO)
Serine (S, SER)
Threonine (T, THR)
Cysteine (C, CYS)
Methionine (M, MET)
Tryptophan (W, TRP)
Tyrosine (Y, TYR)
Asparagine (N, ASN)
Glutamine (Q, GLN)
Aspartic acid (D, ASP)
Glutamic acid (E, GLU)
Lysine (K, LYS)
Arginine (R, ARG)
Histidine (H, HIS)
START: AUG
STOP: UAA, UAG, UGA
20 amino acids
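As a small illustration of how the one-letter and three-letter codes relate, the sketch below stores the 20 amino acids in a lookup table and converts a peptide between the two notations. It uses the standard one-letter code, in which tyrosine is Y (T is threonine). The function name `to_three_letter` and the example peptide "MKWV" are hypothetical.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "G": "GLY", "A": "ALA", "V": "VAL", "L": "LEU", "I": "ILE",
    "F": "PHE", "P": "PRO", "S": "SER", "T": "THR", "C": "CYS",
    "M": "MET", "W": "TRP", "Y": "TYR", "N": "ASN", "Q": "GLN",
    "D": "ASP", "E": "GLU", "K": "LYS", "R": "ARG", "H": "HIS",
}


def to_three_letter(sequence):
    """Convert a one-letter peptide string to hyphen-joined three-letter codes."""
    return "-".join(THREE_LETTER[residue] for residue in sequence.upper())


print(len(THREE_LETTER))        # 20
print(to_three_letter("MKWV"))  # MET-LYS-TRP-VAL
```

A letter outside the table raises a `KeyError`, which is a reasonable behaviour for a sketch: it flags a sequence that contains something other than the 20 standard residues.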
The basic amino acid
[Figure: peptide bond formation. Two amino acids are joined by removal of a water molecule and formation of a CO-NH bond; the resulting chain has an amino end and a carboxyl end.]
Peptide bond
Polypeptide
There are four basic levels of structure in protein architecture
Protein structure
Primary: sequence of amino acids constituting the
polypeptide chain
Secondary: local organization into secondary
structures such as helices and sheets
Tertiary: three-dimensional arrangement of the
amino acids as they react to one another due to
the polarity and resulting interactions between
their side chains
Quaternary: number and relative positions of the
protein subunits
Protein structure
Primary structure: amino acid sequence (from amino end to carboxyl end)
Secondary structure: α-helix and β-sheet (β-sheets can be antiparallel or parallel)
Tertiary structure: spatial arrangement of amino acid residues
Quaternary structure: spatial arrangement of subunits
[Figure: overview of primary, secondary, tertiary and quaternary structure.]
Every function in the living cell depends on proteins.
Motion and locomotion of cells and organisms depend on
contractile proteins. [Example: Muscles]
The catalysis of all biochemical reactions is done by enzymes,
which contain protein.
The structure of cells, and the extracellular matrix in which they
are embedded, is largely made of protein. [Example: Collagens]
Defence is provided by antibodies, which are proteins.
The receptors for hormones and other signalling molecules are
proteins.
The transcription factors that turn genes on and off to guide the
differentiation of the cell and its later responsiveness to
signals reaching it are proteins.
And many more: proteins are truly the physical basis of life.
Protein function
Protein function: antibody
Protein function: enzyme
Gene expression
Bacteria express only a subset of their genes at
any given time.
Constitutive expression of all genes in
bacteria would be energetically inefficient.
The genes that are expressed are essential
for dealing with the current environmental
conditions, such as the type of available
food source.
Gene regulation mechanism
Regulation of gene expression can occur at
several levels:
Transcriptional regulation: control of whether mRNA is made.
Translational regulation: control of whether
or how fast an mRNA is translated.
Post-translational regulation: a protein is
made in an inactive form and later is
activated.
Gene regulation mechanism
[Figure: levels of gene regulation. Panels: transcriptional control, translational control, post-translational control. Labels: DNA, RNA polymerase, onset of transcription, mRNA, lifespan of mRNA, ribosome, translation rate, protein, protein activation (by chemical modification), feedback inhibition (protein inhibits transcription of its own gene).]
Gene regulation mechanism
Escherichia coli
Operon
A controllable unit of transcription
consisting of a number of structural
genes transcribed together. Contains at
least two distinct regions: the operator
and the promoter.
Gene regulation mechanism
Case study of the regulation of the lactose
operon in E. coli
E. coli utilizes glucose if it is available,
but can metabolize other sugars if glucose is
absent.
Gene regulation mechanism
[Figure: relative density of cells versus time (hours) for food sources with glucose : lactose ratios of 1:3, 1:1 and 3:1. Each curve shows an initial period of rapid growth with glucose as food source, followed by a second period of rapid growth with lactose as food source.]
Gene regulation mechanism
Case study of the regulation of the lactose
operon in E. coli
Genes that encode enzymes needed to break
other sugars down are negatively regulated.
Example: enzymes required to metabolize
lactose are only synthesized if glucose is
depleted and lactose is available.
In the absence of lactose, transcription
of the genes that encode these enzymes is
repressed. How does this occur?
Gene regulation mechanism
Case study of the regulation of the lactose operon in E. coli
All the loci required for lactose metabolism are grouped together into an operon.
The lacZ locus encodes the β-galactosidase enzyme, which breaks down lactose.
The lacY locus encodes galactoside permease, a transport protein for lactose.
The function of the lacA locus is unknown.
The lacI locus encodes a repressor that blocks transcription of the lac operon.
Gene regulation mechanism
[Figure: section of the E. coli chromosome showing lacI, lacZ and lacY.
lacI encodes the regulatory protein LacI (regulatory function).
lacZ encodes β-galactosidase (LacZ), which cleaves lactose to glucose and galactose.
lacY encodes galactoside permease (LacY), a membrane transport protein that imports lactose.]
Observations about regulation of lacZ and lacY:
(1) LacI protein and glucose shut down transcription of lacZ and lacY
(2) Lactose induces transcription of lacZ and lacY
Gene regulation mechanism
[Figure: layout of the lac operon, in order: lacI promoter, lacI, promoter, operator, lacZ, lacY, lacA.]
Gene regulation: lac operon
Repression and induction of the lactose operon.
The lac operon is under negative regulation,
i.e., transcription is normally repressed.
Glucose represses transcription of the lac
operon.
Glucose inhibits cAMP synthesis in the
cells.
At low cAMP levels, no cAMP is available
to bind CAP.
Unless CAP is bound to the CAP site in
the promoter, no transcription occurs.
Gene regulation mechanism
[Figure: lacI produces a functional repressor that binds the operator (binding site for repressor) upstream of lacZ and lacY. RNA polymerase is blocked: NO TRANSCRIPTION.]
When no lactose is present, the repressor binds to DNA and blocks transcription.
Gene regulation mechanism
[Figure: lactose present. Labels: lactose, lacI, lacZ, lacY, repressor, mRNA, β-galactosidase, permease. TRANSCRIPTION BEGINS.]
Repressor plus lactose (an inducer) present. Transcription proceeds.
Gene regulation mechanism
[Figure: lacI promoter, lacI, promoter, operator, lacZ, lacY, lacA. RNA polymerase binds to the promoter and produces a single "polycistronic" mRNA carrying the lacZ, lacY and lacA messages.]
Operons produce mRNAs that code for functionally related proteins.
Gene regulation mechanism
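The two controls on the lac operon described in these slides (glucose acting through cAMP and CAP, and lactose acting through the repressor) can be summarised as a simple truth table. The sketch below is a deliberately crude Boolean model under the slides' own simplifications: it treats each signal as all-or-nothing and ignores basal transcription. The function name `lac_operon_transcribed` is hypothetical.

```python
def lac_operon_transcribed(glucose_present, lactose_present):
    """Boolean sketch of lac operon control (all-or-nothing simplification)."""
    # Glucose inhibits cAMP synthesis, so cAMP is available only without glucose.
    camp_available = not glucose_present
    # CAP binds its site in the promoter only when cAMP is available.
    cap_bound = camp_available
    # Without lactose (the inducer), the repressor sits on the operator.
    repressor_on_operator = not lactose_present
    # Transcription needs CAP bound and the operator free.
    return cap_bound and not repressor_on_operator


for glucose in (True, False):
    for lactose in (True, False):
        state = "ON" if lac_operon_transcribed(glucose, lactose) else "OFF"
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> lac operon {state}")
```

In this model the operon is on only when glucose is absent and lactose is present, which matches the earlier statement that the enzymes for lactose metabolism are synthesized only if glucose is depleted and lactose is available.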
DNA binding sites
Proteins that bind to DNA share similarity in
the structure of their DNA-binding regions.
Many DNA binding proteins, such as lac
repressor, have a helix-turn-helix motif which
fits into the major groove of a DNA molecule
DNA binding proteins
Binding of an inducer to the lac repressor
causes it to release the operator DNA because it
alters the conformation of the helix-turn-helix
motif.
DNA binding proteins
Information about regulation of the expression
of genetic loci may help to combat diseases.
Virulent bacterial strains have genes that
encode the ability to infect and produce
disease.
Knowledge of how the expression of these
genes is controlled and regulated may
provide insights into blocking the
development of the disease.
DNA binding proteins
When tryptophan is absent, transcription occurs.
When tryptophan is present, transcription is blocked.
[Figure labels: RNA polymerase, promoter, leader, 5 coding loci, tryptophan, repressor.]
DNA binding proteins, negative regulation
Ribosomes translate mRNA rapidly when tryptophan is abundant, leading to formation of a stem-and-loop structure that inhibits RNA polymerase and terminates transcription.
DNA binding proteins