Week 1 Lecture 1: Introduction to cells and their contents. Proteins, polypeptide chains made from...


Week 1

•Lecture 1: Introduction to cells and their contents. Proteins, polypeptide chains made from 20 amino acids. Paradigm shift in Molecular Biology from soft (descriptive) to hard (predictive) science.

•Lecture 2: Quantum, classical and stochastic description of biomolecules. Classical molecular dynamics as the standard model of molecular biology.

Cells

Cells are the fundamental structural and functional units in organisms.

They are the subject of Molecular Biology, Biochemistry & Biophysics

with significant overlaps among the disciplines.

This course will focus on computational aspects of cellular biophysics.

Two kinds of cells:

• Prokaryotes (single cells, bacteria, e.g. Escherichia coli)

Size: 1 µm (micrometer), thick cell wall, no nucleus

The first life forms. Simpler molecular structures, hence easier to study.

• Eukaryotes (everything else)

Size: 10 µm, no cell wall (animals), has a nucleus

Organelles: subcompartments that carry out specific tasks

e.g. mitochondria produce ATP (the energy currency) from metabolism,

chloroplasts produce ATP from sunlight.

Structure of a typical cell

Plasma membrane

Background: salt water

Water (70%) is a highly viscous medium. It also has a very high dielectric constant (ε = 80), which screens charges.

Ions (Na⁺, K⁺, Cl⁻, …)

Typical concentration: 0.15 M

Debye length: 8 Å

Mobile ions completely screen charges beyond a few nm.

No directed motion is possible in cells beyond a few nm!
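
The quoted Debye length can be checked with the standard Debye-Hückel expression for a monovalent salt, λ_D = sqrt(ε ε0 kT / (2 N_A e² c)). The following is a minimal sketch of that estimate; the temperature of 298 K and the function name `debye_length` are assumptions, since the slide gives only ε = 80 and c = 0.15 M.

```python
import math

# Physical constants in SI units
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
KB = 1.380649e-23            # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
N_A = 6.02214076e23          # Avogadro's number, 1/mol


def debye_length(conc_molar, eps_r=80.0, temperature=298.0):
    """Debye length in metres for a monovalent (1:1) salt solution.

    temperature=298 K is an assumed value; it is not given on the slide.
    """
    conc_si = conc_molar * 1000.0                      # mol/L -> mol/m^3
    numerator = eps_r * EPS0 * KB * temperature
    denominator = 2.0 * N_A * E_CHARGE ** 2 * conc_si
    return math.sqrt(numerator / denominator)


lambda_d = debye_length(0.15)
print(f"Debye length at 0.15 M: {lambda_d * 1e10:.1f} Angstrom")
```

Under these assumptions the result comes out close to the 8 Å quoted above, and it shrinks as 1/sqrt(c) when the salt concentration is raised.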

Organic molecules

Hydrocarbon chains

(hydrophobic)

Double bonds

Functional groups in organic molecules

Polar groups are hydrophilic. When attached to hydrocarbons,

they modify their behaviour.

Four classes of macromolecules in cells

•Sugars (polysaccharides) :

Functions: provide energy, form the scaffold in DNA and RNA structure, confer stability on proteins after translation.

•Lipids (triglycerides)

Functions: long-term energy storage, lipid bilayer forms the cell membrane

•Proteins (polypeptides)

Functions: main workhorses in cells, perform all the mechanical and chemical

operations, signal transduction, also provide structural elements.

•Nucleic acids (DNA, RNA)

Functions: Carries the genetic code, double-helix splits to replicate, contains

the blueprint for all proteins.

Simple sugars (monosaccharides): e.g. ribose (C5H10O5),

Six-carbon sugars

Glucose is a product of photosynthesis.

Glucose and fructose have the same formula (C6H12O6) but

different structure

Disaccharides are formed when two monosaccharides are chemically

bonded together.

Lipids (fatty acids)

Saturated fatty acids

Unsaturated fatty acid (C=C bonds)

Phospholipids (lipids with a phosphate head group)

Phosphatide:

At neutral pH (7), the oxygens in the OH groups are deprotonated, leading to a negatively charged membrane.

Phosphatidylcholine (PC):

The most common phospholipid

has a choline group attached

….PO4CH2CH2N+(CH3)3

Proteins (polypeptide chains folded into functional forms)

The building blocks of proteins are the 20 amino acids.

Protonation state of an amino acid as a function of pH:

pH < 2: NH₃⁺-C-COOH

2 < pH < 10: NH₃⁺-C-COO⁻

pH > 10: NH₂-C-COO⁻

At normal pH, amino acids are neutral overall but carry +/- charges on the amino/carboxyl groups (zwitterions).

In water

Gas phase

Formation of polypeptides

In water:

NH₃⁺-C-COO⁻ + NH₃⁺-C-COO⁻ → NH₃⁺-C-CO-NH-C-COO⁻ + H₂O

Gas phase reaction: [figure]

Protein structure

α-helix: 3.6 amino acids per turn, r = 2.5 Å, pitch (rise per turn) is 5.4 Å

β-sheet

Nucleic acids are formed from ribose+phosphate+base pairs

The base pairs are A-T and C-G in DNA (A pairs only with T and

C pairs only with G).

In RNA Thymine is substituted by Uracil

Adenosine triphosphate (ATP) has three phosphate groups.

The usual nucleotides have only one phosphate group; for adenine this form is called adenosine monophosphate (AMP).

Another important variant is adenosine diphosphate (ADP).

B-DNA (B helix)

ROM (Read-Only Memory): contains 1.5 Gigabytes of genetic information

Base pairs per turn (3.4 nm): 10

Primary structure of

a single strand of DNA

Primary structure of

a single strand of RNA

Hydrogen bonds

among the base

pairs A-T and C-G

Local structure of DNA

Dynamic and flexible

structure

Bends, twists and knots

Essential for packing 1 m long DNA in a 1 µm long nucleus
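
The B-DNA geometry above (10 base pairs per 3.4 nm turn) is enough to estimate both the contour length and the information content of a genome. The sketch below does the arithmetic; the genome size of 3×10⁹ base pairs per copy is an assumption (an approximate figure for the human genome, not given on the slides), and the function names are illustrative.

```python
BP_PER_TURN = 10          # base pairs per helical turn (from the slide)
TURN_LENGTH_NM = 3.4      # length of one helical turn in nm (from the slide)
N_BASE_PAIRS = 3.0e9      # ASSUMPTION: approximate size of one copy of the human genome
BITS_PER_BASE_PAIR = 2    # four possible bases -> log2(4) = 2 bits


def contour_length_m(n_bp):
    """Stretched-out length of a B-DNA double helix in metres."""
    rise_nm = TURN_LENGTH_NM / BP_PER_TURN   # rise per base pair in nm
    return n_bp * rise_nm * 1e-9


def information_gigabytes(n_bp):
    """Information content in gigabytes (10^9 bytes) at 2 bits per base pair."""
    return n_bp * BITS_PER_BASE_PAIR / 8 / 1e9


print(f"Contour length of one copy: {contour_length_m(N_BASE_PAIRS):.2f} m")
print(f"Information in one copy:    {information_gigabytes(N_BASE_PAIRS):.2f} GB")
print(f"Information in two copies:  {information_gigabytes(2 * N_BASE_PAIRS):.2f} GB")
```

With this assumed genome size, one copy is about a metre long, and the two copies carried by a cell together hold about 1.5 GB.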

Central dogma

Paradigm shift in Molecular Biology

BBC news, July 3, 2014: 99.6% of drug trials for Alzheimer’s disease during the last decade have failed (i.e., only 1 out of 250 succeeded).

This is not specific to Alzheimer’s disease but pervades the whole pharmaceutical industry. All the low-hanging fruit has been picked, and to find novel drugs one has to do more than trial-and-error work. The answer is in rational drug design, which combines experimental work with computational models of drug action.

Biomolecular systems are quite complex and their accurate modelling

requires a great deal of computing power. This has become feasible

in the last decade with the advance of the High Performance

Computing systems based on parallel clusters of PCs, which are

more affordable. Computational work has now become cheaper and

less laborious than performing experiments.

The main barrier in turning Molecular Biology into a hard science like Physics and Chemistry is convincing people that biomolecular systems can be accurately described using computational methods.

Chemical accuracy has been achieved in relatively few examples so far

and much more work (both in applications and methodological

development) needs to be done to complete the paradigm shift.

Scientific method

Quantum, classical and stochastic description

of biomolecular systems

• Quantum mechanics (Schroedinger equation)

The most fundamental approach but feasible only for a few atoms (~10).

Approximate methods (e.g. density functional theory) allow treatment of larger systems (~1000 atoms) and dynamic simulations for several picoseconds.

• Classical mechanics (Newton’s equation of motion)

Most atoms are heavy enough to justify a classical treatment (except H).

The main problem is finding accurate potential functions (force fields).

MD simulation of over 100,000 atoms for microseconds is now feasible.

• Stochastic mechanics (Langevin equation)

Most biological processes occur in the range of microseconds to seconds.

Thus, to describe such processes, a simpler (coarse-grained) representation of the atomic system is essential (e.g. Brownian dynamics).

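Many-body Schroedinger equation for a molecular system

$$\left[H_n + H_e + U_{en}\right]\Psi(\mathbf{r},\mathbf{R}_i) = E\,\Psi(\mathbf{r},\mathbf{R}_i) \qquad (1)$$

where:

nuclear Hamiltonian

$$H_n = -\sum_i \frac{\hbar^2}{2M_i}\nabla_i^2 + \sum_{i<j}\frac{z_i z_j e^2}{|\mathbf{R}_i-\mathbf{R}_j|}$$

electronic Hamiltonian

$$H_e = -\sum_\alpha \frac{\hbar^2}{2m}\nabla_\alpha^2 + \sum_{\alpha<\beta}\frac{e^2}{|\mathbf{r}_\alpha-\mathbf{r}_\beta|}$$

elect-nucl. interaction

$$U_{en} = -\sum_{i,\alpha}\frac{z_i e^2}{|\mathbf{r}_\alpha-\mathbf{R}_i|}$$

Here m and M_i are the masses of the electrons and nuclei, r_α and R_i denote the electronic and nuclear coordinates, and ∇_α and ∇_i denote the respective gradients.

Separation of the electronic wave function

Nuclei are much heavier and hence move much slower than electrons. This allows decoupling of their motion from that of the electrons.

Introduce the product wave function:

$$\Psi(\mathbf{r},\mathbf{R}_i) = \Psi_n(\mathbf{R}_i)\,\Psi_e(\mathbf{r},\mathbf{R}_i)$$

Substituting this in the Schroedinger equation gives

$$\left[H_n + H_e + U_{en}\right]\Psi_n(\mathbf{R}_i)\,\Psi_e(\mathbf{r},\mathbf{R}_i) = E\,\Psi_n(\mathbf{R}_i)\,\Psi_e(\mathbf{r},\mathbf{R}_i)$$

For fixed nuclei, the electronic part gives

$$\left[H_e + U_{en}\right]\Psi_e(\mathbf{r},\mathbf{R}_i) = E_e(\mathbf{R}_i)\,\Psi_e(\mathbf{r},\mathbf{R}_i) \qquad (2)$$

Substitute the electronic part back in the Schroedinger equation

$$\left[H_n + E_e(\mathbf{R}_i)\right]\Psi_n(\mathbf{R}_i)\,\Psi_e(\mathbf{r},\mathbf{R}_i) = E\,\Psi_n(\mathbf{R}_i)\,\Psi_e(\mathbf{r},\mathbf{R}_i)$$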

Eqs. (2, 3) need to be solved simultaneously, which is a formidable problem for most systems. For two nuclei, there is only one coordinate for R (the distance), so it is feasible. But for three nuclei there are already 3 internal coordinates (in general, 3N-6 internal coordinates are required for N > 2 nuclei), which makes numerical solution very difficult.

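The Born-Oppenheimer (adiabatic) approximation consists of neglecting the cross terms arising from $H_n \Psi_e(\mathbf{r},\mathbf{R}_i)$ (which are of order m/M), so that the nuclear part becomes

$$\left[H_n + E_e(\mathbf{R}_i)\right]\Psi_n(\mathbf{R}_i) = E\,\Psi_n(\mathbf{R}_i) \qquad (3)$$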
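
To see where the neglected cross terms come from, one can apply the nuclear kinetic energy operator to the product of the nuclear wave function Ψ_n(R_i) and the electronic wave function Ψ_e(r, R_i). The expansion below is just the product rule, written here as a sketch rather than as the derivation given in the lecture:

```latex
\nabla_i^2\left[\Psi_n(\mathbf{R}_i)\,\Psi_e(\mathbf{r},\mathbf{R}_i)\right]
  = \Psi_e\,\nabla_i^2\Psi_n
  + 2\,\nabla_i\Psi_n\cdot\nabla_i\Psi_e
  + \Psi_n\,\nabla_i^2\Psi_e
```

Only the first term is kept in the adiabatic approximation. The other two involve derivatives of the electronic wave function with respect to the nuclear coordinates and are the cross terms that are dropped.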

Classical approximation for nuclear motion

Nuclei are heavy so their motion can be described classically, that is, instead of solving the Schroedinger Eq. (3), we solve the corresponding Newton’s eq. of motion

$$M_i\frac{d^2\mathbf{R}_i}{dt^2} = -\nabla_i U(\mathbf{R}_i), \quad i = 1,\dots,N, \qquad U(\mathbf{R}_i) = \sum_{i<j}\frac{z_i z_j e^2}{|\mathbf{R}_i-\mathbf{R}_j|} + E_e(\mathbf{R}_i) \qquad (4)$$

At zero temperature, the potential can be minimized with respect to the nuclear coordinates to find the equilibrium conformation of molecules.

At finite temperature, Eqs. (2) and (4) form the basis of ab initio MD (which ignores quantum effects in nuclear motion and electronic excitations at finite T).
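
The zero-temperature minimization mentioned above can be illustrated on a toy problem. The sketch below finds the equilibrium separation of two atoms by steepest descent; the Lennard-Jones potential, the reduced units (ε = σ = 1), the step size and all function names are assumptions chosen for illustration, not the ab initio potential U of Eq. (4).

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy (assumed model potential, reduced units)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative dU/dr of the Lennard-Jones energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def minimize(r0, step=0.01, tol=1e-10, max_iter=10000):
    """Steepest descent: move downhill along -dU/dr until the gradient vanishes."""
    r = r0
    for _ in range(max_iter):
        grad = lj_gradient(r)
        if abs(grad) < tol:
            break
        r -= step * grad
    return r


r_eq = minimize(1.5)
print(f"equilibrium distance: {r_eq:.6f} (analytic value 2^(1/6) = {2 ** (1 / 6):.6f})")
print(f"energy at minimum:    {lj_energy(r_eq):.6f}")
```

For a real molecule the same idea is applied to all 3N nuclear coordinates at once, usually with a more efficient minimizer than plain steepest descent.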

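Methods of solution for the electronic equation

Electronic part of the Schroedinger Equation (2) has the form

$$\left[-\sum_\alpha\frac{\hbar^2}{2m}\nabla_\alpha^2 + \sum_{\alpha<\beta}\frac{e^2}{|\mathbf{r}_\alpha-\mathbf{r}_\beta|} - \sum_{i,\alpha}\frac{z_i e^2}{|\mathbf{r}_\alpha-\mathbf{R}_i|}\right]\Psi_e(\mathbf{r},\mathbf{R}_i) = E_e(\mathbf{R}_i)\,\Psi_e(\mathbf{r},\mathbf{R}_i) \qquad (5)$$

where α, β label the electrons and i the nuclei.

Two basic methods of solution:

1. Hartree-Fock (HF) based methods: HF is a mean field theory. One finds the average, self-consistent potential in which electrons move. Electron correlations are taken into account using various methods.

2. Density functional theory (DFT): Solves for the density of electrons. Scales better than HF (which is limited to ~10 atoms) and can treat thousands of atoms. Car-Parrinello MD (DFT+MD) has become popular in recent years.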

Classical mechanics

Molecular dynamics (MD) is the most popular method for simulation studies of biomolecules. It is based on Newton’s equation of motion. For N interacting atoms, one needs to solve N coupled differential equations:

$$m_i\frac{d^2\mathbf{r}_i}{dt^2} = \mathbf{F}_i = \sum_{j\neq i}^{N}\mathbf{F}_{ij}, \quad i = 1,\dots,N$$

Force fields are determined from experiments and ab initio methods.

Analytically this is an intractable problem for N > 2. But we can solve it easily on a computer using numerical methods.

Current computers can handle N ≈ 10^6 particles, which is large enough for description of most biomolecules.

Integration time, however, is still a bottleneck (10^6 steps at 1 fs = 1 ns).
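
As an illustration of how Newton's equations are stepped forward numerically, here is a minimal sketch using the velocity Verlet scheme, a common MD integrator. The lecture does not name a specific integrator, so this choice is an assumption, as are the toy force field (two particles joined by a harmonic spring, in one dimension and reduced units) and all function names.

```python
def harmonic_forces(x, k=1.0, r0=1.0):
    """Forces on two particles joined by a spring (hypothetical toy force field)."""
    stretch = (x[1] - x[0]) - r0
    f = k * stretch
    return [f, -f]          # equal and opposite, so momentum is conserved


def total_energy(x, v, m, k=1.0, r0=1.0):
    """Kinetic plus potential energy of the two-particle system."""
    kinetic = sum(0.5 * mi * vi * vi for mi, vi in zip(m, v))
    potential = 0.5 * k * ((x[1] - x[0]) - r0) ** 2
    return kinetic + potential


def velocity_verlet(x, v, m, forces, dt, n_steps):
    """Integrate m_i d2x_i/dt2 = F_i for n_steps time steps of size dt."""
    x, v = list(x), list(v)
    f = forces(x)
    for _ in range(n_steps):
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]  # half kick
        x = [xi + dt * vi for xi, vi in zip(x, v)]                    # drift
        f = forces(x)                                                 # new forces
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]  # half kick
    return x, v


masses = [1.0, 1.0]
x0, v0 = [0.0, 1.2], [0.0, 0.0]      # spring initially stretched by 0.2
e_start = total_energy(x0, v0, masses)
x1, v1 = velocity_verlet(x0, v0, masses, harmonic_forces, dt=0.01, n_steps=10000)
e_end = total_energy(x1, v1, masses)
print(f"relative energy drift: {abs(e_end - e_start) / e_start:.2e}")
```

Each step costs one force evaluation, which is why the step count sets the cost: with a 1 fs time step, reaching 1 ns requires 10^6 such steps.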

Stochastic mechanics (Brownian dynamics)

In order to deal with the time bottleneck in MD, one has to simplify the simulation system (coarse graining). This can be achieved by describing parts of the system as continuum with dielectric constants.

Examples:

• transport of ions in electrolyte solutions (water → continuum)

• protein folding (water → continuum)

• ion channels (lipid, protein, and water → continuum)

To include the effect of the atoms in the continuum, modify Newton’s eq. of motion by adding frictional and random forces:

Langevin equation:

$$m_i\frac{d^2\mathbf{r}_i}{dt^2} = \mathbf{F}_i - m_i\gamma_i\mathbf{v}_i + \mathbf{R}_i$$

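Frictional forces:

Friction dissipates the kinetic energy of a particle, slowing it down.

Consider the simplest case of a free particle in a viscous medium

$$m\frac{d^2\mathbf{r}}{dt^2} = m\frac{d\mathbf{v}}{dt} = -m\gamma\mathbf{v}$$

Solution with the initial values of $\mathbf{v}(0) = \mathbf{v}_0,\ \mathbf{r}(0) = 0$:

$$\mathbf{v}(t) = \mathbf{v}_0\,e^{-\gamma t}, \qquad \mathbf{r}(t) = \frac{\mathbf{v}_0}{\gamma}\left(1 - e^{-\gamma t}\right)$$

In liquids frictional forces are quite large, e.g. in water 1/γ ≈ 20 fs.

From $\tfrac{1}{2}m\langle v^2\rangle = \tfrac{3}{2}kT$, v ≈ 500 m/s and the stopping distance is v/γ ≈ 0.1 Å.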
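
The numbers for frictional decay in water can be reproduced with a few lines of arithmetic. The sketch below uses equipartition for the thermal speed and the exponential solution for a free particle with friction; the choice of a sodium ion (22.99 amu), the temperature of 298 K and the function names are assumptions made for illustration, since the lecture does not say which particle its estimate refers to.

```python
import math

KB = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053907e-27    # atomic mass unit, kg


def thermal_speed(mass_amu, temperature=298.0):
    """Root-mean-square speed from (1/2) m <v^2> = (3/2) kT."""
    return math.sqrt(3.0 * KB * temperature / (mass_amu * AMU))


def velocity(t, v0, gamma):
    """v(t) = v0 exp(-gamma t) for a free particle with friction."""
    return v0 * math.exp(-gamma * t)


def position(t, v0, gamma):
    """r(t) = (v0 / gamma) (1 - exp(-gamma t)), starting from r(0) = 0."""
    return (v0 / gamma) * (1.0 - math.exp(-gamma * t))


gamma = 1.0 / 20e-15                 # 1/gamma = 20 fs, the value quoted for water
v0 = thermal_speed(22.99)            # ASSUMPTION: a sodium ion at 298 K
stopping_distance = v0 / gamma       # limit of r(t) for t -> infinity

print(f"thermal speed:     {v0:.0f} m/s")
print(f"stopping distance: {stopping_distance * 1e10:.2f} Angstrom")
print(f"speed after 20 fs: {velocity(20e-15, v0, gamma):.0f} m/s")
```

The speed is of the order of several hundred m/s and the stopping distance is about a tenth of an ångström, far below atomic dimensions, which is why a particle needs continual random kicks to keep moving.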

Random forces:

Frictional forces would dissipate the kinetic energy of a particle rapidly. To maintain the average energy of the particle at 1.5 kT, we need to kick it with a random force at regular intervals.

This mimics the collision of the particle with the surrounding particles, which are taken as continuum and hence not explicitly represented.

Properties of random forces:

1. Must have zero mean (white): $\langle R_i\rangle = 0,\ i = x, y, z$

2. Uncorrelated with prior velocities: $\langle v_i(0)\,R_j(t)\rangle = 0$

3. Uncorrelated with prior forces (Markovian assumption): $\langle R_i(0)\,R_j(t)\rangle = 2m\gamma kT\,\delta_{ij}\,\delta(t)$

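Fluctuation-dissipation theorem:

Because the frictional and random forces have the same origin, they are related

$$m\gamma = \frac{1}{2kT}\int_{-\infty}^{\infty}\langle R(0)\,R(t)\rangle\,dt$$

[Figure: correlation function ⟨R(0)R(t)⟩ plotted against t]

In liquids the decay time is very short, hence one can approximate the correlation function with a delta function

$$\langle R(0)\,R(t)\rangle = 2m\gamma kT\,\delta(t)$$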

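Random forces have a Gaussian probability distribution

$$w(R_i) = \frac{1}{\sqrt{2\pi\langle R_i^2\rangle}}\exp\left(-\frac{R_i^2}{2\langle R_i^2\rangle}\right), \qquad \langle R_i^2\rangle = \frac{2m\gamma kT}{\Delta t}$$

This follows from the fact that the velocities have a Gaussian distribution

$$g(v_i) = N\sqrt{\frac{m}{2\pi kT}}\exp\left(-\frac{m v_i^2}{2kT}\right)$$

In order to preserve this distribution, the random forces must be distributed likewise.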
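
The interplay of friction and Gaussian random forces can be checked numerically. The sketch below integrates the Langevin equation for a free particle in one dimension with a simple Euler step, drawing the random force from a Gaussian of variance 2mγkT/Δt, and tests whether the mean square velocity settles at the equipartition value kT/m. The Euler scheme, the reduced units (m = γ = kT = 1), the fixed seed and the function name are all assumptions made for illustration.

```python
import math
import random


def langevin_free_particle(n_steps, dt, mass=1.0, gamma=1.0, kT=1.0, seed=0):
    """Return the time-averaged <v^2> of a free particle obeying
    m dv/dt = -m*gamma*v + R(t), integrated with an Euler step."""
    rng = random.Random(seed)
    # Standard deviation of the random force for a time step dt
    sigma_r = math.sqrt(2.0 * mass * gamma * kT / dt)
    v = 0.0
    v2_sum = 0.0
    n_samples = 0
    n_equil = n_steps // 10          # discard the initial relaxation
    for step in range(n_steps):
        r_force = rng.gauss(0.0, sigma_r)
        v += dt * (-gamma * v + r_force / mass)
        if step >= n_equil:
            v2_sum += v * v
            n_samples += 1
    return v2_sum / n_samples


mean_v2 = langevin_free_particle(n_steps=400_000, dt=0.02)
print(f"<v^2> = {mean_v2:.3f}  (equipartition predicts kT/m = 1 in these units)")
```

Without the random force the velocity would simply decay to zero; without friction the kicks would heat the particle indefinitely. A stationary thermal distribution appears only when the two are balanced as in the fluctuation-dissipation relation.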

The standard model of biomolecules: MD

• MD is necessary because:

1. QM is too slow and can handle only very small systems.

2. Stochastic dynamics eliminates water from the system. But water is

not just a passive spectator in biomolecular processes - it plays an

active and essential role in the dynamics. For example, accurate

calculation of free energies is impossible without explicit description of

water (except in a few lucky cases where errors cancel out).

• Also MD is sufficient because atoms are heavy enough to justify a classical treatment (except H). The only requirement is that accurate potential functions must be used, which is not quite satisfied at present; the polarization interaction is not explicitly included in most force fields.