Bioinformatics: Practical Application of Simulation and Data Mining Markov Modeling III

Post on 18-Mar-2016

Bioinformatics: Practical Application of Simulation and Data Mining

Markov Modeling III

Prof. Corey O’Hern

Department of Mechanical Engineering & Materials Science

Department of Physics

Yale University

1

“Using massively parallel simulation and Markovian models to study protein folding: Examining the dynamics of the villin headpiece,” J. Chem. Phys. 124 (2006) 164902.

2

Villin headpiece-HP-36

MLSDEDFKAVFGMTRSAFANLPLWKQQNLKKEKGLF (PDB 1VII)

3

Simulation Details

50,000 trajectories × 10 ns/trajectory = 500 μs

• Gromacs with explicit solvent (5000 water molecules) and eight counterions; Amber + bond constraints; T = 300 K

4

I. Native State Ensemble

50,000 trajectories × 10 ns/trajectory × 1 conformation/100 ps → 4,509,355 conformations (the nominal product is 5,000,000)
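The bookkeeping behind the aggregate simulation time and the conformation count can be checked with a few lines of Python. This is only an arithmetic sketch: the function and constant names are illustrative, and the only inputs are the numbers quoted on the slides.

```python
# Arithmetic check of the simulation bookkeeping quoted on the slides.
# All names here are illustrative; only the numbers come from the slides.

N_TRAJECTORIES = 50_000      # number of trajectories
TRAJECTORY_NS = 10           # nominal length of each trajectory, in ns
SAVE_INTERVAL_PS = 100       # one conformation saved every 100 ps
REPORTED_CONFORMATIONS = 4_509_355


def aggregate_time_us(n_traj, traj_ns):
    """Total simulated time in microseconds (1 us = 1000 ns)."""
    return n_traj * traj_ns / 1000.0


def nominal_conformations(n_traj, traj_ns, save_ps):
    """Conformations saved if every trajectory ran to its full length."""
    frames_per_trajectory = (traj_ns * 1000) // save_ps
    return n_traj * frames_per_trajectory


total_us = aggregate_time_us(N_TRAJECTORIES, TRAJECTORY_NS)
nominal = nominal_conformations(N_TRAJECTORIES, TRAJECTORY_NS, SAVE_INTERVAL_PS)
fraction = REPORTED_CONFORMATIONS / nominal

print(f"aggregate time: {total_us} us")
print(f"nominal conformations: {nominal}")
print(f"reported / nominal: {fraction:.3f}")
```

The reported count is about 90% of the nominal product.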

5

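II. Unfolded State Ensemble

• 10,000 trajectories equilibrated at T = 1000 K for 1 ns

• Remove all structure

• Random walk statistics

Distribution of the end-to-end distance R:

$$P(R) = 4\pi R^2 \left( \frac{3}{2\pi \langle R^2 \rangle} \right)^{3/2} \exp\left[ -\frac{3R^2}{2\langle R^2 \rangle} \right]$$

$$\sqrt{\langle R^2 \rangle} \sim \begin{cases} N^{1/2} & \text{ideal} \\ N^{3/5} & \text{excluded volume} \end{cases}$$

N = # of amino acids

[Figure label: chain crossing]

• Each trajectory quenched from 1000 K to 300 K; run for 25 ns

6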
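As a sanity check on the random-walk distribution P(R), the sketch below integrates it numerically to confirm that it is normalized and that its second moment reproduces ⟨R²⟩. The value of ⟨R²⟩, the unit prefactor in the scaling helper, and all function names are assumptions chosen for illustration, not values from the slides.

```python
import math


def end_to_end_pdf(r, mean_r2):
    """P(R) for an ideal chain with mean-square end-to-end distance mean_r2."""
    prefactor = 4.0 * math.pi * r**2 * (3.0 / (2.0 * math.pi * mean_r2)) ** 1.5
    return prefactor * math.exp(-3.0 * r**2 / (2.0 * mean_r2))


def integrate(f, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def rms_distance(n_residues, exponent, prefactor=1.0):
    """Scaling estimate sqrt(<R^2>) ~ prefactor * N**exponent (prefactor assumed)."""
    return prefactor * n_residues**exponent


mean_r2 = 36.0                      # hypothetical <R^2>, arbitrary length units
upper = 10.0 * math.sqrt(mean_r2)   # far enough out that the tail is negligible

norm = integrate(lambda r: end_to_end_pdf(r, mean_r2), 0.0, upper)
second_moment = integrate(lambda r: r**2 * end_to_end_pdf(r, mean_r2), 0.0, upper)

print(f"normalization: {norm:.6f}")
print(f"second moment: {second_moment:.6f}")
print(f"ideal scaling, N=36: {rms_distance(36, 0.5):.3f}")
print(f"excluded-volume scaling, N=36: {rms_distance(36, 0.6):.3f}")
```

For the 36-residue headpiece the excluded-volume exponent gives a larger rms distance than the ideal exponent at equal prefactor, which is the qualitative point of the two scaling laws.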

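Estimation of Folding Time: Including Unfolded Events

$$\tau^{-1} = N_f \left[ \sum_{i \in F} t_f^i + \sum_{i \in U} t_u^i \right]^{-1}$$

[Diagram: $N_{\text{trajectories}}$ trajectories started from initially unfolded states]

F: folded

U: unfolded

first passage time: $t_f$ (trajectories in F); $t_u$ (trajectories in U)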
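The estimator above divides the number of folding events by the total time accumulated over all trajectories, folded or not. A minimal Python sketch follows, assuming each folded trajectory contributes its first passage time and each unfolded trajectory contributes its full simulated length. The function names and the example times are hypothetical.

```python
import math


def mle_folding_rate(folded_times, unfolded_times):
    """Rate estimate: number of folding events / total observed time.

    folded_times   -- first passage times t_f of trajectories that folded
    unfolded_times -- times t_u of trajectories that never folded
    """
    total_time = sum(folded_times) + sum(unfolded_times)
    if total_time <= 0:
        raise ValueError("total observed time must be positive")
    return len(folded_times) / total_time


def mle_folding_time(folded_times, unfolded_times):
    """Folding time tau = 1 / rate; infinite if no trajectory folded."""
    rate = mle_folding_rate(folded_times, unfolded_times)
    return math.inf if rate == 0 else 1.0 / rate


# Hypothetical data in ns: 3 trajectories folded, 7 ran 10 ns without folding.
folded = [2.0, 5.0, 3.0]
unfolded = [10.0] * 7

rate = mle_folding_rate(folded, unfolded)
tau = mle_folding_time(folded, unfolded)
print(f"rate = {rate:.4f} per ns, tau = {tau:.2f} ns")
```

Dropping the unfolded trajectories from the denominator would overestimate the rate, which is why the slide title stresses including unfolded events.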

7

Floppy Residues

determined by dRMSD

8

Maximum Likelihood Estimator (MLE)

τ_F ≈ 4.3-10 μs from laser-jump and other experiments

τ_F ≈ 8 μs from MLE

τ_F ≈ 24 μs from MLE + correction of water diffusion coefficient

Sensitivity of MLE Results

“With these issues in mind, the calculated rate is well within an order of magnitude of experimental measurements.”

III. Transition State Ensemble: Effect of Perturbations

$$P_{X,Y} = \frac{N(X)}{N(X) + N(Y)}$$

N(X) = # of trajectories that meet condition X before Y

$$P_{X,Y}(s) = P_{X,Y}(s')$$

s: unperturbed state

s’: perturbed state after 500 ps

Water does not affect dynamics

11
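The comparison between an unperturbed state s and its perturbed counterpart s’ reduces to counting which condition each trajectory meets first. The sketch below assumes every trajectory has already been labeled "X", "Y", or None (neither condition met). The labels, the example counts, and the tolerance are hypothetical.

```python
def p_x_before_y(outcomes):
    """P_{X,Y} = N(X) / (N(X) + N(Y)).

    outcomes -- one label per trajectory: "X" if it met condition X first,
                "Y" if it met condition Y first, None if it met neither
                (those trajectories do not enter the ratio).
    """
    n_x = outcomes.count("X")
    n_y = outcomes.count("Y")
    if n_x + n_y == 0:
        raise ValueError("no trajectory reached X or Y")
    return n_x / (n_x + n_y)


def unaffected_by_perturbation(p_unperturbed, p_perturbed, tolerance):
    """True if the two probabilities agree within the chosen tolerance."""
    return abs(p_unperturbed - p_perturbed) <= tolerance


# Hypothetical outcomes for trajectories started from s and from s'.
from_s = ["X"] * 6 + ["Y"] * 4
from_s_prime = ["X"] * 29 + ["Y"] * 21 + [None] * 5

p_s = p_x_before_y(from_s)
p_s_prime = p_x_before_y(from_s_prime)
print(p_s, p_s_prime, unaffected_by_perturbation(p_s, p_s_prime, 0.05))
```

In practice the tolerance would be set from the statistical error of the two counts. The fixed value here is only a placeholder.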

Markov States

• 4,509,355 conformations → 2454 Markov states based on clustering of Cα dRMSD

• No dead ends

[Diagram: network of states s1, s2, s3, s4, sf]

12

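Cα dRMSD

$$\mathrm{dRMSD}_{ij} = \begin{bmatrix} 0 & \mathrm{dRMSD}_{12} & \mathrm{dRMSD}_{13} & \mathrm{dRMSD}_{14} \\ \mathrm{dRMSD}_{12} & 0 & \mathrm{dRMSD}_{23} & \mathrm{dRMSD}_{24} \\ \mathrm{dRMSD}_{13} & \mathrm{dRMSD}_{23} & 0 & \mathrm{dRMSD}_{34} \\ \mathrm{dRMSD}_{14} & \mathrm{dRMSD}_{24} & \mathrm{dRMSD}_{34} & 0 \end{bmatrix}$$

$$\mathrm{dRMSD}_{ij} = \left[ \frac{1}{N} \sum_k \left( \vec{r}_k^{\,i} - \vec{r}_k^{\,j} \right)^2 \right]^{1/2}$$

k = sum over amino acids

i, j = configurations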
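The pairwise matrix above can be built directly from the formula as written on the slide. This is a minimal sketch in plain Python, assuming each configuration is a list of Cα coordinates that are already expressed in a common frame (no superposition is performed here). The function names and the toy coordinates are illustrative. Elsewhere in the literature "dRMSD" often denotes a comparison of internal distance matrices, but the code follows the slide's expression.

```python
import math


def drmsd(conf_i, conf_j):
    """sqrt( (1/N) * sum_k |r_k^i - r_k^j|^2 ) for two configurations.

    conf_i, conf_j -- lists of (x, y, z) tuples, one per amino acid.
    """
    if len(conf_i) != len(conf_j):
        raise ValueError("configurations must have the same number of residues")
    n = len(conf_i)
    squared = sum(
        (a - b) ** 2
        for r_i, r_j in zip(conf_i, conf_j)
        for a, b in zip(r_i, r_j)
    )
    return math.sqrt(squared / n)


def drmsd_matrix(configurations):
    """Symmetric matrix of pairwise values with zeros on the diagonal."""
    m = len(configurations)
    matrix = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            d = drmsd(configurations[i], configurations[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Toy configurations with two "residues" each (hypothetical coordinates).
conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conf_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
conf_c = [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)]

matrix = drmsd_matrix([conf_a, conf_b, conf_c])
for row in matrix:
    print(["%.3f" % value for value in row])
```

A clustering step would then group configurations whose mutual values fall below a chosen cutoff. That step is not shown here.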

13

Transition Probabilities and Mean First Passage Time

$$P(s_a, s_b) = \frac{T(s_a, s_b)}{\sum_i T(s_a, s_i)}$$
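The sketch below illustrates both steps for a toy three-state model: normalizing each row of T into transition probabilities, then computing the mean first passage time (MFPT) to a target state. It assumes T(s_a, s_b) is a matrix of observed transition counts at a fixed lag time, and it uses the standard relation m_a = Δt + Σ_b P(s_a, s_b) m_b with m = 0 at the target, solved by fixed-point iteration. The counts, the lag time, and the function names are hypothetical.

```python
def transition_probabilities(counts):
    """Row-normalize T: P(s_a, s_b) = T(s_a, s_b) / sum_i T(s_a, s_i)."""
    probabilities = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("state with no outgoing transitions (dead end)")
        probabilities.append([count / total for count in row])
    return probabilities


def mean_first_passage_times(probabilities, target, lag_time=1.0,
                             tolerance=1e-12, max_iterations=100_000):
    """Iterate m_a = lag_time + sum_b P[a][b] * m_b, holding m[target] = 0."""
    n = len(probabilities)
    mfpt = [0.0] * n
    for _ in range(max_iterations):
        updated = [
            0.0 if a == target
            else lag_time + sum(probabilities[a][b] * mfpt[b] for b in range(n))
            for a in range(n)
        ]
        if max(abs(u - m) for u, m in zip(updated, mfpt)) < tolerance:
            return updated
        mfpt = updated
    raise RuntimeError("MFPT iteration did not converge")


# Hypothetical transition counts between three states; state 2 is the target.
counts = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]

P = transition_probabilities(counts)
mfpt = mean_first_passage_times(P, target=2, lag_time=1.0)
print(P[0], P[1])
print(mfpt)
```

For a model with thousands of states a direct linear solve would be the more usual choice. The iteration is used here only to keep the example dependency-free.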

14

Comparison of Short and Long Times

stable MFPT

MFPT = 3 μs

MSM: single exponential

15

First Passage Time in Random Processes

[Figure labels: folded, unfolded; unfolded, folded, partially unfolded]

16

Survival Probability for Two Particles

[Plot: P(Δx) versus Δx; Gaussian]

17

Protein Aggregation

“Molecular simulation of protein aggregation,” Biotechnology & Bioengineering 96 (2007) 1.

18