Protein Molecule Simulation on the Grid

https://engage.cpc.wmin.ac.uk Protein Molecule Simulation on the Grid. G-USE in ProSim Project. Tamas Kiss [email protected] Joint EGEE and EDGeS Summer School on Grid Application Support, Budapest, Hungary, 3rd July 2009

Transcript of Protein Molecule Simulation on the Grid

Page 1: Protein Molecule Simulation on the Grid

https://engage.cpc.wmin.ac.uk

Protein Molecule Simulation on the Grid

G-USE in ProSim Project

Tamas Kiss [email protected]

Joint EGEE and EDGeS Summer School on Grid Application Support
Budapest, Hungary, 3rd July 2009

Page 2: Protein Molecule Simulation on the Grid

The biological interest

• The motivation:
  • Understanding how sugars interact with their protein partners may lead to the development of new treatment methods for many diseases.
• The obstacle:
  • Investigating the binding of proteins to sugars in “wet laboratory” (in vitro) experiments is expensive and time consuming
    • Expensive substrates
    • Sophisticated machinery
• The solution:
  • Use “in silico” tools (computer simulation) to select the best binding candidates
  • In vitro work only on selected candidates

Page 3: Protein Molecule Simulation on the Grid

The biological interest

[Figure labels: binding pocket; sugar (ligand); protein (receptor)]

Page 4: Protein Molecule Simulation on the Grid

The biological interest

• Advantages of in silico methods:
  • Better focusing of wet laboratory resources:
    • Better planning of experiments by selecting the best molecules to investigate in vitro
  • Reduced time and cost
  • Increased number of molecules screened

• Problems of in silico experiments:
  • Time consuming
    • Weeks or months on a single computer
  • Simulation tools are too complex for an average bio-scientist
    • Unix command line interfaces
  • Bio-molecular simulation tools are not widely tested and validated
    • Are the results really useful and accurate?

Page 5: Protein Molecule Simulation on the Grid

What can we gain via the simulation?

1. Validation and refinement of in silico modelling tools

2. Filter potential scenarios for wet lab experiments

Page 6: Protein Molecule Simulation on the Grid

The biological interest

• What does the biologist want?
  • Run the simulations faster
    • Use Grid resources
  • Run the simulations from a user friendly interface
    • Web based interface
  • Combine many simulation, analysis and visualisation tools into a workflow

Page 7: Protein Molecule Simulation on the Grid

ProSim – Protein Molecule Simulation on the Grid

• Funded by the JISC ENGAGE program
  • Engaging Research with e-Infrastructure
  • Promote the greater engagement of academic researchers in the UK with the UK's e-Infrastructure

• ProSim objectives:
  – Define user requirements and user scenarios of protein molecule simulation
  – Identify, test and select software packages for protein molecule simulation
  – Automate the protein molecule simulation by creating workflows and parameter study support
  – Develop application specific graphical user interfaces
  – Run protein molecule simulation on the UK National Grid Service and make it available to the bioscience research community

Page 8: Protein Molecule Simulation on the Grid

The User Scenario

Inputs: PDB file 1 (Receptor), PDB file 2 (Ligand)

Steps shown in the diagram:
• Energy Minimization (Gromacs)
• Validate (Molprobity)
• Check (Molprobity)
• Perform docking (AutoDock)
• Molecular Dynamics (Gromacs)

Phases shown in the diagram: Phase 1, Phase 2, Phase 3, Phase 4
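The four phases can be read as a small dependency graph. The following is a minimal sketch, not the ProSim implementation: the phase identifiers are hypothetical, and the edges are an assumption drawn from the phase descriptions in these slides (docking needs both the prepared receptor and the prepared ligand, and molecular dynamics refines the docking results). It groups the phases into stages whose members could run side by side.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical phase identifiers. Each phase maps to the set of phases it
# waits for; these edges are an assumption, not taken from the g-USE workflow.
PHASES = {
    "phase1_prepare_receptor": set(),
    "phase2_prepare_ligand": set(),
    "phase3_docking": {"phase1_prepare_receptor", "phase2_prepare_ligand"},
    "phase4_molecular_dynamics": {"phase3_docking"},
}


def parallel_stages(phases):
    """Group phases into stages; phases in one stage have no mutual dependency."""
    sorter = TopologicalSorter(phases)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose inputs are done
        stages.append(ready)
        sorter.done(*ready)                 # mark the stage as finished
    return stages


stages = parallel_stages(PHASES)
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(stage)}")
```

Under these assumed dependencies, receptor and ligand preparation land in the same stage, which is where a Grid workflow could run them concurrently.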

Page 9: Protein Molecule Simulation on the Grid

The User Scenario in detail

Phase 1 – selection and preparation of receptor
• Receptor source: public repository, local database, or user provided
• Preparation and standardisation
• Solvation and charge neutralization
• Energy minimization
• Validation

Phase 2 – selection and preparation of ligand
• Ligand source: built using SMILES, public repository, local database, or user provided
• Solvation
• Energy minimization

Page 10: Protein Molecule Simulation on the Grid

The User Scenario

Phase 3 – docking ligand to receptor
• Prepare docking: docking parameters and grid-space (AutoGrid)
• Docking and selection of best results according to total energy (AutoDock)
  • 10 AutoDock executions, 100 genetic algorithm runs each

Phase 4 – refining the ligand-receptor molecule (performed on 10 best results of the AutoDock simulation)
• Solvation of the ligand-receptor structure
• Energy minimisation (GROMACS)
• Molecular dynamics (GROMACS MPI version)
• Molecule trajectory data analysis
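The selection step between phases 3 and 4 can be illustrated with a short sketch. It is written under stated assumptions: the `DockingResult` record and both helper functions are hypothetical, the energies are synthetic placeholders rather than AutoDock output, and lower total energy is treated as better. The slides do not say whether the 10 best results are the overall top 10 or the best of each of the 10 executions, so both readings are shown.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DockingResult:
    execution: int       # which AutoDock execution produced the result
    run: int             # genetic algorithm run within that execution
    total_energy: float  # assumption: lower means a better binding candidate


def select_best(results, n_best=10):
    """Reading 1: the n_best lowest-energy results across all executions."""
    return sorted(results, key=lambda r: r.total_energy)[:n_best]


def best_per_execution(results):
    """Reading 2: the single lowest-energy result of each execution."""
    best = {}
    for r in results:
        current = best.get(r.execution)
        if current is None or r.total_energy < current.total_energy:
            best[r.execution] = r
    return best


# Synthetic placeholder data: 10 executions x 100 genetic algorithm runs.
rng = random.Random(0)
results = [
    DockingResult(execution=e, run=r, total_energy=rng.uniform(-12.0, 0.0))
    for e in range(10)
    for r in range(100)
]

top10 = select_best(results)
per_execution = best_per_execution(results)
for rank, res in enumerate(top10, start=1):
    print(rank, res.execution, res.run, round(res.total_energy, 2))
```

Either list would then feed the phase 4 refinement, one molecular dynamics job per selected result.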

Page 11: Protein Molecule Simulation on the Grid

The Workflow in g-USE

• A combination of GEMLCA and standard g-USE jobs

• Executed on 5 different sites of the UK NGS

• Parameter sweeps in phases 3 and 4
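A parameter sweep of this kind fans one workflow node out into many independent jobs. The sketch below is only an illustration of that idea and says nothing about how g-USE or GEMLCA schedule jobs: the site names, the job dictionary layout, the `seed` field and the round-robin policy are all assumptions. The job counts follow the slides (10 docking executions, 10 refinements).

```python
from collections import Counter
from itertools import cycle

# Hypothetical placeholders for the 5 UK NGS sites mentioned on the slide.
SITES = [f"site-{k}" for k in range(1, 6)]


def make_sweep(phase, n_jobs):
    """Create one job description per sweep point (the seed field is illustrative)."""
    return [{"phase": phase, "job_id": i, "seed": i} for i in range(n_jobs)]


def assign_round_robin(jobs, sites):
    """Spread jobs evenly over the sites; real middleware may choose differently."""
    return [dict(job, site=site) for job, site in zip(jobs, cycle(sites))]


docking_jobs = assign_round_robin(make_sweep("phase3_docking", 10), SITES)
md_jobs = assign_round_robin(make_sweep("phase4_molecular_dynamics", 10), SITES)

load = Counter(job["site"] for job in docking_jobs + md_jobs)
for site in SITES:
    print(site, load[site])
```

Because the jobs in a sweep do not depend on each other, they can run on different sites at the same time, which is where the Grid shortens the overall run.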

Page 12: Protein Molecule Simulation on the Grid

Running simulations

• Set input parameters
• Upload input files
• Select executor sites
• Follow execution progress

Typical execution time: 24 hours

Page 13: Protein Molecule Simulation on the Grid

User views

• Biologist end-user
  • Minimal computer and g-USE skills
  • Only interested in running her own research
  • Import, parameterize, execute and visualise workflows only

• Expert user
  • g-USE and computer literate biologist
  • Modify workflows
  • Design new experiments
  • Communicate end-user requests to the IT team

Page 14: Protein Molecule Simulation on the Grid

The ProSim visualiser

• Visualisation in a newly developed portlet
• Allows visualisation of receptor, ligand and docked molecules at any phase during and after simulation (if the necessary files have already been generated)
• Allows two molecules to be visualised and compared at a time
• Energy, pressure, temperature and other important statistics are also displayed
• Uses the KiNG (Kinemage, Next Generation) visualisation tool

Page 15: Protein Molecule Simulation on the Grid

The ProSim visualiser

Page 16: Protein Molecule Simulation on the Grid

The ProSim visualiser

Page 17: Protein Molecule Simulation on the Grid

Lessons learned

• Communication between scientists and Grid experts is extremely difficult
  • More than 50% of the total project time was spent on communication and on describing and understanding user requests and requirements
• Novice Grid users require totally transparent access to Grid resources
  • The user is interested in her science, not in MPI, Globus or WMS

Page 18: Protein Molecule Simulation on the Grid

Future plans

• Make workflow more flexible to accommodate numerous different user scenarios

• Investigate further scenarios such as virtual screening of many ligands to one selected receptor

Page 19: Protein Molecule Simulation on the Grid

Thank you for your attention!

Any questions?

https://engage.cpc.wmin.ac.uk

[email protected]