How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber...
Transcript of How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber...
![Page 1: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/1.jpg)
Europython 2018 — Edinburgh 25/07/2018
Antonia Mey [email protected]
@ppxasjsm
How is python used in biomolecular sciences?
1
L. Hedges C. Woods
![Page 2: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/2.jpg)
What is computational biosomething?
�2
X-ray structure of a protein: set of x,y,z spatial coordinates of atoms
42 x10-10 m
![Page 3: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/3.jpg)
Typical questions tackled with MD
�3
many potential compounds…
How well do small molecules inhibit a protein’s functionality?
How do proteins fold?
How do proteins interact with each other?
![Page 4: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/4.jpg)
Timescales in biology
�4
Timescales of processes
NMR
Molecular Dynamics
Other experimental techniques: e.g. cryo EM
![Page 5: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/5.jpg)
Generate timeseries to compare to experiments
�5
F113
200 ns of protein dynamics time trace of an angle
![Page 6: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/6.jpg)
What is MD?
�6
+
box
+
integrator+
Physics engine usually written in C++, to integrate Newtons equation of motion using leap frog type algorithms.
hAiensemble = hAitime
![Page 7: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/7.jpg)
Typical MD workflow
�7
Prep for simulation
Run simulation protocol
Time-series analysis
Simulation setup
Gromacs
PyEMMA
Download pdb
Amber
NAMD
OpenMM
SOMD
DLPoly
Charmm
Python API
MDTraj
MDAnalysis
MDAnalysisMMTools
md-analysis-tools
[…]
pdb api
Schrödinger suite
RDkit
OpenMM toolsAmber tools
Python
[…]
pdb2gmx
mostly C++ commandline
TCL, bash, python, Perl … get creative
[…]GUI, TCL, bash, python, Perl … get creative
Cresset suite Commercial
![Page 8: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/8.jpg)
Typical file formats
�8
coordinates .pdb .crd .gro .xyz .mol2
trajectories .dcd .xtc .trr
forcefields .psf .parm7 .itp
And of course all MD programs can read all these file formats? — No
![Page 9: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/9.jpg)
Scenario: Simulate .gro file with Amber
�9
I have a coordinate .gro (Gromacs) file I want to simulate with Amber, how do I do this?
visualise coordinates
convert to pdb format
use a setup tool
run simulation
![Page 10: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/10.jpg)
Scenario: Simulate .gro file with Amber
�10
I have a coordinate .gro (Gromacs) file I want to simulate with Amber, how do I do this?
visualise coordinates
convert to pdb format
use a setup tool
run simulation
ppxasjsm::azuma { ~/Documents/}-> vmd dAla.gro
or, use one of the many other tools…
![Page 11: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/11.jpg)
Scenario: Simulate .gro file with Amber
�11
I have a coordinate .gro (Gromacs) file I want to simulate with Amber, how do I do this?
visualise coordinates
convert to pdb format
use a setup tool
run simulation
or
or ….
![Page 12: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/12.jpg)
Scenario: Simulate .gro file with Amber
�12
I have a coordinate .gro (Gromacs) file I want to simulate with Amber, how do I do this?
visualise coordinates
convert to pdb format
use a setup tool
run simulation
ppxasjsm::azuma { ~/Documents/}-> FESetup setup.in
pretty much all the arrows are bash scripts with input files
taken from AMBER tutorial
![Page 13: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/13.jpg)
Scenario: Simulate .gro file with Amber
�13
I have a coordinate .gro (Gromacs) file I want to simulate with Amber, how do I do this?
visualise coordinates
convert to pdb format
use a setup tool
run simulation
ppxasjsm::azuma { ~/Documents/}-> FESetup setup.in[globals] forcefield = amber, ff99SBildn, tip3p
[protein] basedir = protein file.name = protein.pdb molecules = ala
box.type = rectangular box.length = 10.0 align_axes = True neutralize = yes
min.nsteps = 1000 min.restr_force = 10.0 min.restraint = notsolvent
md.heat.nsteps = 1000 md.heat.restr_force = 10.0 md.heat.restraint = notsolvent
md.constT.nsteps = 1000 md.constT.restr_force = 10.0 md.constT.restraint = notsolvent
md.press.T = 298.0 md.press.nsteps = 50000 md.press.p = 1.0
![Page 14: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/14.jpg)
Scenario: Simulate .gro file with Amber
�14
I have a coordinate .gro (Gromacs) file I want to simulate with Amber, how do I do this?
visualise coordinates
convert to pdb format
use a setup tool
run simulation
![Page 15: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/15.jpg)
Scenario: Simulate .gro file with Amber
�15
I have a coordinate .gro (Gromacs) file I want to simulate with Amber, how do I do this?
visualise coordinates
convert to pdb format
use a setup tool
run simulation
pmemd -O -i ala.in -p ala.parm7 -c ala.rst7 -o myout.mdout -x myout.nc -e myout.mdene -r myout.rst7
And another command line tool…
![Page 16: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/16.jpg)
Zoo of applications leads to hacky workflows
�16
- Most tools have organically grown from academic software with poor software practices
- In order for tools to work with each other in complicated workflows a lot of hacky bash scripting is used by academic users
- Users need to be experts in many different software with command line interfaces to interlink them
- Many tools can do similar things and there may not be obvious solutions for one problem (google trap: try suggestions until one works)
-> Loss of focus on the science
![Page 17: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/17.jpg)
The same example as before BioSimSpace
�17
![Page 18: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/18.jpg)
Cloud server demo — using BioSimSpace
�18
http://130.61.69.221/
docker run chryswoods/biosimspace
http://130.61.69.221/hub/tmplogin
![Page 19: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/19.jpg)
BioSimSpace phase I
�19
Manual
FESetup
Sire/SOMD
Sire/analyse_freenrg based on pymbar
freenrgworkflows
ccpbiosim.org
siremol.org
Bio
Sim
Spac
e
L. Hedges
C. Woods
Commercially available software
Prep for simulation
Run simulation protocol
Time-series analysis
Simulation setup
Download pdb
some manual prep
![Page 20: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/20.jpg)
API overview
�20
MD
IO
Gateway
Process
Protocol
Trajectory
Core: Sire — Molecular library in C++
![Page 21: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/21.jpg)
What is BioSimSpace — summary
�21
Bio
Sim
Spac
e
Simulation Analysis
SoftwareMSM Perturbation Map Reweighting …
pyemma gmx wham pymbar …
System SetupProtein Protein + ligand ligand in water ligand in organic solvent …
Softwarepdb2gmx tleap FESetup …
Interoperable tool for biomolecular simulations
MD MC Enhanced sampling …
Software
Amber Gromacs OpenMM HTMD Plumed …
Trajectory generation
https://github.com/michellab/BioSimSpace
![Page 22: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/22.jpg)
Conclusions
�22
Bio
Sim
Spac
e
- Python API to write workflow components for Biomolecular simulations
- Allows to focus on science and not software: No need to become an expert at different MD packages, setup or analysis tools
- Ease of use in the cloud, with scalable computing resources and future academic fool proof pricing model
- Planned support for Knime and CWL workflow managers
Bio
Sim
Spac
e - A workflow engine - A top down approach by trying to reinvent the wheel again - A ‘finished’ piece of software: It is very much in an alpha
development stage with a large list of features and capabilities to be added in the future
![Page 23: How is python used in biomolecular sciences? · pdb api Schrödinger suite RDkit OpenMM tools Amber tools Python […] pdb2gmx mostly C++ commandline TCL, bash, python, Perl … get](https://reader033.fdocuments.us/reader033/viewer/2022060807/608c3e88b51d92504d1ed5e6/html5/thumbnails/23.jpg)
Questions
�23