Printing - GPU Technology Conferenceon-demand.gputechconf.com/gtc/2015/posters/GTC_2015_Life... ·...

Printing: This poster is 48” wide by 36” high. It’s designed to be printed on a large-format printer.

Customizing the Content: The placeholders in this poster are formatted for you. Type in the placeholders to add text, or click an icon to add a table, chart, SmartArt graphic, picture or multimedia file. To add or remove bullet points from text, just click the Bullets button on the Home tab. If you need more placeholders for titles, content or body text, just make a copy of what you need and drag it into place. PowerPoint’s Smart Guides will help you align it with everything else. Want to use your own pictures instead of ours? No problem! Just right-click a picture and choose Change Picture. Maintain the proportion of pictures as you resize by dragging a corner.

10 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Similarity to native

Pseu

do

ener

gy

Near native poses

∆E

Decoy poses

GeauxDock: an ultra-fast molecular docking package for computer-‐aided drug discovery Yun Ding, Ye Fang, W. Feinstein, D.M. Koppelman, J. Moreno, J. Ramanujam, M.Brylinski, M. Jarrell Louisiana State University, Louisiana Alliance for SimulaDon-‐Guided Materials ApplicaDons

BACKGROUND ComputaDonal modeling of binding drug to proteins has become an integral component of modern drug discovery pipelines. A typical applicaDon is structure-‐based virtual screening, which involves a large-‐scale modeling of pharmacological relevant associaDons between small molecules and their macromolecular targets (Fig. 1 and Fig. 2). The desire to improve state-‐of-‐the-‐art moDvated us to develop an ultra-‐fast ligand docking approach that uses Monte Carlo as the sampling method and features computaDons on modern supercomputers. Combined with an effecDve scoring funcDon, this new method will provide accurate predicDons at a high performance/cost raDo, which is a criDcal factor for large-‐scale virtual screening applicaDons.

SCORING FUNCTION

Figure 1. Computer-‐aided drug development holds a significant promise to speed up the discovery of novel pharmaceuDcals at reduced costs.

The naDve-‐like recogniDon capacity of the scoring funcDon was opDmized by maximizing the Z-‐score calculated across the training dataset. The Z-‐score is defined as:

CASE STUDY

One example of successful pose idenDficaDon is shown to demonstrate the accuracy of our approach; GeauxDock was used to find the near-‐naDve pose of the inhibitor in Cathepsin K -‐ P43235.

BENCHMARKS

IMPLEMENTATIONS

CONCLUSIONS

Figure 6. Performance comparison on different plaôrms. The benchmarks are carried on four plaôrms, a 6 core Xeon E5-‐2620 CPU, a Xeon Phi 3210A, two Tesla M2090, and a GeForce GTX 980. We use intel compiler 15.0.0 and CUDA 6.5 with opDmizaDon flag –O3. The results demonstrate a considerable speedup using heterogeneous accelerators. Our early tuning of a Maxwell GeForce GTX 980 GPU provides a two-‐fold speed up over a Fermi Tesla M2090 GPU.

Figure 2. Docking simulaDons predict the naDve pose of the ligand by searching for the global minimum in the energy space.

Z − score = EL −EH

σ L2 +σ H

2

Figure 3. The naDve-‐like recogniDon capability of our force field is opDmized by maximizing the Z-‐score.

F i g u r e 4 . R e c e i v e r o p e r a D n g characterisDc (ROC) analysis for the recogniDon of naDve-‐like conformaDons in docking ensembles by GeauxDock compared to other scoring funcDons. TPR – true posiDve rate, FPR – false posiDve rate.

IniDalize

Monte Carlo

Perturb

compute

Replica Exchange

Monte Carlo IteraDons ~=10

…

Monte Carlo

Perturb

compute

Monte Carlo

Perturb

compute

The docking code is implemented with a highly efficient Replica Exchange Monte Carlo (REMC) sampling to take advantage of massively parallel capabiliDes provided by GPUs. In a nutshell, docking calculaDons exploits two levels of parallelism: coarse-‐ and fine-‐grained. The former considers a simultaneous processing of many ligand configuraDons, which will be mapped to hardware thread blocks. The fine-‐grained level parallelizes the calculaDon of interacDons for each individual configuraDon using mulDple threads within each block.

We implement and maintain fairly similar codes for three plaôrms: CPU-‐OpenMP, Phi-‐OpenMP, GPU-‐CUDA. The applicaDon front end loads data from input files, then apply strength reducDon opDmizaDon and construct Struct of Array (SoA) data for GPU, or Array of Structure (AoS) data for CPU/Phi. A set of common wrappers for offload compuDng are developed to abstract the work flow on three different plaôrms: (1) data allocaDon, (2) copy-‐in, (3) kernel computaDon, and (4) copy-‐back, thus that CPU/Phi/GPU code can share the same high level infrastructure. While the low level kernel code for different architecture are forked from a highly opDmized sequenDal code, and specifically tuned for the parallel performance.

GeauxDock provides sufficient accuracy for its applicaDon in large-‐scale virtual screening projects to support the reconstrucDon of protein-‐ligand interacDon networks. GeauxDock adapts mulDple commodity back-‐ends, among which, GPU provides considerable speedup. GeauxDock is freely available from our website: hmp://www.insDtute.loni.org/lasigma/package/dock/ This work is supported by the NSF EPSCoR LA-‐SiGMA project under award #EPS-‐1003897 and the Louisiana Board of Regents through the Board of Regents Support Fund [contract LEQSF(2012-‐15)-‐RD-‐A-‐05].

Figure 5. Docking simulaDon for Cathepsin K -‐ P43235. From up let to down right. (A) Decreasing pseudo energy drives system to higher CMS regions during the Monte Carlo simulaDon. (B) Scamer plot of pseudo energy vs CMS. Three typical data points were selected from the simulaDon trajectory and their corresponding configuraDons (same color) are plomed in (C). (D) Close-‐ups. The crystallographic configuraDon is light blue.

Decoy

Intermediate

native-like

A

DC

B

Figure 8. The mapping from computaDonal resources to domain science objects. A protein-‐ligand replica is mapped to a GPU thread block; atom coefficients are stored in consecuDve memory addresses, and mapped to GPU threads.

Figure 7. The program’s flow chart. Many replicas are updated simultaneously using replica exchange Monte Carlo (REMC) algorithm.

contact name

Yun Ding: [email protected]

P5191

category: Life & MateriaL Science - LS04

Printing - GPU Technology Conferenceon-demand.gputechconf.com/gtc/2015/posters/GTC_2015_Life... ·...

Documents

Transcript of Printing - GPU Technology Conferenceon-demand.gputechconf.com/gtc/2015/posters/GTC_2015_Life... ·...