Post on 04-Jan-2016
Supporting Molecular Simulation-based Bio/Nano Research
on Computational GRIDs
Karpjoo Jeong (jeongk@konkuk.ac.kr), Konkuk Univ.
Suntae Hwang (sthwang@kookmin.ac.kr), Kookmin Univ.

Collaboration
IT
– Karpjoo Jeong at Konkuk University
– Suntae Hwang at Kookmin University
– Younghwan Park at Hansung University
BT/NT
– Seunho Jung at Konkuk University
– Yoongho Im at Konkuk University

Contents
Molecular Simulation-based BioNano Research
How to Build Cheap but Powerful Supercomputer
How to Manage Lots of Simulation Results from Supercomputers
Molecular Simulation System on World-wide Computational GRIDs
Implementation Status and Preliminary Performance Results
Conclusions and Future Work

Molecular Simulation-based Bio/Nano Research

Characteristics
Requirements for very large computation
Complicated research process in a workflow style
– Consists of modeling, simulation, and verification tasks, which form a complex workflow
Credibility of the simulation tool is crucial
– Only a few well-known software packages are accepted
Lots of repetition of the same simulation and application to similar problems
– But with different parameters

Characteristics (Cont’d)
Good News
– Lots of parallelism in research tasks
– No need for writing complicated simulation code (in most cases)
Bad News
– Frequent intervention by scientists is required
  • Verify intermediate results
  • Guide simulation directions
– A single task (a single instance of simulation execution) alone may be very large
– Parallel simulation is extremely difficult

Potentials for GRID
Inter-Task Independence and Parallelism
Problem-level Parallelism
– Most coarse-grained
– Solving similar problems by similar methods
Simulation-level Parallelism
– Coarse-grained
– Repetition of same simulation but with different parameters
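Simulation-level parallelism can be illustrated with a small parameter sweep. This is only a sketch: `run_simulation`, its `temperature_k` and `seed` parameters, and the returned fields are hypothetical placeholders, not part of the system described in these slides. A real task would launch a simulation package instead of returning a dictionary.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_simulation(temperature_k, seed):
    """Hypothetical stand-in for one independent simulation run."""
    # A real task would start a simulation package with these parameters.
    return {"temperature_k": temperature_k, "seed": seed,
            "score": temperature_k + seed}


def parameter_sweep(temperatures, seeds, max_workers=4):
    """Run the same simulation once per parameter combination."""
    combos = list(product(temperatures, seeds))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The runs share no state, so they may execute in any order;
        # pool.map still returns results in the order of `combos`.
        return list(pool.map(lambda c: run_simulation(*c), combos))


if __name__ == "__main__":
    for result in parameter_sweep([300, 310], [1, 2]):
        print(result)
```

Because each run depends only on its own parameters, the same pattern applies whether the workers are local threads or PCs on a GRID.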

How to Build Cheap and Powerful Supercomputer

PC Lab-based Virtual Parallel Computers
Goals
– Utilize idle computing resources at many PC labs at universities around the world
  • Hundreds or thousands of PCs at each university, which are almost 100% idle at night
  • Relatively less sensitive to security issues
– Build these PCs into virtual parallel computers
  • A thousand Pentium4 2.0GHz CPUs can match very expensive supercomputers
– Apply these parallel computers to coarse-grained parallel problems such as molecular simulation-based bio/nano research problems

Vision: World-wide Computing
[Figure: Project1 and Project2, each a set of molecular simulation tasks, shown at a site in day-time work and a site in night-time work, with migration between them]

Base Computing System
Persistent Linda Parallel/Distributed System
Linda Parallel Programming Model
– Shared Memory Model in Mailbox Style
– Ease of programming
Heterogeneity Support (e.g., Linux and MS Windows)
– Process Migration
– Parallel Computation Migration
Fault Tolerance
Idle PC Utilization
Efficient Support for Coarse-grained Parallel Computation
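The mailbox-style shared memory of the Linda model can be sketched as a toy tuple space. This is not the Persistent Linda API: the `TupleSpace` class, the `in_` method name (`in` is a Python keyword), and the `("task", id, value)` tuple layout are assumptions made for illustration, and the sketch runs in one process with no persistence, migration, or fault tolerance.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        """Deposit a tuple into the space."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, pattern):
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return tup
        return None

    def in_(self, pattern):
        """Block until a tuple matches, then remove and return it."""
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            self._tuples.remove(tup)
            return tup


def worker(space):
    """Take tasks until a stop marker arrives; post one result per task."""
    while True:
        _, task_id, value = space.in_(("task", None, None))
        if task_id == -1:          # stop marker (an assumed convention)
            return
        space.out(("result", task_id, value * value))


def run_master(values, n_workers=3):
    """Post one task per value and collect the results in task order."""
    space = TupleSpace()
    threads = [threading.Thread(target=worker, args=(space,))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for i, v in enumerate(values):
        space.out(("task", i, v))
    results = [space.in_(("result", i, None))[2] for i in range(len(values))]
    for _ in threads:
        space.out(("task", -1, 0))  # one stop marker per worker
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_master([1, 2, 3, 4]))
```

The master never addresses a particular worker; any idle worker picks up the next task tuple, which is what makes the model a natural fit for coarse-grained parallel work.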

Prototype
PC Lab at College of Information and Communication, Konkuk University, Seoul, Korea
50 Pentium4 PCs
Linux cluster with 5 nodes

Web Monitoring Interface

How to Manage Lots of Simulation Results from Supercomputers

Workflow Approach
Provide workflow-based simulation environment
– Allow scientists to plan research processes in a workflow style
– Manage intermediate results and notify scientists of next tasks
– Execute independent tasks in parallel
Scientists can avoid tedious management overheads and focus on planning, analysis and verification work
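The workflow idea can be sketched as a small dependency-driven runner. This is a minimal illustration, not the authors' environment: `run_workflow`, the `notify` callback, and the task names in the demo are hypothetical. It uses the standard `graphlib` module (Python 3.9 or later) to decide which tasks are ready and a thread pool to run independent tasks side by side.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_workflow(tasks, deps, notify=None, max_workers=4):
    """Run tasks in dependency order; independent tasks run concurrently.

    tasks:  name -> callable taking a dict of the results available so far
    deps:   name -> set of task names that must finish first
    notify: optional callback invoked with each finished task name
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()                      # raises CycleError on a cycle
    results, running = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose prerequisites are all done.
            for name in sorter.get_ready():
                running[pool.submit(tasks[name], dict(results))] = name
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()   # keep intermediate result
                sorter.done(name)
                if notify is not None:
                    notify(name)
    return results


if __name__ == "__main__":
    demo_tasks = {
        "modeling": lambda r: "model",
        "simulate_a": lambda r: r["modeling"] + "+simA",
        "simulate_b": lambda r: r["modeling"] + "+simB",
        "verify": lambda r: sorted([r["simulate_a"], r["simulate_b"]]),
    }
    demo_deps = {
        "modeling": set(),
        "simulate_a": {"modeling"},
        "simulate_b": {"modeling"},
        "verify": {"simulate_a", "simulate_b"},
    }
    print(run_workflow(demo_tasks, demo_deps, notify=print))
```

In the demo, the two simulation tasks depend only on the modeling step, so they are submitted together, and the verification step starts only after both have finished.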

GRID-based Molecular Simulation Environment
Workflow-based simulation environment submits simulation tasks to computational GRIDs in a user-transparent way
[Figure: a workflow-based simulation environment holding Project1 and Project2, each a set of molecular simulation tasks, connected to computational GRIDs]
Computational GRIDs
• Shared computing resources
• CHARMM, AMBER tasks
• Numerous independent tasks

[Figure: several workflow-based simulation environments, each with Project1 and Project2 molecular simulation tasks, connected to the same computational GRIDs (shared computing resources; CHARMM, AMBER tasks; numerous independent tasks)]

Molecular Simulation System on World-wide Computational GRIDs (Persistent Linda and Globus)

[Figure: the workflow-based simulation environment (Project1 and Project2 molecular simulation tasks) exchanges task requests and data/result files through GLOBUS (GRAM, GridFTP) and a gateway agent with Persistent Linda systems at sites such as Konkuk Univ. and KISTI]
Persistent Linda
• Shared computing resources
• CHARMM, AMBER tasks
• Numerous independent tasks

Fault Tolerance and Migration
Simulation packages such as CHARMM and GAUSSIAN support checkpointing facilities
– Save computation status to disk and resume computation from it later
Our Molecular Simulation System is designed to use these facilities to deal with fault tolerance and migration
Checkpointing is a solution to long-running simulation
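The save-and-resume pattern can be sketched with a toy loop. This does not use the checkpoint facilities of CHARMM or GAUSSIAN: `run_with_checkpoint`, the JSON state layout, and the `stop_after` parameter (which simulates a crash or a migration) are assumptions made for illustration only.

```python
import json
import os
import tempfile


def run_with_checkpoint(total_steps, path, stop_after=None):
    """Advance a toy simulation, saving its state after every step.

    stop_after simulates an interruption after that many steps in this
    call. Returns the state reached.
    """
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)              # resume from saved status
    else:
        state = {"step": 0, "energy": 0}      # fresh start
    done_here = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_here >= stop_after:
            break                             # interrupted before finishing
        state["step"] += 1
        state["energy"] += state["step"]      # placeholder for real work
        # Write to a temporary file and rename it, so an interruption
        # never leaves a half-written checkpoint behind.
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, path)
        done_here += 1
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        ckpt = os.path.join(d, "sim.ckpt")
        print(run_with_checkpoint(10, ckpt, stop_after=4))  # interrupted run
        print(run_with_checkpoint(10, ckpt))                # resumed run
```

Since the second call needs only the checkpoint file, it could just as well run on a different machine, which is how the same mechanism can serve both fault tolerance and migration.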

[Figure: three workflow-based simulation environments, each with Project1 and Project2 molecular simulation tasks, and four Persistent Linda systems connected through GLOBUS]

Implementation Status and Preliminary Performance Results

Implementation Status
Persistent Linda System
– Used for various parallel applications for years
– Recently ported to MS Windows
Workflow Molecular Simulation System
– Implementation of prototype is underway
Globus-based global coordination middleware
– Gateway between Persistent Linda and Globus is implemented
– Global scheduler is being designed

Experiments
Settings
– Simple client on remote host
– Persistent Linda System
– Three CHARMM driver programs on three Linux servers (P4 2.0GHz) which invoke CHARMM
Scenario: remote invocation of a single CHARMM task
Result
– About 30 seconds of remote invocation overhead
[Figure: experiment setup at the remote site. GRAM, GridFTP, and a gateway agent connect to Persistent Linda, which serves three CHARMM drivers, one per Linux server; labels include "charmm job" and "charmm result"]

Conclusions and Future Work
Propose molecular simulation system on computational GRIDs
– Utilize idle PCs at university labs
– Workflow-based simulation environment
Effective for coarse-grained parallel problems such as molecular simulation-based bio/nano research
Developing Globus-based global middleware
Planning on
– Large scale computational GRIDs by combining several university labs
– Application for bio/nano research
  • Database for chiral molecules