Opal: Simple Web Services Wrappers for Scientific Applications

26
Opal: Simple Web Services Wrappers for Scientific Opal: Simple Web Services Wrappers for Scientific Applications Applications Sriram Krishnan*, Brent Stearn, Karan Bhatia, Kim K. Baldridge, Wilfred W. Li, Peter Arzberger *[email protected] ICWS 2006 - Sept 21, 2006

description

The grid-based infrastructure enables large-scale scientific applications to be run on distributed resources and coupled in innovative ways. However, in practice, grid resources are not very easy to use for the end-users who have to learn how to generate security credentials, stage inputs and outputs, access grid-based schedulers, and install complex client software. There is an imminent need to provide transparent access to these resources so that the end-users are shielded from the complicated details, and free to concentrate on their domain science. Scientific applications wrapped as Web services alleviate some of these problems by hiding the complexities of the back-end security and computational infrastructure, only exposing a simple SOAP API that can be accessed programmatically by application-specific user interfaces. However, writing the application services that access grid resources can be quite complicated, especially if it has to be replicated for every application. In this presentation, we present Opal which is a toolkit for wrapping scientific applications as Web services in a matter of hours, providing features such as scheduling, standards-based grid security and data management in an easy-to-use and configurable manner

Transcript of Opal: Simple Web Services Wrappers for Scientific Applications

Page 1: Opal: Simple Web Services Wrappers for Scientific Applications

Opal: Simple Web Services Wrappers for ScientificOpal: Simple Web Services Wrappers for ScientificApplicationsApplications

Sriram Krishnan*, Brent Stearn, Karan Bhatia, Kim K. Baldridge,Wilfred W. Li, Peter Arzberger

*[email protected]

ICWS 2006 - Sept 21, 2006

Page 2: Opal: Simple Web Services Wrappers for Scientific Applications

Web PortalsRich Clients

Set of Biomedical Applications

NBCR Computational InfrastructureNBCR Computational Infrastructure

Resources

Telescience Portal

Computational Grid

Web Services

WorkflowMiddleware

PMV ADTVision ContinuityAPBSCommand

APBS ContinuityGtomo2TxBRAutodockGAMESS

QMView

Page 3: Opal: Simple Web Services Wrappers for Scientific Applications

RequirementsRequirements

• Enable access to scientific applications onGrid resources– Seamlessly via a number of user interfaces– Easily from the perspective of a scientific user

• Enable the creation of scientific workflows– Possibly with the use of commodity workflow

toolkits

Page 4: Opal: Simple Web Services Wrappers for Scientific Applications

Challenges for theChallenges for the Scientific UserScientific User

• Access to Grid resources is still verycomplicated– User account creation– Management of credentials– Installation and deployment of scientific

software– Interaction with Grid schedulers– Data management

Page 5: Opal: Simple Web Services Wrappers for Scientific Applications

Towards Services Oriented Architectures (SOA)Towards Services Oriented Architectures (SOA)

• Scientific applications wrapped as Web services– Provision of a SOAP API for programmatic access

• Clients interact with application Web services,instead of Grid resources– Used in practice by several scientific communities

• NBCR: http://nbcr.net• GEON: http://geongrid.org• GLEON: http://gleon.org/• CAMERA: http://camera.calit2.net

Page 6: Opal: Simple Web Services Wrappers for Scientific Applications

Talk OutlineTalk Outline

• Motivation for a Services OrientedArchitecture (SOA)

• Overall end-to-end architecture• The Opal Toolkit: Introduction & technical

details• Some use cases• Concluding remarks

Page 7: Opal: Simple Web Services Wrappers for Scientific Applications

Condor pool SGE Cluster PBS Cluster

Globus Globus Globus

Application Services Security Services (GAMA)

StateMgmt

Gemstone PMV/Vision Kepler

Architecture OverviewArchitecture Overview

Page 8: Opal: Simple Web Services Wrappers for Scientific Applications

Scientific SOA: BenefitsScientific SOA: Benefits

• Applications are installed once, and used by allauthorized users– No need to create accounts for all Grid users– Use of standards-based Grid security mechanisms

• Users are shielded from the complexities of Gridschedulers

• Data management for multiple concurrent job runsperformed automatically by the Web service

• State management and persistence for long running jobs• Accessibility via a multitude of clients

Page 9: Opal: Simple Web Services Wrappers for Scientific Applications

Possible ApproachesPossible Approaches

• Write application services by hand– Pros: More flexible implementations, stronger data

typing via custom XML schemas– Cons: Not generic, need to write one wrapper per

application• Use a Web services wrapper toolkit, such as

Opal– Pros: Generic, rapid deployment of new services– Cons: Less flexible implementation, weak data typing

due to use of generic XML schemas

Page 10: Opal: Simple Web Services Wrappers for Scientific Applications

The Opal Toolkit: OverviewThe Opal Toolkit: Overview

• Enables rapid deployment of scientificapplications as Web services (< 2 hours)

• Steps– Application writers create configuration file(s) for a

scientific application– Deploy the application as a Web service using Opal’s

simple deployment mechanism (via Apache Ant)– Users can now access this application as a Web

service via a unique URL

Page 11: Opal: Simple Web Services Wrappers for Scientific Applications

Opal ArchitectureOpal Architecture

Tomcat Container

Axis Engine

Opal WS Opal WS

Cluster/Grid Resources

ContainerProperties

ServiceConfig

Scheduler, Security,DatabaseSetups

Binary,Metadata,Arguments

Page 12: Opal: Simple Web Services Wrappers for Scientific Applications

ImplementationImplementation DetailsDetails

• Implemented as a regular Axis service– Application behavior specified by the service

configuration– Configuration passed as a parameter via the auto-

generated deployment descriptor (WSDD)• Possible to have multiple instances of the same

class for different applications– Distinguished by a unique URL for every application

• No need to generate sources or WSDL prior todeployment

Page 13: Opal: Simple Web Services Wrappers for Scientific Applications

Sample Container PropertiesSample Container Properties# the base URL for the tomcat installation # this is required since Java can't figure out the IP # address if there are multiple network interfacestomcat.url=http://ws.nbcr.net:8080

# database informationdatabase.use=falsedatabase.url=jdbc:postgresql://localhost/app_dbdatabase.user=<app_user>database.passwd=<app_passwd>

# globus informationglobus.use=trueglobus.gatekeeper=ws.nbcr.net:2119/jobmanager-sgeglobus.service_cert=/home/apbs_user/certs/apbs_service.cert.pemglobus.service_privkey=/home/apbs_user/certs/apbs_service.privkey

# parallel parametersnum.procs=16mpi.run=/opt/mpich/gnu/bin/mpirun

Page 14: Opal: Simple Web Services Wrappers for Scientific Applications

Sample Application ConfigurationSample Application Configuration

<appConfig xmlns="http://nbcr.sdsc.edu/opal/types" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <metadata> <usage><![CDATA[psize.py [opts] <filename>]]></usage> <info xsd:type="xsd:string"> <![CDATA[ --help : Display this text --CFAC=<value> : Factor by which to expand mol dims to get coarse grid dims [default = 1.7] ... ]]> </info> </metadata> <binaryLocation>/homes/apbs_user/bin/psize.py</binaryLocation> <defaultArgs>--GMEMCEIL=1000</defaultArgs> <parallel>false</parallel></appConfig>

Page 15: Opal: Simple Web Services Wrappers for Scientific Applications

Application Deployment & UndeploymentApplication Deployment & Undeployment

• To deploy onto a local Tomcat container:ant -f build-opal.xml deploy -DserviceName=<serviceName>-DappConfig=<appConfig.xml>

• To undeploy a service:

ant -f build-opal.xml undeploy -DserviceName=<serviceName>

Page 16: Opal: Simple Web Services Wrappers for Scientific Applications

Service OperationsService Operations

• Get application metadata: Returns metadata specifiedinside the application configuration

• Launch job: Accepts list of arguments and input files(Base64 encoded), launches the job, and returns a jobID

• Query job status: Returns status of running job using thejobID

• Get job outputs: Returns the locations of job outputsusing the jobID

• Get output as Base64: Returns an output file in Base64encoded form

• Destroy job: Uses the jobID to destroy a running job

Page 17: Opal: Simple Web Services Wrappers for Scientific Applications

MEME+MAST WorkflowMEME+MAST Workflow using Keplerusing Kepler

Page 18: Opal: Simple Web Services Wrappers for Scientific Applications

Kepler Opal Web Services ActorKepler Opal Web Services Actor

Page 19: Opal: Simple Web Services Wrappers for Scientific Applications

Ligand Protein Interaction Using Web ServicesLigand Protein Interaction Using Web Services

• Baldridge, Greenberg, Amoreira, Kondric• GAMESS Service

– More accurate Ligand Information• LigPrep Service

– Generation of Conformational Spaces• PDB2PQR Service

– Protein preparation• APBS Service

– Generation of electrostatic information• QMView Service

– Visualization of electrostatic potential file• Applications:

– Electrostatics and docking– High-throughput processing of ligand-

protein interaction studies– Use of small molecules (ligands) to turn

on or off a protein function

Page 20: Opal: Simple Web Services Wrappers for Scientific Applications

ConclusionsConclusions

• We presented Opal, a toolkit for rapidlyexposing legacy scientific applications asWeb services– Provides features like Job management, Scheduling,

Security, and Persistence, in an easy to use andconfigurable manner

– Lowers inertia in the scientific community against theadoption of Web services and SOAs

– Currently being used by the NBCR community forGrid-enabling several scientific applications, via amultitude of interfaces

Page 21: Opal: Simple Web Services Wrappers for Scientific Applications

Future WorkFuture Work

• WSRF Integration– State management using standard Grid mechanisms– Asynchronous status notifications via WS-Notification

• Meta-scheduling and resource monitoring– Use of CSF4 and GFarm

• Alternate mechanisms for I/O staging– GridFTP, RFT

• Addition of strong data typing for I/O– Use of XML schemas, DFDL

Page 22: Opal: Simple Web Services Wrappers for Scientific Applications

Thanks for listening!Thanks for listening!

• More information, downloads, documentation:– http://nbcr.net/services/

• Questions and comments welcome!– [email protected]

Page 23: Opal: Simple Web Services Wrappers for Scientific Applications

AppendixAppendix

Page 24: Opal: Simple Web Services Wrappers for Scientific Applications

Grid Computing is Grid Computing is ……

• “Co-ordinated resource sharing and problem solving in dynamicmulti-institutional virtual organizations.” [Foster, Kesselman,Tuecke]

– Co-ordinated - multiple resources working in concert, eg. Disk & CPU,or instruments & database, etc.

– Resources - compute cycles, databases, files, application services,instruments.

– Problem solving - focus on solving scientific problems– Dynamic - environments that are changing in unpredictable ways– Virtual Organization - resources spanning multiple organizations and

administrative domains, security domains, and technical domains

Page 25: Opal: Simple Web Services Wrappers for Scientific Applications

GridGrid Computing is Computing is …… (Industry)(Industry)

• “About finding distributed, underutilized compute resources (systems,desktops, storage) and provisioning those resources to users orapplications requiring them.” [The Grid Report, Clabby Analytics]

– Distributed - all the resources laying around in departments or server rooms.– Underutilized - organizations save money by increasing utilization versus

purchasing new resources.– Resources - servers and server cycles, applications, data resources– Provisioning - predict and schedule resource use depending on load.

Page 26: Opal: Simple Web Services Wrappers for Scientific Applications

Scientific Objective: Modeling and Analysis Across ScalesScientific Objective: Modeling and Analysis Across Scales

Tools that Integrate Data, Construct Models and Perform Analysis across Scales

Organisms

Macromolecular

Molecule

Subcellular

Atom

Cell

Tissue

Organ

Modeling Synaptic Activity

crossbridge

lattice

multicellular

filament

ventricles

Modeling the Heart