The Gateway Computational Web Portal

Marlon Pierce, Choonhan Youn, Geoffrey Fox

ERDC, August 16 2001

Tutorial Overview

• Demo
• Grid and Gateway Overviews
• HTML Forms
• Java QuickStart Guide
• JavaServer Pages Overview
• Gateway JSP Tools
• WebFlow Module Development
• Installation and Security Issues

Computational Grids Survey

A brief introduction to computational grid projects and goals.

What Is a Computational Grid?

• Grids link distributed scientific resources.
  – Resources can be geographically and politically distributed.

• Goal: provide means for sharing resources between organizations.

• Example “high-end” resources:
  – Supercomputers and clusters
  – Mass storage
  – Advanced visualization (CAVEs) and collaboration (Access Grid)
  – Particle colliders, telescopes, earthquake detectors

• www.globus.org/research/papers/anatomy.pdf

What Does a Grid Need?

• Multi-institutional security – PKI or Kerberos

• Information services
  – Manage, store, deliver information about resources.
  – Use information to make decisions

• Scheduling and Queuing
  – Advance reservation
  – Meta-queuing

• Remote execution, file transfer, monitoring

Example of a Grid Problem: CERN’s Large Hadron Collider

• Goes on-line in 2005
• Will generate petabytes of raw, distributed data, terabytes of event summary data.
• Computing resources for data analysis will be distributed between CERN and regional centers spread all over the world

• 1500-2000 people will collaborate on experiments.

Grid Projects

• Grid Infrastructure
  – Condor: www.cs.wisc.edu
  – Globus: www.globus.org
  – Legion: www.cs.virginia.edu/~legion

• Grid Applications
  – Netsolve: www.cs.utk.edu/netsolve
  – Ninf: www.etl.go.jp

• Global Grid Forum: www.gridforum.org

Examples of Deployed Grids

• NASA’s Information Power Grid
  – Links NASA’s Ames, Glenn, and Langley Centers.
  – LaunchPad currently available
  – www.ipg.nasa.gov

• DOE’s ASCI Distributed Resource Management
  – Links classified computing resources at Lawrence Livermore, Los Alamos, and Sandia National Labs.
  – Full deployment scheduled by Nov 2001.

Latest Grid News

• NSF will spend $53 million on the Distributed Terascale Facility (DTF)
  – 13.6 teraflops, 600 terabytes, 40 Gigabit/sec
  – DTF sites: NCSA, SDSC, Argonne, CalTech
  – Industry partners: IBM, Intel, Qwest

• See www.ncsa.uiuc.edu/News/Access/Releases for more information (August 9).

Example: Globus

• Run applications remotely:
  – globus-job-run: interactive.
  – globus-job-submit: batch for PBS, LSF.
  – globusrun: most general version (RSL).

• Split jobs between hosts.
• Send and retrieve data securely (PKI).
• Monitor jobs remotely.
• Monitor hosts remotely.

What’s the Problem?

• Globus client must be installed on desktop
  – Difficult installation
  – No ubiquitous access (PDAs, your grandmother’s PC)

• Typical solution is to support Globus at particular sites and have users remotely log in.
  – Problems arise because many users are not Unix-savvy.

• Lots of new commands to learn.

Computational Portals

• Computational portals are designed to simplify access to grid technologies.

• They also provide a coarse-grained grid approach that ties together grid and non-grid resources.
  – Not everyone uses Technology X.
  – Not everything at a TechX supporting site will use TechX.
  – Different TechX sites may remain separate.

Gateway Architecture

• Gateway is implemented in a three-tiered architecture.

• Browser Front End
  – JSP dynamically generates HTML pages.

• Component Middle Tier
  – JavaBeans on the web server.
  – Distributed WebFlow servers.

• HPC back end
  – Link to grid and non-grid services with rsh, ssh.
  – More sophisticated interfaces can be built.

[Figure: Gateway architecture. Web browsers and client applications connect over HTTP(S) to the web server and servlet engine (JVM), which hosts JavaBean service proxies and JavaBean local services. The proxies connect over SECIOP to the WebFlow master server and its child servers, which reach data storage, a Condor flock, HPC+PBS, HPC+LSF, and a Globus grid via RSH/SSH.]

Gateway Design Goals

• Build a working portal for users.
• Produce a tool chest for portal developers.
• Targeted Services:
  – File Transfer
  – Problem organization and session archiving
  – Batch script generation
  – Job submission
  – Job monitoring
  – Shared visualization
  – Security

Levels of Use

• Users and Admins can do everything through web.

• Portal developers may want to edit pages, use our components.

• Advanced developers can write modules.

[Figure: levels of use in order of increasing sophistication: Portal Users and Administrators, Portal Developers, Module Developers.]

Gateway Descriptors

How to add your codes and your hosts to the portal.

Gateway Descriptors

• Form the base of portal for any particular field.

• Collect static info about applications, hosts in an XML data record.

• Application Descriptors describe how to run codes.

• Host descriptors describe HPC systems.
• Users are described by another mechanism.

Sample Application Descriptor

<XSIL Name="ANSYS" Type="csm.parseXMLDesc">

<Param Name="NumberOfInParams">0</Param>

<Param Name="NumberOfInFiles">1</Param>

<Param Name="NumberOfOutParams">0</Param>

<Param Name="NumberOfOutFiles">1</Param>

<Param Name="IOStyle">StandardIO</Param>

Sample Host Descriptor

<XSIL Name="Modi4" Type="csm.parseXMLHost">

<Param Name="HostName">modi4</Param>

<Param Name="QueueType">LSF</Param>

<Param Name="ExecPath">/usr/bin/ansys57</Param>

<Param Name="WorkDir">/scratch</Param>

<Param Name="QsubPath">/usr/bin/bsub</Param>

Adding Your Application

• We store application and host data in a single file.
  – Applications “contain” hosts.

• You can create and edit this by hand, or
• You can use the administrator interface to edit the data record.
• The admin interface also lets you verify data.
  – Did I give the right executable path?
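A descriptor record of this shape can be read with the standard DOM parser that ships with the JDK. The sketch below is only an illustration: the closing tags, the nesting of the host inside the application, and the helper names (`parse`, `params`, `host`) are assumptions, and it is not the Gateway parseXMLBean code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class DescriptorSketch {
    // Hypothetical record: closing tags and host nesting are assumed, not taken from Gateway.
    static final String SAMPLE =
        "<XSIL Name=\"ANSYS\" Type=\"csm.parseXMLDesc\">"
      + "<Param Name=\"NumberOfInFiles\">1</Param>"
      + "<Param Name=\"IOStyle\">StandardIO</Param>"
      + "<XSIL Name=\"Modi4\" Type=\"csm.parseXMLHost\">"
      + "<Param Name=\"QueueType\">LSF</Param>"
      + "<Param Name=\"WorkDir\">/scratch</Param>"
      + "</XSIL>"
      + "</XSIL>";

    /** Parses the XML text and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed descriptor", e);
        }
    }

    /** Collects the direct <Param> children of one <XSIL> element as name/value pairs. */
    static Map<String, String> params(Element xsil) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Node n = xsil.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && "Param".equals(n.getNodeName())) {
                Element e = (Element) n;
                out.put(e.getAttribute("Name"), e.getTextContent().trim());
            }
        }
        return out;
    }

    /** Returns the nested host element with the given Name, or null if there is none. */
    static Element host(Element application, String hostName) {
        for (Node n = application.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && "XSIL".equals(n.getNodeName())
                    && hostName.equals(((Element) n).getAttribute("Name"))) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Element app = parse(SAMPLE);
        System.out.println(app.getAttribute("Name") + " " + params(app));
        System.out.println("Modi4 " + params(host(app, "Modi4")));
    }
}
```

Reading only direct children keeps application parameters separate from the parameters of the hosts they contain.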

Java Quick Start Guide

A quick and dirty overview of the Java programming language.

Basic Elements

• The Java language resembles C/C++:
  – Primitive types: int, float, double, char, boolean
  – Strings are actually classes (more later on this)
  – Standard control structures like for and while loops, if statements, case/switch statements, try/catch blocks.

• Some important differences from C/C++:
  – No pointers
  – Method arguments are always passed by value.
  – No preprocessors or macros.
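“Passed by value” deserves a quick illustration, because object variables hold references and it is the reference that gets copied. A minimal sketch (class and method names are made up for this example):

```java
final class PassByValueSketch {
    /** Changes only the method's local copy of the primitive. */
    static void increment(int n) {
        n++;
    }

    /** Rebinds only the method's local copy of the reference; the caller's variable is untouched. */
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    /** Modifies the object that both the caller and the method refer to. */
    static void mutate(StringBuilder sb) {
        sb.append(" world");
    }

    public static void main(String[] args) {
        int count = 1;
        increment(count);

        StringBuilder text = new StringBuilder("hello");
        reassign(text);
        mutate(text);

        System.out.println(count + " " + text); // prints "1 hello world"
    }
}
```

So a method can change the state of an object it receives, but it cannot make the caller's variable point at a different object.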

If/Else Statement Format

if(condition1) {
    //conditionally executed code
}
else if(condition2) {
    //conditionally executed code
}
else {
    //conditionally executed code
}

For Loops

• Syntax:

for(int i=0;i<MAX;i++) {
    //executed code
}

• MAX is a variable defined elsewhere.

Java Classes

• Java is object-oriented
  – Classes encapsulate data and methods (functions) within a single entity.
  – Objects are instances of classes.

• Analogy: the declaration “int i” creates an instance of an integer.

• The Java SDK comes with an extensive library of pre-defined classes for you to use.

• See the online API:
  – http://java.sun.com/j2se/1.3/docs/api/

Example Class: Hashtable

• Hashtable allows you to store name/value pairs.

• To create a new hashtable object:
Hashtable myhash=new Hashtable();

• You can now use Hashtable methods:
myhash.put("MyName","Marlon");
String name=(String)myhash.get("MyName");

References

• The Java web site has API documentation and tutorials:
  – http://java.sun.com/j2se/1.3/docs/

• Excellent reference text:
  – “Core Java” Volumes I and II by Cay Horstmann and Gary Cornell (Prentice Hall)

• O’Reilly publishes the API:
  – “Java in a Nutshell” by David Flanagan

Interactive HTML

Using HTML forms to tie widgets to server actions.

The <form> Tag

• The <form> … </form> tag pair surrounds all HTML input types.

• Format:
<form name="myform" method="Post" action="/GOW/servlet/someAction">
… <!-- Input tags go here -->
</form>

• The “action” attribute specifies what happens when an input button is pressed.
  – Can be CGI, a servlet, or a JSP page

The <input> Tag

• Input tags define text fields, submit buttons, radio buttons, menus, …

• Several can be combined within a single <form>

• Format:
<form method="GET" action="servlet/myServlet">
<input type="text" name="text" value="Sample">
<input type="submit">
</form>

Putting It All Together

<html>
<body>
<form action="someaction" method="Post">
Please type your name:
<input type="text" name="myname" value="Marlon">
<input type="submit">
</form>
</body>
</html>

test.html

What Happens When I Click the Button?

• The CGI Script/Servlet/JSP/… specified in the action receives all name/value fields of <input>.

• These are sent in the HTTP request to the server.
• Usually you should use “Post” instead of “Get”
  – No size limit on requests with Post
  – Requests are not shown in the browser location field.

• The server-side code usually returns output to the browser.
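The name/value fields travel as one URL-encoded string: after the “?” in the URL for Get, in the request body for Post. A servlet engine decodes this for you; the sketch below only shows what that decoding amounts to, using the JDK's `URLDecoder`. The class name and sample field values are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormDataSketch {
    /** Decodes an application/x-www-form-urlencoded string into name/value pairs. */
    static Map<String, String> parse(String encoded) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (encoded.isEmpty()) {
            return fields;
        }
        for (String pair : encoded.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(decode(name), decode(value));
        }
        return fields;
    }

    /** Turns "+" back into spaces and %XX escapes back into characters. */
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // What the browser would send for two input fields named "myname" and "code".
        System.out.println(parse("myname=Marlon+Pierce&code=ANSYS"));
    }
}
```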

JavaServer Pages (JSP)

Putting Java into HTML to build dynamic web pages.

What Are JavaServer Pages?

• JSPs let you embed Java code into your HTML web pages.
  – Use the .jsp extension

• When the browser requests the page, the server translates the JSP into a servlet, executes it, and sends back the output.

• To run this, you need a special server
  – Apache’s Tomcat, IBM’s WebSphere, …

Embedding Scriptlets

<%@ page import="java.util.Date" %>
<%@ page import="java.util.Hashtable" %>
<html>
<body>
Hello, Marlon
<%
Hashtable Myhash=new Hashtable();
Date now=new Date();
Myhash.put("date",now);
%>
The time is <%= now.toString() %>
<!-- More HTML and scriptlets to follow. -->
</body>
</html>

What Does It Mean?

• The “import” directives at the top name the Java classes that the page uses.

• Everything between <% and %> is interpreted as Java.
  – These sections are called scriptlets.

• Java can be in-lined with HTML using the <%= %> tags.
  – These are referred to as expressions.

Ex: For Loop of Radio Buttons

<% for (int i=0;i<5;i++) { %>
Radio<%= i %> <input type="radio">
<% } %>
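It can help to picture what the servlet engine makes of the scriptlet loop above: the HTML between scriptlets becomes output statements inside the Java loop. The sketch below is a plain-Java approximation, not actual Tomcat output; the class and method names are invented for the example.

```java
final class RadioLoopSketch {
    /** Rough equivalent of the JSP loop: template text and expressions become appends. */
    static String render(int count) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < count; i++) {
            out.append("Radio");                      // template text
            out.append(i);                            // <%= i %>
            out.append(" <input type=\"radio\">\n");  // template text
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(5));
    }
}
```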

Using JavaBeans in JSP

• As presented so far, JSP still requires extensive knowledge of Java API.
  – O’Reilly’s Java API “Nutshell” is 600 pages.

• JavaBeans are custom components that encapsulate specific sets of functions.
  – Develop a small set of classes for area-specific tasks.

• It is good design to separate display from control code so that each is reusable.
  – You don’t want sprawling JSP pages.

Separation of Responsibility

[Figure: separation of responsibility. Portal users define functionality and look and feel (L&F); web developers work on L&F; Java programmers develop beans.]

JavaBeans in JSP

• Create an instance “gem” of the gemBean class:
<jsp:useBean id="gem" class="gem.gemBean" scope="session"/>

• You can now use “gem” like any other object:
gem.loadData();
gem.runSim();

• By setting the scope, pages can share beans.
• You can also use this tag to initialize once.
• Other HTML-like tags exist for accessing data.
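For orientation, here is roughly what a bean behind such a tag looks like. This is a hypothetical class, not the gemBean source: the property names and the bodies of `loadData()` and `runSim()` are placeholders. The bean conventions themselves are standard: a public no-argument constructor and get/set methods for each property.

```java
import java.io.Serializable;

// Hypothetical bean. For real JSP use it would be a public class in its own file,
// inside the package named by the useBean tag's class attribute.
class SimBean implements Serializable {
    private static final long serialVersionUID = 1L;

    private String problemName = "";
    private boolean dataLoaded;
    private int runs;

    /** Beans need a public no-argument constructor so the container can create them. */
    public SimBean() {
    }

    public String getProblemName() {
        return problemName;
    }

    public void setProblemName(String problemName) {
        this.problemName = problemName;
    }

    public boolean isDataLoaded() {
        return dataLoaded;
    }

    public int getRuns() {
        return runs;
    }

    /** Placeholder for reading input data. */
    public void loadData() {
        dataLoaded = true;
    }

    /** Placeholder for launching a simulation; only counts the runs here. */
    public void runSim() {
        if (!dataLoaded) {
            throw new IllegalStateException("call loadData() first");
        }
        runs++;
    }

    public static void main(String[] args) {
        SimBean gem = new SimBean();
        gem.setProblemName("demo");
        gem.loadData();
        gem.runSim();
        System.out.println(gem.getProblemName() + " runs=" + gem.getRuns());
    }
}
```

With session scope, every page in the same browser session sees the same instance, so state set on one page (here the problem name) is available on the next.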

Overview of Gateway JSPs

• Welcome Page
  – Sets up most beans
  – Buttons are included in TrackNavigator.jsp

CodeSelect.jsp

• Codes are read in from the Application Descriptor file.

• The page is generated automatically from the descriptor.

• Problem name is mapped to user context directory, where session data will be stored.

JobSubmit.jsp

• Based on selected code, forms are generated automatically.

• Application Descriptor file specifies the number of input files, parameters, etc.

Submitted.jsp

• Shows the generated queue script, based on user requests.

• The user has one last chance to edit.

• The “Submit” button can be tied to an action to run the script.

ReturnPage.jsp

• Job has been submitted.

• The track navigator is again included at the bottom of the page.

Gateway Bean Classes

An overview of the Bean classes that can be used to build portals.

Gateway Architecture

• We have developed a number of service beans for computational portals.

• Some accomplish specific tasks on server.

• Others act as proxies to WebFlow modules (next section).

Context Data

• Gateway organizes user sessions into “problems” and “sessions.” A problem contains one or more sessions.

• All of this is called Context data. It maps to a directory on the server.

• All information gathered from the user is stored as name value pairs in the appropriate subdirectory.
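The idea of context data as a directory tree of name/value pairs can be sketched with `java.util.Properties`. Everything specific here is an assumption made for illustration: the user/problem/session directory layout, the `session.properties` file name, and the method names. The actual Gateway storage format may differ.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

final class ContextDataSketch {
    private static final String FILE = "session.properties"; // hypothetical file name

    /** Maps a user, problem, and session to a directory under the context root. */
    static Path sessionDir(Path root, String user, String problem, String session) {
        return root.resolve(user).resolve(problem).resolve(session);
    }

    /** Adds or updates one name/value pair in the session directory. */
    static void store(Path dir, String name, String value) {
        try {
            Files.createDirectories(dir);
            Properties props = load(dir);
            props.setProperty(name, value);
            try (OutputStream out = Files.newOutputStream(dir.resolve(FILE))) {
                props.store(out, "context data");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back all pairs stored for the session; empty if nothing was stored. */
    static Properties load(Path dir) {
        Properties props = new Properties();
        Path file = dir.resolve(FILE);
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("context");
        Path dir = sessionDir(root, "marlon", "Wing_Stress", "session1");
        store(dir, "code", "ANSYS");
        store(dir, "host", "modi4");
        System.out.println(dir + " -> " + load(dir));
    }
}
```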

ContextManagerBean

• Contains convenience methods for finding old problems and sessions, creating new ones, deleting old ones, etc.

• Common Methods: too many to list. Come to the lab or see the documentation
  – www.gatewayportal.org/DOC/index.html

moduleServerBean

• Hides the messy details of connecting to WebFlow and getting an instance of the module you want.

• Creates instances of all WebFlow modules, provides accessor methods for them.

• So to get the submitJob module, I just use
submitJob sj=modserver.getSubmitJob();
in my JSP page.

parseXMLBean

• Parses the Application Descriptor data record.

• Provides specific getters for hosts, applications.

• Provides general getters for other parameters:
  – getCodeTagValue("ANSYS","IOStyle");

createScript

• This is an abstract superclass of script generators.
  – Extend it with createPBS.java, createLSF.java, createCSH.java, etc.

• Actual class created at runtime with scriptFactory.java.
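The superclass-plus-factory arrangement can be sketched as follows. The class and method names are invented for this example and are not the Gateway createScript API; the only scheduler details used are the standard job-name directives (`#PBS -N`, `#BSUB -J`).

```java
/** Shared script layout; subclasses supply the scheduler-specific header. */
abstract class ScriptGenerator {
    abstract String header(String jobName);

    /** Builds the full script: shell line, scheduler header, then the common body. */
    final String generate(String jobName, String workDir, String command) {
        return "#!/bin/sh\n"
            + header(jobName)
            + "cd " + workDir + "\n"
            + command + "\n";
    }
}

class PbsGenerator extends ScriptGenerator {
    @Override
    String header(String jobName) {
        return "#PBS -N " + jobName + "\n";
    }
}

class LsfGenerator extends ScriptGenerator {
    @Override
    String header(String jobName) {
        return "#BSUB -J " + jobName + "\n";
    }
}

final class ScriptFactorySketch {
    /** Picks the generator at runtime from the host descriptor's QueueType value. */
    static ScriptGenerator forQueue(String queueType) {
        switch (queueType) {
            case "PBS":
                return new PbsGenerator();
            case "LSF":
                return new LsfGenerator();
            default:
                throw new IllegalArgumentException("Unknown queue type: " + queueType);
        }
    }

    public static void main(String[] args) {
        // Values borrowed from the sample host descriptor; input.dat is a made-up file name.
        System.out.print(forQueue("LSF").generate("ansys-run", "/scratch", "/usr/bin/ansys57 < input.dat"));
    }
}
```

Pages only ever see the superclass, so supporting a new queuing system means adding one subclass and one factory case.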

setPropBean

• JSPs communicate by sending HTTP requests to each other.

• We have many name/value pairs to write to the Context data directory.

• setPropBean provides automating methods to remove drudgery and cut out page bulk.

• Other JSPs can recover data using ContextManager.

[Figure: JSPs exchange HTTP requests; setPropBean writes their name/value pairs to the Context data, and ContextManager reads them back.]

Miscellaneous Beans

• jobInfoBean: convenient wrapper around a hashtable for storing name/value strings.

• nameEncodeBean: inserts/removes underscores in problem names. Used to create unix directory names.

• GetFileBean: reads/writes script files to disk, filters out control characters.
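As a small illustration of the name-encoding idea, here is one way to turn a problem name into a Unix-friendly directory name and back. The rule used (spaces become underscores) is an assumption; the slides do not spell out nameEncodeBean's exact behavior, and the class name is invented.

```java
final class NameEncodeSketch {
    /** Turns a display name into a directory name: "Wing Stress Test" -> "Wing_Stress_Test". */
    static String encode(String problemName) {
        return problemName.trim().replace(' ', '_');
    }

    /** Reverses encode() for names that contained no underscores to begin with. */
    static String decode(String directoryName) {
        return directoryName.replace('_', ' ');
    }

    public static void main(String[] args) {
        String dir = encode("Wing Stress Test");
        System.out.println(dir + " <-> " + decode(dir));
    }
}
```

Note the round trip is only exact when the original name has no underscores of its own.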

Page Control

• Page flow is controlled by the servlet GOWAdminServlet.java.
  – Pages call this servlet, which invokes the next page.

• The servlet receives the request from page A, looks up the next page to display, and shows it.

Commands

• Commands are classes that implement a simple “Command” interface.
  – Must implement the execute() method.

• ForwardCommand: Simplest case. Just forwards control to the specified page.

• SubmitCommand: Assembles and executes a remote command to run a job before displaying the next page.
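The command pattern described here can be sketched in plain Java, without the servlet API. The signatures, the request keys (`host`, `qsubPath`, `script`), and the dispatcher class are assumptions for illustration; the classes below borrow the Gateway names but are not the Gateway sources, and the submit step only assembles a command string rather than running it.

```java
import java.util.HashMap;
import java.util.Map;

interface Command {
    /** Performs the action for this request and returns the next page to display. */
    String execute(Map<String, String> request);
}

/** Simplest case: no work, just name the next page. */
final class ForwardCommand implements Command {
    private final String nextPage;

    ForwardCommand(String nextPage) {
        this.nextPage = nextPage;
    }

    @Override
    public String execute(Map<String, String> request) {
        return nextPage;
    }
}

/** Assembles a remote command line before handing over to the next page. */
final class SubmitCommand implements Command {
    private final String nextPage;
    private String lastCommandLine = "";

    SubmitCommand(String nextPage) {
        this.nextPage = nextPage;
    }

    @Override
    public String execute(Map<String, String> request) {
        // A real command would run this on the remote host; the sketch only builds it.
        lastCommandLine = "ssh " + request.get("host") + " "
            + request.get("qsubPath") + " " + request.get("script");
        return nextPage;
    }

    String lastCommandLine() {
        return lastCommandLine;
    }
}

/** Stands in for the controlling servlet: looks up the command for an action and runs it. */
final class PageControlSketch {
    private final Map<String, Command> commands = new HashMap<>();

    void register(String action, Command command) {
        commands.put(action, command);
    }

    String dispatch(String action, Map<String, String> request) {
        Command command = commands.get(action);
        if (command == null) {
            throw new IllegalArgumentException("No command for action: " + action);
        }
        return command.execute(request);
    }

    public static void main(String[] args) {
        PageControlSketch control = new PageControlSketch();
        control.register("next", new ForwardCommand("JobSubmit.jsp"));
        System.out.println(control.dispatch("next", new HashMap<>()));
    }
}
```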

WebFlow Modules

An overview of how to use existing modules and how to write your own.

The Role of WebFlow

• WebFlow servers can distribute portal services over many hosts.

• WebFlow can do this because it is hierarchical:
  – Single parent acts as gatekeeper for child servers.
  – Ex: Run main server at FSU, child server at NCSA to provide access to remote file system.

WebFlow Design

• WebFlow is a custom-built component system.
  – Implements JavaBeans spec using CORBA

• Servers contain “contexts” (abstract containers) and “modules”.
  – Contexts are organizational, can be remote (i.e. child servers).
  – Modules are CORBA implementation files.

Configuring WebFlow Servers

• WebFlow servers are configured with text files.
• Header:
  – Name of server
  – File to write IOR if it is a master server
  – Parent
  – URL of IOR file (if child).

• List of provided modules follows:
  – Name
  – Location of interface (IDL or XML)
  – Java package name of module

Some Standard Modules

• submitJob: executes external local and remote commands (rsh, ssh), moves files to and from remote systems (rcp, scp).

• remotefile: moves files between client and server machines.

• ContextManager: can manage remote contexts. Uses two helper modules.

• Charon: http security module.

Using Modules

• Modules are just Java classes.
  – API on the web at www.gatewayportal.org/DOC/index.html.

• Get instance in JSP page using moduleServerBean.
  – You can now invoke the object’s methods on the remote server as if they were local.

Developing Modules

• Develop IDL interface (list of methods)
• Must compile IDL with Orbacus’s jidl. Generates CORBA stubs, skeletons.
• Write a Java implementation file
  – Defines methods of the interface.

• Compile it all.
• Add it to the appropriate server’s configuration file.
• Modify moduleServerBean to make it available to the JSP pages.

IDL Boilerplate

#ifndef _WEBFLOW_
#include "../BC.idl"
#endif

module WebFlow {
  module myModule {
    interface myModule : BeanContextChild {
      // Insert your methods here.
      void test();
      string execCommand(in string command);
      …
    };
  };
};

Implementation File Boilerplate

package WebFlow.myModule;

public class myModuleImpl extends WebFlow.BeanContextChildSupport
    implements myModuleOperations {

    String msg_;
    org.omg.CORBA.Object peer;

    public myModuleImpl(org.omg.CORBA.Object peer, String msg)
        throws WebFlow.NullPointerException {
        super(peer);
        this.peer=peer;
        //Assign the field; declaring a local "String msg_" here would shadow it.
        msg_=msg;
    }

    //Your method definitions go here.
}
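To make “your method definitions go here” concrete, the sketch below fills in the two methods from the IDL boilerplate in plain Java, leaving out the CORBA and WebFlow base classes so that it runs on its own. The `MyModuleOperations` interface stands in for what jidl would generate (IDL `string` maps to Java `String`); the bodies, the `sh -c` invocation, and the class names are assumptions, not the Gateway submitJob code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Stand-in for the generated operations interface. */
interface MyModuleOperations {
    void test();

    String execCommand(String command);
}

final class MyModuleSketch implements MyModuleOperations {
    private int testCalls;

    /** Trivial liveness check: just counts how often it was called. */
    @Override
    public void test() {
        testCalls++;
    }

    int testCalls() {
        return testCalls;
    }

    /** Runs the command through a Unix shell and returns its combined output. */
    @Override
    public String execCommand(String command) {
        try {
            Process process = new ProcessBuilder("sh", "-c", command)
                .redirectErrorStream(true)
                .start();
            StringBuilder output = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            process.waitFor();
            return output.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }

    public static void main(String[] args) {
        MyModuleSketch module = new MyModuleSketch();
        module.test();
        System.out.print(module.execCommand("echo hello"));
    }
}
```

Because a module like this executes whatever string it is given, it is exactly the kind of service that the security measures in the next section need to protect.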

Web Portal Security

A review of some security issues and some minimal recommendations.

Multi-tiered Security

• Multiple tiers require security between and within each tier.

• Security issues:
  – Authentication
  – Authorization
  – Privacy

• Implementing these end-to-end is a challenge.

Some Minimal Security Suggestions

• Use SSL-enabled Apache web server.

• Disable remote access to Tomcat.

• Use multiple authentication methods
  – HTTP Authentication
  – Client certificates

• Use ssh or kerberized rsh, not plain rsh.

• Put on test bed first, log all usage.

Next Steps

• Add meta-job descriptors to provide better links between HPC and visualization.

• Improved 3D graphics for remote visualization.

• Component interfaces to Condor and Globus.
  – Globus CoG kits are available for Java.
  – GPDK provides Bean bridge to CoG.

Some Resources

• Gateway web site: www.gatewayportal.org.
  – All materials and software can be downloaded from here.

• Grid Computing Environments: www.computingportals.org.

• My contact info:
  – Email: pierceme@asc.hpc.mil
  – Phone: (937)904-5140

Coda: Topics for Lab Session

Hands-on activities for Thursday’s lab.

Lab Topics

• Installing and configuring Tomcat.

• Installing and configuring Apache with SSL.

• Downloading, configuring, and running WebFlow.

• Modifying GEM sample portal JSP pages.

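[Figure: security components. On the desktop client, the browser and Charon client; on the remote server, the web server and servlet engine, the Charon module, and the WebFlow server. Connections are labeled HTTP(S), HTTP, and SECIOP.]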

[Figure: page control. JavaServer Pages send requests to the administrative servlet, which uses the Command interface (SubmitCommand, ForwardCommand) and returns the response as the next JavaServer Page.]

[Figure: script generation. ScriptFactory creates PBS, GRD, and LSF script generators, all derived from the script generator superclass, which produce PBS, GRD, and LSF scripts.]

[Figure: WebFlow hierarchy. The WebFlow parent server holds proxy images A and B of child server A and child server B, along with modules and user contexts.]