Transcript of: Interactive European Grid: An interoperable infrastructure targeting interactivity, visualization and parallelism (Dr. Isabel Campos Plasencia, on behalf of the int.eu.grid team)

Page 1: Interactive European Grid: An interoperable infrastructure targeting interactivity, visualization and parallelism

Dr. Isabel Campos Plasencia (on behalf of the int.eu.grid team)
Instituto de Física de Cantabria, IFCA (Santander)
Consejo Superior de Investigaciones Científicas (CSIC)

EGEE User Forum, Manchester, 9th – 11th May 2007

Page 2: The Interactive European Grid

Project acronym: int.eu.grid
Contract number: 031857
Instrument: I3
Duration: 2 years (May 2006 – April 2008)

Coordinator: Jesús Marco de Lucas, CSIC

“providing transparently the researcher’s desktop with the power of a supercomputer, using distributed resources”

http://www.interactive-grid.eu

Page 3: Outline of the presentation

Objectives & challenges of int.eu.grid
Applications requirements
Middleware versus applications: MPI support, interactive steering, visualization
Example: Open MPI and grid visualization

Page 4: The challenge of a stable infrastructure: int.eu.grid

From the applications point of view:
Analyze the requirements of the reference applications
Ensure that the middleware copes with the demands of the reference applications
Application porting support
Promote collaborative environments such as AccessGrid

From the infrastructure point of view:
Operate a production-level infrastructure 24/7
Support Virtual Organizations at all levels, including running the VO (user support)

From the middleware point of view:
Parallel computing (MPI): support intra-cluster jobs with Open MPI and inter-cluster jobs with PACX-MPI
Advanced visualization tools allowing simulation steering (GVid, glogin)
A job scheduler that supports it all
A user-friendly interface to the grid supporting all these features, integrating them in the Migrating Desktop

Page 5: This is our infrastructure

Page 6: Applications Requirements

Understanding the application: user input to the NA team
Description in terms of area of knowledge and state of the art
Results expected and impact on the scientific community
Understanding the computational approach at the algorithmic level

Resources needed:
Software & hardware
Grid services

Grid added value: why on the grid?
Interactive environment
Graphics & visualization
Quality of Service and network reliability

Page 7: Project Pilot Applications

Astrophysics
Fusion
Medical Imaging
Environment

Page 8: Applications in Environmental Research

IMS Model Suite: evolution of pollution clouds in the atmosphere
[Figure: simulated pollution cloud; vertical axis shows height above surface in m]

Page 9: Analysis of CMB maps (Astrophysics)

Page 10: Pattern: Requirements for Middleware

Distributing the task among N processors → MPI support
The job should be started immediately from the user desktop → interactive MPI job scheduling
The graphical interface should be forwarded to the user desktop → graphical interface to the grid (Migrating Desktop) supporting visualization (GVid)
The user should be able to steer the simulation → real-time steering (glogin)

Page 11: Grid MPI Support

Why MPI support?
The standard API for distributed-memory parallelisation
Write once, run everywhere
This is what the applications are built on

What MPI is:
An API: a description of the semantics, but NOT the implementation
Almost platform independent (modulo problems with MPI-IO)

What MPI is NOT:
There is no implementation
No specification of how to start the processes, how to get the binary onto the remote sites, or how to start the binaries on the remote sites (ssh, PBS, …); see the sketch below
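A minimal sketch of the conventional, single-cluster workflow that every MPI implementation handles in its own way (the compiler wrapper and launcher are the usual ones shipped with Open MPI/MPICH; file and host names are illustrative):

# build the application with the local MPI compiler wrapper
mpicc -O2 -o hello_mpi hello_mpi.c

# on a plain cluster the user (or the batch system) knows the host list
# and starts the processes explicitly
cat > machines <<EOF
node01
node02
EOF
mpirun -np 2 -machinefile machines ./hello_mpi

It is exactly this last step, obtaining the host list and launching the binaries on remote nodes, that has no portable equivalent on the grid.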

Page 12: MPI Support (continued)

Many issues about handling MPI job types that are already worked out for Linux clusters, supercomputers, etc. have to be addressed in a particular way when running MPI on the grid.

Page 13: Problems of MPI Support on the Grid

• There is no standard way to start an MPI program
No common syntax for mpirun
MPI-2 defines mpiexec as the starting mechanism, but support for mpiexec is only optional
Resource Brokers should handle different MPI implementations
Different schedulers and different MPI implementations at each site have different ways to specify the machinefile (see the sketch below)

• Non-shared filesystems (Oh!)
Many grid sites don't have support for a shared home directory
Many MPI implementations expect the executable to be available on the nodes where the processes are started
Mixed setup in general: some sites have shared filesystems, some do not
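As an illustration of the machinefile problem mentioned above, a minimal sketch of how the host list reaches a job under two common local schedulers (the environment variables are the schedulers' own; everything else is illustrative and simplified):

#!/bin/bash
# PBS/Torque exports the allocated nodes directly:
if [ -n "$PBS_NODEFILE" ]; then
    MACHINEFILE="$PBS_NODEFILE"
# (Sun) Grid Engine provides a host file in a different format
# (hostname, slots, queue, ...), so it has to be converted first:
elif [ -n "$PE_HOSTFILE" ]; then
    MACHINEFILE=$(mktemp)
    awk '{print $1}' "$PE_HOSTFILE" > "$MACHINEFILE"
fi
NP=$(wc -l < "$MACHINEFILE")
# ... and each MPI implementation expects this list in its own way:
mpirun -np "$NP" -machinefile "$MACHINEFILE" ./my_app

Hiding this kind of site-specific glue is precisely what motivates the intermediate layer introduced on the next slides.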

Page 14: MPI Support in Grid Environments

In grid environments there are two possible cases:
Intra-cluster jobs: all processes run on the same cluster
Inter-cluster jobs: processes are distributed across several clusters/sites

[Diagram: ranks 1 … SIZE in MPI_COMM_WORLD, communicating via point-to-point and collective operations]

Page 15: Grid scheduler language needs "translation" to local scheduler syntax

[Diagram: UI, Resource Broker (RB), CE and WN, with the general grid scheduler on one side and the local scheduler on the other; each layer is asked whether it should do the translation]

Translate? NO: the RB cannot be updated often without compromising the whole job-submission procedure
Translate? YES, but how?
Translate? NO, of course!

Page 16: Problems of MPI Support on the Grid

Our solution, an intermediate layer: mpi-start

[Diagram: the Resource Broker talks to MPI-START, which in turn drives the local scheduler and the MPI implementation]

Page 17: mpi-start

Goals:
Hide the differences between MPI implementations
Hide the differences between local scheduler implementations
Support simple file distribution, hiding the filesystem details (shared or non-shared) from the user
Provide a simple but powerful enough single interface for the Resource Broker to specify MPI jobs, so that the Resource Broker does not have to contain hardcoded MPI support (see the sketch below)
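A minimal sketch of how a job wrapper can drive mpi-start purely through environment variables. Only the two hook variables appear later in this talk (Page 20); the other I2G_MPI_* names follow the same convention but are quoted here as an assumption and should be checked against the mpi-start documentation:

#!/bin/bash
# which MPI flavour to use and what to run (values illustrative)
export I2G_MPI_TYPE=openmpi
export I2G_MPI_APPLICATION=IMB-MPI1
export I2G_MPI_APPLICATION_ARGS=pingpong
# optional user hooks, as on Page 20
export I2G_MPI_PRE_RUN_HOOK=./MyHooks.sh
# mpi-start detects the local scheduler, builds the machine list,
# distributes files if there is no shared filesystem, and finally
# invokes the right mpirun/mpiexec.
# I2G_MPI_START is assumed to point at the mpi-start executable on the WN.
$I2G_MPI_START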

Page 18: mpi-start: design

[Diagram: the mpi-start core, portable bash scripting, with plugins for local schedulers (PBS, SGE), for MPI implementations (Open MPI, PACX-MPI, MPICH), and hooks for filesystem handling]

Page 19: MPI Job Example

JDL for a 16-process Open MPI job (a submission sketch follows below):

Executable    = "IMB-MPI1";
Arguments     = "pingpong";
JobType       = "Parallel";
JobSubType    = "openmpi";
NodeNumber    = 16;
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out","std.err"};
InputSandbox  = {"IMB-MPI1"};

On the worker nodes this is eventually translated into something like:

mpirun -machinefile $TMP/machines -np 16 IMB-MPI1 pingpong
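For completeness, a submission sketch for the JDL above. int.eu.grid users would normally go through the Migrating Desktop (Pages 31–33), so the plain gLite-style command-line client shown here is illustrative only, and the VO name is a placeholder:

# on a user interface, with the JDL saved as imb.jdl
voms-proxy-init --voms <your_vo>            # create a VOMS proxy
glite-wms-job-submit -a -o job.id imb.jdl   # submit, keep the job id
glite-wms-job-status -i job.id              # follow the job
glite-wms-job-output -i job.id              # retrieve std.out / std.err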

Page 20: MPI Job Example: hooks

Include in the JDL the following (the script defines the hook functions; only pre_run_hook is shown on the slide):

InputSandbox = {"MyHooks.sh", ....};
Environment  = {"I2G_MPI_PRE_RUN_HOOK=./MyHooks.sh",
                "I2G_MPI_POST_RUN_HOOK=./MyHooks.sh"};

# cat MyHooks.sh
pre_run_hook () {
    echo "pre run hook called"
    wget www.myhome.xx/mysources.tgz
    tar xzvf mysources.tgz
    make …
    return 0
}
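Since the JDL also registers the same script as I2G_MPI_POST_RUN_HOOK, the script would typically define a post_run_hook as well (assuming mpi-start looks for a function of that name, mirroring pre_run_hook; the body below is illustrative and not shown on the slide):

post_run_hook () {
    echo "post run hook called"
    # e.g. pack whatever the run produced so it can be collected
    tar czvf results.tgz *.dat    # *.dat is an assumed output pattern
    return 0
}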

Page 21: Dissemination Effort

School organized in Dublin at TCD
Course including grids and MPI support
Hosted by TCD (Brian Coghlan)
Date: end of June 2007

Page 22: MPI Support in Grid Environments

For inter-cluster jobs we support PACX-MPI, a middleware for seamlessly running an MPI application on a network of parallel computers
PACX-MPI is an optimized, standard-conforming MPI implementation; the application just needs to be recompiled (!)
PACX-MPI uses the locally installed, optimized vendor implementations for cluster-internal communication

[Diagram: one application running as a single PACX-MPI job spanning Cluster 1 and Cluster 2, on top of each cluster's local Open MPI]

Page 23: PACX-MPI Design

A grid site has in general the following topology:
CE = Computing Element (head node), public IP
WN = Worker Nodes, private IP

Requirements:
Connectivity of the CE to the clusters, and start-up of the daemons
Files: application & input files
Start of the daemons on the CE; ssh connectivity to the CE

[Diagram: worker nodes with private IPs behind the CE, which runs the two extra daemon processes]

An MPI job requesting N processes per cluster spawns N+2 processes; two of them run on the CE as daemons, making the bridge between clusters.

Page 24: PACX-MPI Design

External communication is handled via the Computing Element, the only node with a public IP
TCP/IP daemons do the job

[Diagram: two clusters with their local MPI ranks; the CE daemons on each side relay the inter-cluster traffic]

Page 25: Example: Visualization of plasma in fusion devices

• The application visualizes the behaviour of plasma inside a fusion device
• Runs are foreseen as part of a so-called Fusion Virtual Session
• The plasma is analyzed as a many-body system consisting of N particles
• Inputs:
Geometry of the vacuum chamber
Magnetic field in the environment
Initial number, position, direction and velocity of the particles
Possibility of collisions between particles
Density of particles inside the device
• Solves a set of stochastic differential equations with a Runge-Kutta method
• Outputs:
Trajectories of the particles
Averages of relevant magnitudes: densities, temperatures, ...

TJ-II Stellarator at CIEMAT (Spain)
Graphical display using OpenGL, with interactive capabilities

Page 26: Porting the application to int.eu.grid

Spread the calculation over hundreds of worker nodes on the grid to increase the number of particles in the plasma
Design of a grid collaborative environment for fusion device design and analysis
N particles distributed among P processes: MPI
Particle trajectories are displayed graphically
Interactive simulation steering

Uses most of the capabilities of the int.eu.grid middleware

Page 27: Global Schema

MPI + interactive + visualization

Page 28: Middleware for Visualization & Steering

• glogin: lightweight tool for supporting interactivity on the grid
Grid-authenticated shell access: "glogin host"
No dedicated daemon such as sshd is needed
TCP port forwarding enables access to grid worker nodes with private IPs
X11 forwarding

• GVid: Grid Video Service
Visualization can be executed remotely on a grid resource
Transmits the visualization output to the user desktop
Communicates the interaction events back to the remote rendering machine
Uses glogin as a bi-directional communication channel

Our middleware is based on the combination of a grid video streamer with an interactive, grid-enabled login tool.

Page 29: Fusion Application MPI Schema

[Diagram: processes P0–P3]

MPI job distribution: independent processes
Master P0 does the rendering
MPI synchronization
Every process does its own I/O

Page 30: The user interacts with the master process for visualization and steering

[Diagram: on the user side, a Java GVid decoder displays the stream on the user screen and intercepts keyboard/mouse events; on the grid side, the master process P0 runs the GVid encoder and event reception, alongside workers P1–P3]

Page 31: Graphical Interface to Job Submission

Using the Migrating Desktop

Page 32: Running on the Migrating Desktop

Job monitoring

Job logs & details

Page 33: See our DEMO: Simulation steering on the Migrating Desktop

Page 34: Some related events