Development of GRID environment for interactive applications

Jesús Marco de Lucas (marco@ifca.unican.es)
Instituto de Física de Cantabria, IFCA
Consejo Superior de Investigaciones Científicas, CSIC, Santander, SPAIN

DATAGRID DISSEMINATION DAY 14-V-2003 BARCELONA

Jesús Marco de Lucas DataGrid Dissemination Day (Barcelona 14-V-2003)

The EU CrossGrid Project

European Project (~5 M€, 3-year project started March 2002)
Proposed to CPA9, 6th IST call, V FP
Polish (Cracow & Poznan) / Spanish (CSIC & CESGA) / German (FZK) initiative with the support of CERN (thanks to Fab!)
CYFRONET (Cracow) is the coordinator of the project (Michal Turala, project leader)

Objectives:
Extension of GRID in Europe, assuring interoperability with DataGrid
Interactive Applications (“human in the loop”):
• Environmental fields (meteorology/air pollution, flooding crisis management)
• High Energy Physics (interactive analysis over distributed datasets)
• Medicine (vascular surgery preparation)

Need:
• Develop corresponding middleware and tools
• Deploy on a pan-European testbed

Partners:
Poland (CYFRONET, PSNC, ICM, INP, INS), Spain (CSIC: IFCA, IFIC, RedIRIS, UAB, USC), Germany (FZK, USTUTT, TUM), Slovakia (II SAS), Ireland (TCD), Portugal (LIP), Austria (U.Linz), The Netherlands (UvA), Greece (DEMO, AuTH), Cyprus (UCY)
Industry: Datamat (I), Algosystems (Gr)

Surgical Planning

Problem: vascular diseases
Solution: placement of a bypass by a surgeon

Planning for intervention is based on 3D images obtained from MRI or CT scans.

The attainable improvement in blood flow should determine which possibility is the best for a particular patient.

A 3D arterial model is built on the basis of the images, and presented to the surgeon in an immersive, intuitive environment.

[Figures: A CT scanner; Stenosis (narrowing of an artery); Viewing the arterial structure in an immersive 3D environment; Observation]

Surgical Planning

Goal: Simulate vascular reconstruction

Method: Interactive Virtual Reality Environment to
• View scanned data
• Define proposed interventions
• View simulation results

Advanced fluid code to simulate flows

[Figures: Arterial structures from scans with proposed bypasses; Simulated flows]

Need Grid in interactive mode (the surgeon should not wait long…)

Access distributed computational resources for flow simulation and visualization, to obtain a high-performance environment at low cost
• Distribute simulations for different bypass configurations
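The idea of distributing one simulation per bypass configuration can be sketched as follows. This is only an illustration under stated assumptions, not the CrossGrid code: `simulate_flow` is a hypothetical stand-in for the fluid code (a toy formula in which flow scales with the fourth power of the radius, as in Poiseuille's law for laminar flow), and a local thread pool stands in for grid resources.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_flow(bypass_radius_mm: float) -> float:
    """Hypothetical stand-in for the flow simulation.

    Toy model: relative flow scales with the fourth power of the radius,
    normalized so that a 3 mm bypass gives a relative flow of 1.0.
    """
    return (bypass_radius_mm / 3.0) ** 4


def rank_configurations(radii_mm):
    """Run one simulation per candidate configuration in parallel and
    return (radius, relative_flow) pairs sorted best-first."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        flows = list(pool.map(simulate_flow, radii_mm))
    return sorted(zip(radii_mm, flows), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for radius, flow in rank_configurations([2.0, 2.5, 3.0]):
        print(f"radius {radius} mm -> relative flow {flow:.2f}")
```

Because the candidate configurations are independent of each other, they can run concurrently, which is what keeps the surgeon's waiting time short.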

Flood management

Problem: flooding crisis in Slovakia

Solution: monitoring, forecasting, simulation, real-time actions

Precipitation forecasts based on meteorological simulations of different resolution from the meso-scale to the storm-scale.

For flash floods, high-resolution (1 km) regional atmospheric models have to be used along with remote sensing data (satellite, radar)

From the quantitative precipitation forecast, hydrological models are used to determine the discharge from the affected area.

Then hydraulic models simulate water flow through various river structures to predict the impact of the flood

Crisis management teams should consult various experts, before making any decisions. The experts should be able to run simulations with different parameters and analyze the impact (“what-if” analysis).

[Figure labels: monitoring, forecasting, simulation]

Flood management

Goal: Flooding risk prediction

Method: Cascade of simulations
• Meteorological
• Hydrological
• Hydraulic

Virtual Organization

Need Grid in interactive mode (simulation results for “what-if”)

Seamlessly connect together experts, data and computing resources needed for quick decisions
Highly automated early warning system, based on hydro-meteorological (snowmelt) rainfall-runoff simulations
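The cascade of simulations and the “what-if” analysis can be illustrated with a minimal sketch. Every model below is a deliberately simplified, hypothetical placeholder (the real meteorological, hydrological and hydraulic codes are not described in these slides), and all parameter values are invented for illustration.

```python
def meteorological(rain_mm_per_h: float, hours: float) -> float:
    """Toy precipitation forecast: total rainfall depth in mm."""
    return rain_mm_per_h * hours


def hydrological(rain_mm: float, area_km2: float, runoff_coeff: float) -> float:
    """Toy rainfall-runoff model: runoff volume in m^3.

    1 mm of rain over 1 km^2 corresponds to 1000 m^3 of water.
    """
    return rain_mm * area_km2 * 1000.0 * runoff_coeff


def hydraulic(volume_m3: float, duration_s: float, capacity_m3_per_s: float) -> bool:
    """Toy hydraulic check: does the mean discharge exceed channel capacity?"""
    return volume_m3 / duration_s > capacity_m3_per_s


def flood_risk(rain_mm_per_h, hours=6.0, area_km2=100.0,
               runoff_coeff=0.5, capacity_m3_per_s=200.0):
    """Chain the three stages: each stage consumes the previous stage's output."""
    rain_mm = meteorological(rain_mm_per_h, hours)
    volume_m3 = hydrological(rain_mm, area_km2, runoff_coeff)
    return hydraulic(volume_m3, hours * 3600.0, capacity_m3_per_s)


if __name__ == "__main__":
    # "What-if" analysis: rerun the whole cascade for several rainfall scenarios.
    for scenario in (5.0, 10.0, 20.0):
        print(f"{scenario} mm/h -> flood expected: {flood_risk(scenario)}")
```

Each scenario is an independent run of the full cascade, so an expert can change one parameter and compare outcomes without touching the other scenarios.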

Flood management

Web portal for access

Job submission
Visualization

See DEMO outside

HEP interactive analysis

The upcoming Large Hadron Collider (LHC) at CERN will accelerate protons to an energy high enough to produce a particle hundreds of times heavier: the Higgs boson, the last piece of the Standard Model and key to understanding the origin of mass.

Problem: All collisions will be recorded by sophisticated detectors, and the information stored in distributed databases with a volume of millions of gigabytes. But only a few of those complex collisions will produce a Higgs boson…

Solution: On-line filtering techniques + sophisticated mathematical algorithms for physics analysis, like neural networks

Physicists across the world are collaborating in this search…

[Figure: on-line filtering chain]
40 MHz (40 TB/sec) -> level 1 - special hardware
75 kHz (75 GB/sec) -> level 2 - embedded processors
5 kHz (5 GB/sec) -> level 3 - PCs
100 Hz (100 MB/sec) -> data recording & offline analysis
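As a rough cross-check of the filtering figures quoted on this slide, the sketch below derives the implied event size and the rate reduction at each stage. It assumes the four rate/bandwidth pairs are in chain order (collision rate, then the output of levels 1, 2 and 3); the names are illustrative only.

```python
# Rates and bandwidths as quoted on the slide, assumed to be in chain order.
RATES_HZ = [40e6, 75e3, 5e3, 100.0]
BANDWIDTHS_BYTES_PER_S = [40e12, 75e9, 5e9, 100e6]


def event_sizes(rates, bandwidths):
    """Implied size of one event (bytes) at each stage: bandwidth / rate."""
    return [bandwidth / rate for rate, bandwidth in zip(rates, bandwidths)]


def rejection_factors(rates):
    """Factor by which each filtering level reduces the event rate."""
    return [before / after for before, after in zip(rates, rates[1:])]


if __name__ == "__main__":
    print("event size per stage (bytes):", event_sizes(RATES_HZ, BANDWIDTHS_BYTES_PER_S))
    print("rejection factor per level:", rejection_factors(RATES_HZ))
    print("overall reduction:", RATES_HZ[0] / RATES_HZ[-1])
```

Under that assumption every stage implies the same event size of about 1 MB, and the three levels together reduce the rate by a factor of 400,000.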

HEP interactive analysis

Goal: Physics analysis on large distributed databases

Method: Distributed computing for
• Access to databases
• Complex algorithms, like Neural Networks

Use Web Portal as GUI

Need Grid in interactive mode (physicists try different hypotheses)

Reduce the waiting time to test a new algorithm or a new hypothesis from hours down to minutes by processing in distributed mode (DEMO TODAY)
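Processing in distributed mode amounts to splitting the dataset, applying the same selection to each part in parallel, and merging the partial results. The sketch below shows that pattern under simplifying assumptions: events are plain numbers, the "algorithm" is a hypothetical threshold cut, and local threads stand in for grid worker nodes (the actual prototype uses MPICH-G2, which is not shown here).

```python
from concurrent.futures import ThreadPoolExecutor


def select(events, threshold):
    """Count events whose discriminant value passes the cut."""
    return sum(1 for value in events if value > threshold)


def split(events, n_parts):
    """Split a list into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(events), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(events[start:end])
        start = end
    return chunks


def distributed_count(events, threshold, n_workers=4):
    """Apply the selection to each chunk in parallel and merge the counts."""
    chunks = split(events, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = pool.map(lambda chunk: select(chunk, threshold), chunks)
        return sum(partial_counts)


if __name__ == "__main__":
    sample = [i / 100 for i in range(100)]
    print(distributed_count(sample, threshold=0.9))
```

Testing a new hypothesis then means rerunning only the selection step with a different cut, while the data stay partitioned across the workers.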

Meteo and Air Pollution

Problem: Improve local predictions and refine air-pollution modeling close to a thermal power plant.

Solution: data-mining on databases of outputs from atmospheric circulation models, to improve downscaling

Typical database (ERA-15, ECMWF): daily forecasts on a grid covering the globe from 1979 to 1993

Atmospheric circulation pattern:

v = (T(1000 mb), T(850 mb), ..., Z, H, ...). The dimension can reach 10^4

Meteo and Air Pollution

Goal: Data-mining on databases and improvement of air-pollution prediction

Method: Distributed computing for
• Data-Mining algorithm SOM
• Air-Pollution STEM II

Need Grid in interactive mode (so the power plant reacts in time)

Try different air-pollution estimations according to meteo predictions

Atmospheric circulation pattern:

v = (T(1000 mb), T(850 mb), ..., Z, H, ...). The dimension can reach 10^4

Similar patterns are close in the grid and in the CPs space!

[Figure labels: 1/1/1979, 2/1/1979]
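Assuming SOM here stands for self-organizing map, the property that similar patterns end up close together in the map can be illustrated with a very small, generic implementation. This is a textbook-style sketch on a one-dimensional line of units, not the project's data-mining code; the learning-rate and neighbourhood schedules are arbitrary choices, and inputs are assumed to be scaled to [0, 1].

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))


def train_som(data, n_units=4, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a one-dimensional SOM (a line of units) on a list of vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs           # decays linearly from 1 towards 0
        lr = lr0 * frac                       # learning rate
        sigma = max(sigma0 * frac, 1e-3)      # neighbourhood width
        for x in data:
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Units near the winner on the line are pulled towards x as well,
                # which is what makes neighbouring units represent similar patterns.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


if __name__ == "__main__":
    patterns = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
    som = train_som(patterns)
    for p in patterns:
        print(p, "-> unit", best_matching_unit(som, p))
```

For real circulation patterns the vectors would have thousands of components, and the search for the best matching unit over a large database is the part that benefits from distributed computing.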

Application development

Good interaction with final user community (clear use cases)
Vascular Surgery: Leiden Hospital
Flooding crisis management: authorities in Slovakia
HEP interactive physics analysis: LHC physicists
Meteo and Air Pollution: power plant managers

Middleware and Tools (significant effort):
Basic middleware: Globus 2 + DataGrid
Distributed computing using MPI: MPICH-G2
• Support for correct use of MPI: profiling interface (MARMOT)
• Benchmarking on a grid context and performance prediction
Optimization of data access
Monitoring:
• the application itself, the network use, and the hardware
Scheduling:
• Support for allocation with priority of resources needed for MPI
Portals and Roaming Access
• Web Portal + VNC (Migrating Desktop)

Testbed:
Support development, test and deployment of applications, tools, and middleware

Architecture

[Architecture diagram. Components shown: Migrating Desktop; Portal and Roaming Access; User Interaction Services; Grid Visualization Kernel; Application; Tool; OCM-G; Benchmark; Data Access; Infrastructure Monitoring; Scheduling Agent; DataGrid Job Management; DataGrid Data Management; Globus Toolkit; Multiple Sites; (Parallel) Application Running; Simulation Output]

The CrossGrid Testbed

16 sites (small & large) in 9 countries, connected through Géant + NRENs
+ Grid Services: EDG middleware (based on Globus) RB, VO, RC…

[Map of sites connected through Géant: UCY Nicosia; DEMO Athens; AuTH Thessaloniki; CYFRONET Cracow; ICM & IPJ Warsaw; PSNC Poznan; CSIC IFIC Valencia; UAB Barcelona; CSIC-UC IFCA Santander; CSIC RedIris Madrid; LIP Lisbon; USC Santiago; TCD Dublin; UvA Amsterdam; FZK Karlsruhe; II SAS Bratislava]

Using the Testbed

Parallel Jobs (HEP Prototype using MPICH-G2) Running Across Sites

[Diagram: each site (Site 1 … Site i) shows a User Interface, Gatekeeper, Worker Nodes, Storage Element and Configuration Machine (LCFG). Grid Services (LIP) shows Monitoring, Resource Broker, Replica Catalogue, Virtual Organization and MyProxy Server. Sites and services are linked over the network through Globus.]

Testbed Status

http://mapcenter.lip.pt

User Support

Software repository

http://gridportal.fzk.de

Customized GNU Savannah (based on SourceForge)
CVS browsable repository
Main current usage:
• ca. 1000 web-hits per day
• 7000 files, 356 MB, 850,000 code lines, 15,000 doc lines + 174 doc/pdf files

Integration work…

IST Demonstration

CrossGrid participated in the World Grid demonstration in November 2002, involving European and US sites from CrossGrid, DataGrid, GriPhyN and PPDG.
It was the largest grid testbed in the world.
Applications from the CERN/LHC experiments CMS and ATLAS
CrossGrid participated with 3 sites:

LIP - Lisbon, FZK - Karlsruhe, IFIC - Valencia

Extending the GRID in Europe

Close collaboration and complementarity with DataGrid
Interactive and parallel applications
Extending the GRID into new countries and communities
Keeping interoperability, in particular for the testbed

Outreach and dissemination (visit our booth outside!):
High impact at the national research level:
• See Poland, Germany, Spain, Greece examples
ACROSSGRID conference in Santiago de Compostela, great success!
Dissemination effort to new communities (e.g. South-East Europe, Latin America)
New application areas are starting to show interest
Reinforcing effort via GridStart (concertation meeting on June 18-19)
Starting to establish company and final user contacts:
• Companies interested in middleware and tools
• Institutions and companies interested as final users

Involved in proposals for new 6th FP:
HealthGrid
FloodGrid
RT Grids…

Extending the GRID in Europe

…and pushing for a common grid infrastructure for e-Science in Europe:

EGEE

Keep in contact with us: http://www.eu-crossgrid.org

Thanks in advance for your interest!