The Grid: a primer

[email protected] 11 February 2009 The Grid : a primer


Page 2

Outline

The Grid concept

Grid architecture

Middleware – the core

Interact with the Grid : first steps

Virtual Organizations

enmr.eu VO

Site grid administration

Page 3

Our world … today!

Network infrastructure ON

Global resource sharing “OFF”

Page 4

One step further … The Grid

Network infrastructure ON

Global resource sharing ON

“Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations”.*

* Foster, I. et al., Int. J. Supercomput. Appl. 15(3), 2001

Page 5

Why do scientists need the Grid?

High-energy physics (15 PB/year)

15 PB ~ 20*10^6 CDs
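The CD comparison can be checked with a few lines of arithmetic. This is only a sketch: the 700 MB capacity per CD and the use of decimal units (1 PB = 10^15 bytes) are assumptions, not figures from the slides.

```python
# Rough check of "15 PB ~ 20*10^6 CDs".
# Assumptions (not from the slides): decimal units and 700 MB per CD.
PETABYTE = 10**15          # bytes
CD_CAPACITY = 700 * 10**6  # bytes, assumed capacity of one CD


def cds_needed(petabytes: float, cd_bytes: int = CD_CAPACITY) -> float:
    """Return how many CDs are needed to hold the given number of petabytes."""
    return petabytes * PETABYTE / cd_bytes


if __name__ == "__main__":
    print(f"{cds_needed(15) / 1e6:.1f} million CDs per year")
```

Under these assumptions the result is a little over 21 million CDs, the same order of magnitude as the figure on the slide.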

Complex problems !!

Many iterations !!

Virtual cooperation !!

Genome projects, data mining, tackling protein folding, protein structure, …

Page 6

Building a Grid

1. The architecture

2. The hardware

3. The middleware

Page 7

Building a Grid - architecture

Network

Resources

Middleware

Application

User-centric

Page 8

Building a Grid - Grid Fabric (I)

Delivery of Advanced Network Technology to Europe

State-of-the-art (1985) = 56 Kbps

Network characterization

Size

Throughput

Page 9

Building a Grid - Grid fabric (II): computer performance

#    System family   Rmax (TFlop/s)
1    IBM cluster     1105
52   IBM pSeries     48.9
75   IBM BlueGene    35.1
93   IBM BlueGene    27.5

A flop (floating-point operation) is a basic computational operation
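To make the unit concrete: Rmax is a rate (floating-point operations per second), so the ideal time for a fixed workload is the operation count divided by Rmax. The sketch below assumes the Rmax column is expressed in TFlop/s, and the workload of 10^18 operations is a made-up figure used only for illustration.

```python
# Assumption: Rmax values are in TFlop/s (10**12 floating-point operations per second).
TFLOPS = 10**12

# Rmax values copied from the table on this slide (rank -> Rmax).
RMAX = {1: 1105.0, 52: 48.9, 75: 35.1, 93: 27.5}


def runtime_seconds(operations: float, rmax_tflops: float) -> float:
    """Ideal runtime of a workload on a machine sustaining rmax_tflops."""
    return operations / (rmax_tflops * TFLOPS)


if __name__ == "__main__":
    workload = 1e18  # hypothetical number of floating-point operations
    for rank, rmax in RMAX.items():
        hours = runtime_seconds(workload, rmax) / 3600
        print(f"rank {rank:>2}: {hours:.2f} h")
```

Real applications rarely sustain Rmax, so these numbers are lower bounds rather than predictions.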

Page 10

Building a Grid - middleware

"Middleware" is the software that organizes and integrates the resources in a grid.

https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310

gLite*

Page 11

How to interact with the Grid: the UI service

Three ways to access the Grid: UI service, web portal, or UI PnP

Page 12

Enabling Grids for E-science

> 140 institutions

> 300 sites

50 countries

> 10,000 users

> 80,000 CPU cores 24/7

WOULD YOU TRUST YOUR COMPUTER TO A COMPLETE STRANGER?

Worldwide LHC Computing Grid (WLCG)

Page 13

Registered EGEE Virtual Organizations

Application domain    Active VOs   Users
High-energy Physics   36           7994
Life Sciences         8            333
...                   ...          ...
Total                 155          16263

Stats: 10 Feb 2009

VO name   Scope              Registered users
biomed    Global             223
bio       Regional - Italy   57
enmr.eu   Global             54

Page 14

New application web portal

http://haddock.chem.uu.nl/enmr

Page 15

www.enmr.eu

Page 16

The eyes of the Grid

http://gridice-enmr.cerm.unifi.it/site/site.php

Page 17

How to become an enmr.eu user

www.gridcafe.org

Page 18

Trust is the key!

Page 19

http://ca.dutchgrid.nl/request/

Page 20

Fill in the form AND

! Pay attention !

1. Organization

2. Organizational Unit

3. Certificate Level: medium

4. Check & re-check!

RA: A. Bonvin

http://ca.dutchgrid.nl/request/

+ ID card

DN : /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name

Proof of Possession
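The DN shown on this slide is a sequence of attribute=value pairs separated by slashes. As a minimal sketch of how such a string can be taken apart, the helper functions below (`parse_dn` and `common_name`, both hypothetical names) split it into ordered pairs. It assumes that no attribute value contains "/" or "=", which is a simplification.

```python
# Sketch: split a certificate DN written in the slash-separated form shown on the slide.
# Simplification: assumes no attribute value contains "/" or "=".


def parse_dn(dn: str) -> list[tuple[str, str]]:
    """Return the DN as an ordered list of (attribute, value) pairs."""
    pairs = []
    for part in dn.strip().split("/"):
        if not part:
            continue  # skip the empty string before the leading "/"
        key, _, value = part.partition("=")
        pairs.append((key, value))
    return pairs


def common_name(dn: str) -> str:
    """Return the CN (the certificate holder's name), or "" if there is none."""
    for key, value in parse_dn(dn):
        if key == "CN":
            return value
    return ""


if __name__ == "__main__":
    dn = "/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name"
    print(parse_dn(dn))
    print(common_name(dn))
```

Reading the pairs from left to right gives the organization, the organizational unit and finally the person, which is why the Organization and Organizational Unit fields of the request form have to be filled in carefully.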

Page 21

Wait a couple of days …

Page 22

https://voms2.cnaf.infn.it:8443/voms/enmr.eu/Login.do

Problems ?

Alexandre

Johan

Nuno

Follow the email instructions !!

Page 23

Wise sentence …

“If you think this is cumbersome… it is nothing compared to get the grid running.”

van der Zwan, J. ; 26-01-2009 14:43

Page 24

Site Grid Administration: a glimpse

Goal:

keep the grid running 24/7

Facts:

more than 30 middleware updates/year

Bugs, bugs, and more bugs

… nevertheless, the grid is running

How to deal with it:

Test before putting a service into production

Any more ideas?

Sandbox:

Pre-production: test, destroy, and re-build

The art of computer virtualization*: takes 2 min.

http://www.xen.org/

Page 25

Acknowledgments

Hardware-centric

/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Alexandre Bonvin

/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Johan van der Zwan

Application layer

/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Marc van Dijk

/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Sjoerd De Vries

/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Tsjerk Wassenaar

User abstraction

*.*

Middleware layer

Page 26

Questions asked / Comments

Virus protection: are there mechanisms in place?

Is there a priority system for the use of resources?

Panos liked the presentation, except for the slide about Grid administration (he said it was out of context).

In a heterogeneous system, different results can be obtained for the same initial problem. However, this also happens in the laboratory. It is nevertheless possible to choose which machines to use on the Grid and which not to use.

Klartje asked whether other programs can be run on the Grid. Bonvin replied that the program can be sent along with the data.

Dirk asked whether the communication between the computers is encrypted.

Page 27

Moore’s Law

Some STATS :

1. Computer power doubles every … 18 months

2. Network performance doubles every … 9 months

3. Data storage density is doubling every … 12 months
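The three doubling times above can be compared directly: a quantity that doubles every d months grows by a factor of 2^(t/d) after t months. The sketch below applies that formula to the figures from the list; the three-year horizon is an arbitrary choice for illustration.

```python
# Doubling periods in months, taken from the list on this slide.
DOUBLING_MONTHS = {
    "computer power": 18,
    "network performance": 9,
    "data storage density": 12,
}


def growth_factor(years: float, doubling_months: float) -> float:
    """Factor by which a quantity grows if it doubles every doubling_months."""
    return 2 ** (years * 12 / doubling_months)


if __name__ == "__main__":
    for name, months in DOUBLING_MONTHS.items():
        print(f"{name}: x{growth_factor(3, months):.0f} after 3 years")
```

Over three years this gives factors of 4, 16 and 8 respectively: with these figures, network performance grows fastest, which makes moving work and data between distant machines steadily cheaper relative to computing locally.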

“The number of transistors that could be squeezed onto a silicon chip was doubling every year.” Moore, G. 1965

With every passing year, the Grid concept becomes more feasible

Distributed processors can be more tightly integrated

Computer grids are able to solve increasingly complex problems

Page 28

gLite INFNGRID – deployment status

INFNGRID gLite 3.1 (SL4)

Update   Date
40       ? (04 Feb 2009, CERN)
38-39    23 Jan 2009
35-37    05 Dec 2008
32-34    07 Nov 2008
30-31    23 Sep 2008
...      ...
13       19 Feb 2008
...      ...

Page 29

Page 30

The GRID is a collection of geographically distributed resources

GRID users:

Are organized in Virtual Organizations

Need to run programs without having to know where to run a job, where to get the input data from, or where to store the output data

The GRID consists of

An Authentication and Authorization System

An Information System

A Workload Management System

A Data Management System

An Accounting System

Various monitoring services

Various installation services

The GRID architecture: general view

Page 31

The Authentication and Authorization System:

Contains the list of all the people authorized to use the GRID, divided by VO

All machines running Grid services verify the users' credentials

Maps the GRID users to the local users of the machine
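The two steps described above (checking that a user belongs to a VO, then mapping the user to a local account) can be sketched as a toy model. Everything here is hypothetical: the dictionary names, the account name `enmr001` and the function `map_user` are illustrations only, and this is not how gLite actually stores or verifies credentials.

```python
# Toy model of VO-based authorization and local user mapping.
# All names are hypothetical; real Grid services verify certificates instead.

# Authorized users, divided by VO (VO name -> set of certificate DNs).
VO_MEMBERS = {
    "enmr.eu": {
        "/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name",
    },
}

# Local account that members of each VO are mapped to on this machine (assumed).
LOCAL_ACCOUNT = {"enmr.eu": "enmr001"}


def map_user(dn: str, vo: str) -> str:
    """Return the local account for an authorized user, else raise PermissionError."""
    if dn not in VO_MEMBERS.get(vo, set()):
        raise PermissionError(f"{dn} is not a member of VO {vo}")
    return LOCAL_ACCOUNT[vo]


if __name__ == "__main__":
    user = "/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name"
    print(map_user(user, "enmr.eu"))
```

The point of the mapping is that site administrators only manage a few local accounts per VO, while membership itself is managed by the VO.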

The Information System:

Provides information about gLite resources and their status

Information is published by the individual resources and copied into central databases

Used by:

WMS: to match resources against job requirements and to rank them

DMS: to choose storage resources

Monitoring systems

The GRID architecture: general view

Page 32

The Workload Management System:

manages jobs submitted by users

matches the job requirements to the available resources

schedules the job for execution on an appropriate computing cluster

tracks the job status

allows the user to retrieve the job output when ready
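The matching step can be illustrated with a small sketch: filter the clusters that satisfy a job's requirements, then rank the survivors. The `Cluster` and `Job` classes, their fields and the ranking rule (most free CPUs first) are assumptions made for this example; the real Workload Management System uses richer resource descriptions.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str                 # hypothetical site name
    free_cpus: int            # CPUs currently available
    max_wall_hours: int       # longest job the site accepts
    supported_vos: frozenset  # VOs allowed to run here


@dataclass
class Job:
    vo: str
    cpus: int
    wall_hours: int


def match(job, clusters):
    """Return the clusters that satisfy the job requirements, best-ranked first."""
    suitable = [
        c for c in clusters
        if job.vo in c.supported_vos
        and c.free_cpus >= job.cpus
        and c.max_wall_hours >= job.wall_hours
    ]
    # Ranking rule (an arbitrary choice for this sketch): most free CPUs first.
    return sorted(suitable, key=lambda c: c.free_cpus, reverse=True)


if __name__ == "__main__":
    sites = [
        Cluster("site-a", 4, 24, frozenset({"enmr.eu"})),
        Cluster("site-b", 64, 48, frozenset({"enmr.eu", "biomed"})),
        Cluster("site-c", 128, 48, frozenset({"biomed"})),
    ]
    for cluster in match(Job("enmr.eu", 2, 12), sites):
        print(cluster.name)
```

The job would then be scheduled on the first cluster in the returned list; an empty list means no site currently meets the requirements.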

The Data Management System:

Allows users to

move files in and out of the Grid

replicate files among different locations

locate files.

This is achieved by:

transferring data via a number of protocols

GridFTP is the most commonly used

interacting with a central file catalog
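The role of the central file catalog can be sketched as a mapping from a logical file name to the set of places where copies are stored. The `FileCatalog` class, its method names and the example locations are all hypothetical; this is a toy model of the idea, not the gLite catalog interface, and it does not transfer any data.

```python
from collections import defaultdict


class FileCatalog:
    """Toy central catalog: logical file name -> set of replica locations."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, lfn, location):
        """Record a file that has been moved into the Grid."""
        self._replicas[lfn].add(location)

    def replicate(self, lfn, new_location):
        """Record an additional copy of an already registered file."""
        if lfn not in self._replicas:
            raise KeyError(f"unknown logical file name: {lfn}")
        self._replicas[lfn].add(new_location)

    def locate(self, lfn):
        """Return all known locations of a file, sorted for stable output."""
        return sorted(self._replicas.get(lfn, set()))


if __name__ == "__main__":
    catalog = FileCatalog()
    # Example locations are made up; gsiftp:// is the URL scheme used by GridFTP.
    catalog.register("/grid/enmr.eu/input.pdb", "gsiftp://se1.example.org/data/input.pdb")
    catalog.replicate("/grid/enmr.eu/input.pdb", "gsiftp://se2.example.org/data/input.pdb")
    print(catalog.locate("/grid/enmr.eu/input.pdb"))
```

Users only refer to the logical name; the catalog is what lets a job find the nearest or least loaded copy.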

The GRID architecture: general view

Page 33

Monitoring Services:

GridICE: monitors the usage of Grid resources (number of jobs running, storage space available, …)

R-GMA: allows users to monitor applications and store the results in a relational database

Some monitoring systems check the status of Grid services; these are intended more for the GRID operations staff

Dedicated Fabric Management Services:

Manage the installation, upgrade and maintenance of local Grid services

LCFGng (discontinued)

Quattor

YAIM (semi-automatic tool based on APT/YUM and shell scripts)

The GRID architecture: general view

Page 34

Grid analogy

Electrical power grid vs. the Grid

Power grid: You never worry about where the electricity you are using comes from.

The Grid: You would never worry about where the computer power you are using comes from.

Power grid: The infrastructure that makes this possible is called "the power grid".

The Grid: The infrastructure that makes this possible is called "the Grid".

Power grid: It is pervasive: electricity is available essentially everywhere and you can simply access it through a standard wall socket.

The Grid: It would be pervasive: remote computing resources would be accessible from different platforms, and you would simply access the Grid through your web browser.

Power grid: It is a utility: you ask for electricity, and you get it. You also pay for what you get.

The Grid: It is a utility: you ask for computer power or storage capacity and you get it. You also pay for what you get.

"The Grid" doesn't yet exist in this form; however, the world already has hundreds of smaller grids...

Page 35

Cryptography

A Scytale
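A scytale is a rod around which a strip of writing material is wound; the message is written along the rod and becomes scrambled once the strip is unwound. This can be modelled as a simple transposition: write the text in rows of a fixed width and read it column by column. The sketch below makes that assumption; the function names, the `width` parameter (letters per row, standing in for the rod's dimensions) and the padding character are choices made for this example.

```python
def scytale_encrypt(plaintext: str, width: int, pad: str = "_") -> str:
    """Write the text in rows of `width` characters, then read it column by column."""
    if width < 1:
        raise ValueError("width must be at least 1")
    padding = (-len(plaintext)) % width  # fill the last row completely
    text = plaintext + pad * padding
    return "".join(text[col::width] for col in range(width))


def scytale_decrypt(ciphertext: str, width: int, pad: str = "_") -> str:
    """Undo scytale_encrypt for the same width (trailing padding is removed)."""
    if width < 1:
        raise ValueError("width must be at least 1")
    rows = len(ciphertext) // width
    # Each column occupies `rows` consecutive characters of the ciphertext,
    # so reading with a stride of `rows` recovers one original row at a time.
    text = "".join(ciphertext[row::rows] for row in range(rows))
    return text.rstrip(pad)


if __name__ == "__main__":
    secret = scytale_encrypt("WEAREDISCOVERED", 5)
    print(secret)
    print(scytale_decrypt(secret, 5))
```

Only someone with a rod of the same dimensions (here, the same `width`) recovers the message, which is the shared-secret idea; one limitation of this sketch is that a plaintext ending in the padding character would lose those trailing characters.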

Page 36

Evolution of the HDD

Morris, R. J. T. et al., IBM Systems Journal 42(2), 205 (2003)