Outline

Widening the number of e-Infrastructure users with Science Gateways and Identity Federations Giuseppe Andronico ([email protected]) INFN - Italy Workshop on Science Applications And Infrastructure In Clouds And Grids– Oxford,15-16 March 2012

Page 1: Outline

Widening the number of e-Infrastructure users with Science Gateways and Identity Federations

Giuseppe Andronico ([email protected]), INFN - Italy

Workshop on Science Applications And Infrastructure In Clouds And Grids – Oxford, 15-16 March 2012

Page 2: Outline

Outline

• Introduction and driving considerations
• The Science Gateway paradigm:
    • Architecture
    • Authentication and Authorisation Schema
    • Access workflow
    • Grid transaction model
    • The Authentication process
    • Use cases and statistics
    • The forthcoming Cloud Engine
• Summary and conclusions

Page 3: Outline

Path to technology uptake

The Rogers “bell-shape” curve - Rogers, E. M. (1962), “Diffusion of Innovations”, Glencoe: Free Press.

Page 4: Outline

IT acceptance model – the Web

Davis, F. D. (1989), "Perceived usefulness, perceived ease of use, and user acceptance of information technology", MIS Quarterly 13(3): 319–340

Development of web browsers

The World Wide Web

Page 5: Outline

The evolution leap in web browsers

[Figure: the “evolution leap” in web browsers]

Page 6: Outline

The eResearch2020 report (http://www.eresearch2020.eu/eResearch%20Brochure%20EN.pdf)

Some barriers in the adoption of Grids:

• Changes to Grids mean changes to applications
• Time required to adapt usual workflows
• Lack of structure to support anonymous access
• Download and installation of applications
• Interface
• Slow to get to compared to other resources
• Difficult to use in the beginning
• Time spent to get the application compiled and running

Page 7: Outline

Using Grids is not straightforward

Users have to cope with complex security procedures, execution scripts, job description languages, command-line-based interfaces and lack of standards.

This makes the learning curve very steep and keeps non-IT experts away.

Page 8: Outline

Another consideration…

[Figure: axes labelled “VRCs” and “# of users”]

There is a huge number of non-IT experts out there who do not belong to any constituted Virtual Research Community (VRC).

How can we attract them?

Page 9: Outline

I have a dream…

Can we increase the number of potential grid users by a factor of 1,000?

… or even by a factor of 25,000 and more?

Page 10: Outline

A new paradigm: the Science Gateway

“A Science Gateway is a community-developed set of tools, applications, and data that is integrated via a portal or a suite of applications, usually in a graphical user interface, that is further customized to meet the needs of a specific community.”

TeraGrid

Page 11: Outline

IT acceptance model – the Grid

Davis, F. D. (1989), "Perceived usefulness, perceived ease of use, and user acceptance of information technology", MIS Quarterly 13(3): 319–340

Development of Science Gateway

Requirement for sustainability

Page 12: Outline

Primary requirement: building Science Gateways should be like playing with Lego bricks

[Figure: stacked bricks labelled Sc. Gtwy A, Sc. Gtwy B, Sc. Gtwy C, Sc. Gtwy D and Sc. Gtwy E]

• Standards
• Simplicity
• Ease of use
• Re-usability

Page 13: Outline

Our reference model

[Figure: reference model of a Science Gateway]

• Users from different organisations having different roles and privileges (Administrator, Power User, Basic User)
• Science Gateway with embedded applications (App. 1, App. 2, ..., App. N)
• Standard-based (SAGA), middleware-independent Grid Engine

Page 14: Outline

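AuthN & AuthZ Schema

[Figure: authentication and authorisation schema of a Science Gateway]

• Authentication: identity providers such as GrIDP (“catch-all”), IDPCT (“catch-all”), IDP_y, ..., and the Social Networks’ Bridge IdP
• Authorisation: Science Gateway, LDAP
• Steps: 1. Register to a Service; 2. Sign in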
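The two steps of the schema (register to a service, then sign in) can be sketched as follows. This is a minimal illustration and not the gateway's actual code: the in-memory dictionaries stand in for real identity providers and for the registry used for authorisation, and every name (`IDENTITY_PROVIDERS`, `REGISTRY`, `register`, `sign_in`) is hypothetical. No real SAML or LDAP call is made.

```python
from dataclasses import dataclass

# Hypothetical stand-ins: each identity provider maps a username to a secret.
# A real deployment would delegate this step to a federated identity provider.
IDENTITY_PROVIDERS = {
    "GrIDP": {"alice": "s3cret"},
    "IDP_y": {"bob": "hunter2"},
}

# Hypothetical stand-in for the registry consulted for authorisation:
# it maps (identity provider, username) to the roles granted on the gateway.
REGISTRY = {
    ("GrIDP", "alice"): {"basic_user"},
}


@dataclass(frozen=True)
class Session:
    user: str
    idp: str
    roles: frozenset


def authenticate(idp: str, user: str, secret: str) -> bool:
    """Authentication: the identity provider confirms who the user is."""
    return IDENTITY_PROVIDERS.get(idp, {}).get(user) == secret


def register(idp: str, user: str, roles: set) -> None:
    """Step 1: register the user to the service with a set of roles."""
    REGISTRY[(idp, user)] = set(roles)


def sign_in(idp: str, user: str, secret: str) -> Session:
    """Step 2: authenticate at the IdP, then authorise against the registry."""
    if not authenticate(idp, user, secret):
        raise PermissionError("authentication failed")
    roles = REGISTRY.get((idp, user))
    if not roles:
        raise PermissionError("user is not registered to this service")
    return Session(user=user, idp=idp, roles=frozenset(roles))
```

The point of the split is that a valid identity alone is not enough: a user known to an identity provider but not registered to the service is still refused.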

Page 15: Outline

The Grid IDentity Pool (GrIDP) (http://gridp.ct.infn.it)

This is a “catch-all” Identity Federation

Page 16: Outline

eduGAIN (www.edugain.org)

All the Science Gateways are registered as Service Providers of eduGAIN

Page 17: Outline

Catania Grid Engine

[Figure: architecture of the Catania Grid Engine]

• Science GW 1, Science GW 2, Science GW 3 (Liferay Portlets)
• Science GW Interface
• Grid Engine: Job Engine, Data Engine, Users Track & Monit.
• SAGA/JSAGA API
• Grid MWs
• eTokenServer
• Users Tracking DB
• Status labels shown on the components: DONE (three times), By mid April, By end of April

Page 18: Outline

Job Engine - Architecture

[Figure: Job Engine architecture]

• Job Queue
• Worker Threads (WT) for Job Submission
• Worker Threads (WT) for Status Checking
• Operations: Job Submission; Job Check Status / Get Output
• Users Tracking DB
• Monitoring Module
• Grid Infrastructure(s)
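The architecture of a job queue served by two pools of worker threads can be sketched in a few lines. The following is only an illustration under stated assumptions: the function name `run_job_engine` and the `submit` and `check_status` callables are hypothetical stand-ins for the middleware calls, a plain list replaces the users tracking DB, and each job is checked once, whereas a real engine would presumably poll until the job finishes.

```python
import queue
import threading


def run_job_engine(jobs, submit, check_status, n_submitters=4, n_checkers=2):
    """Push jobs through a submission pool, then through a status-checking pool."""
    job_queue = queue.Queue()   # jobs waiting to be submitted
    submitted = queue.Queue()   # job ids waiting for a status check
    tracking = []               # stand-in for the users tracking DB
    lock = threading.Lock()

    def submitter():
        while True:
            job = job_queue.get()
            if job is None:     # sentinel: no more jobs for this worker
                break
            submitted.put(submit(job))

    def checker():
        while True:
            job_id = submitted.get()
            if job_id is None:  # sentinel: no more ids for this worker
                break
            status = check_status(job_id)
            with lock:
                tracking.append((job_id, status))

    submit_pool = [threading.Thread(target=submitter) for _ in range(n_submitters)]
    check_pool = [threading.Thread(target=checker) for _ in range(n_checkers)]
    for worker in submit_pool + check_pool:
        worker.start()

    for job in jobs:
        job_queue.put(job)
    for _ in submit_pool:       # one sentinel per submission worker
        job_queue.put(None)
    for worker in submit_pool:
        worker.join()

    for _ in check_pool:        # all ids are queued by now; stop the checkers
        submitted.put(None)
    for worker in check_pool:
        worker.join()
    return tracking
```

Keeping submission and status checking in separate pools means a slow status query never blocks new submissions, which is one plausible reading of why the diagram shows two groups of worker threads.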

Page 19: Outline

Job Engine - Features

The Job Engine has been designed with the following features in mind (feature: description, status):

• Middleware Independent: capacity to submit jobs to resources running different middleware (DONE)
• Easiness: create code to run applications on the grid in a very short time (DONE)
• Scalability: manage a huge number of parallel job submissions, fully exploiting the HW of the machine where the Job Engine is installed (DONE)
• Performance: have a good response time (DONE)
• Accounting & Auditing: register every grid operation performed by the users (DONE)
• Fault Tolerance: hide middleware failures from end users (ALMOST DONE)
• Workflow: provide a way to easily create and run workflows (IN PROGRESS)

Page 20: Outline

Job Engine – Scalability

• 40,000 jobs submitted in parallel!
• Submission time scales linearly with the number of jobs
• More than 10,000 jobs an hour

[Figures: time to submit 10,000 jobs (h); job submission time (h)]

Page 21: Outline

Job Engine – Middleware interoperability

• Both sequential and MPI-enabled jobs successfully executed
• Tests with Globus planned

Page 22: Outline

Job Engine – Accounting & Auditing

A powerful accounting & auditing system is included in the Job Engine

It is fully compliant with EGI VO Portal Policy and EGI Grid Security Traceability and Logging Policy

The following values are stored in the DB for each job submitted:

• User ID
• Job Submission timestamp
• Job Done timestamp
• Application name
• Job ID
• Robot certificate ID
• VO name
• Execution site (name, latitude, longitude)
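The per-job record listed above could be modelled as follows. This is a sketch under assumptions, not the actual schema: the field names, the table name `job_records` and the helper functions are hypothetical, and SQLite from the Python standard library is used only to keep the example self-contained.

```python
import sqlite3
from dataclasses import astuple, dataclass


@dataclass
class JobRecord:
    """One row per submitted job, with the values listed on the slide."""
    user_id: str
    submitted_at: str          # job submission timestamp (ISO 8601 text)
    done_at: str               # job done timestamp (ISO 8601 text)
    application: str
    job_id: str
    robot_certificate_id: str
    vo_name: str
    site_name: str             # execution site: name
    site_latitude: float       # execution site: latitude
    site_longitude: float      # execution site: longitude


# Hypothetical table layout; column order matches the dataclass fields.
SCHEMA = """
CREATE TABLE IF NOT EXISTS job_records (
    user_id TEXT, submitted_at TEXT, done_at TEXT, application TEXT,
    job_id TEXT PRIMARY KEY, robot_certificate_id TEXT, vo_name TEXT,
    site_name TEXT, site_latitude REAL, site_longitude REAL
)
"""


def store(conn: sqlite3.Connection, record: JobRecord) -> None:
    """Persist one job record, creating the table on first use."""
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO job_records VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        astuple(record),
    )
    conn.commit()


def jobs_of(conn: sqlite3.Connection, user_id: str) -> list:
    """Audit query: job IDs submitted by a user, oldest first."""
    conn.execute(SCHEMA)
    rows = conn.execute(
        "SELECT job_id FROM job_records WHERE user_id = ? ORDER BY submitted_at",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Storing the user ID next to the robot certificate ID is what makes each grid operation traceable to a person even though the job itself runs under a shared credential.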

Page 23: Outline

Catania Science Gateways in numbers

[Figure: overall usage (arb. units) per month, December to February; vertical axis from 0 to 350]

[Figure: registered users per month, 09/2011 to 02/2012; vertical axis from 0 to 180]

Page 24: Outline

Data Engine – Requirements

• A file browser shows Grid files in a tree
• The file system exposed by the Science Gateway is virtual
• Easy transfers from/to the Grid (through the SG at the moment) are done in a few clicks
• Users do not need to care about how and where their files are really located
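The idea of a virtual file system that hides physical locations can be sketched as a catalogue mapping virtual paths to storage locations. This is a minimal illustration; the class `VirtualFileSystem`, its methods and the example URLs used with it are hypothetical and do not reflect the actual implementation.

```python
class VirtualFileSystem:
    """Catalogue that maps virtual paths to physical storage locations."""

    def __init__(self):
        self._catalogue = {}    # virtual path -> physical location

    def add(self, virtual_path, physical_url):
        """Record where a file shown under virtual_path is really stored."""
        self._catalogue[virtual_path.strip("/")] = physical_url

    def locate(self, virtual_path):
        """Resolve a virtual path; only the back-end needs this answer."""
        return self._catalogue[virtual_path.strip("/")]

    def list_dir(self, virtual_dir=""):
        """Return the sorted names directly under a virtual directory."""
        prefix = virtual_dir.strip("/")
        prefix = prefix + "/" if prefix else ""
        children = set()
        for path in self._catalogue:
            if path.startswith(prefix):
                # Keep only the first path component below the directory.
                children.add(path[len(prefix):].split("/", 1)[0])
        return sorted(children)
```

A file browser only ever calls `list_dir`, so two files that appear side by side in the tree may live on different storage elements without the user noticing.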

Page 25: Outline

Data Engine – Usage Workflow

[Figure: upload workflow involving the eTokenServer, the User Track. DB and the DOGS DB]

1. Sign in
2. Upload request
3. Proxy request
4. Proxy transfer
5. File Upload
6. Update DB
7. Upload on Grid
7. Tracking
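The numbered steps of the upload workflow can be walked through in code. The sketch below rests on several assumptions that the slide does not confirm: that the proxy comes from the eTokenServer, that step 6 writes to the DOGS DB and that tracking writes to the users tracking DB. The function `upload_file`, its parameters and the "pending" marker are hypothetical.

```python
def upload_file(user, filename, payload, token_server, dogs_db, tracking_db, grid):
    """Walk the upload steps in order; the user has already signed in (step 1)."""
    log = ["2. upload request"]

    # Steps 3 and 4: ask the token server for a proxy and receive it.
    proxy = token_server(user)
    log.append("3-4. proxy request and transfer")

    # Step 5: the file reaches the gateway before going to the Grid.
    staged = bytes(payload)
    log.append("5. file upload")

    # Step 6: register the file in the catalogue (assumed to be the DOGS DB).
    dogs_db[filename] = "pending"   # hypothetical marker until the copy ends
    log.append("6. update DB")

    # Step 7: copy the staged file to the Grid and record the operation.
    dogs_db[filename] = grid(proxy, filename, staged)
    tracking_db.append((user, "upload", filename))
    log.append("7. upload on Grid and tracking")
    return log
```

Passing the components in as callables and containers keeps the sketch independent of any middleware; in a test they can be replaced with plain lambdas, a dictionary and a list.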

Page 26: Outline

DOGS: Data On Grid Services – Back-end implementation

• JSAGA API used to transfer data from/to storage elements
• Hibernate to manage the VFS collecting information on files stored on the Grid; any changes/actions in the user view affect the VFS
• MySQL as underlying RDBMS
• An additional component has been developed in order to keep track of each transaction in the users tracking DB

Page 27: Outline

DOGS: Data On Grid Services – Front-end implementation

• A portlet has been created, with access provided only to federated users with given roles and privileges
• The portlet view component includes elFinder, a web-based file manager developed in JavaScript using jQuery UI for a dynamic and user-friendly interface (http://elrte.org/elfinder)

Page 28: Outline

Data Engine in action (1/2)

Page 29: Outline

Data Engine in action (2/2)

«Share» to be added soon

Page 30: Outline

Summary of standards adopted

The framework for Science Gateways developed at Catania is fully web-based and adopts official worldwide standards and protocols, through their most common implementations.

These are:

• The JSR 168 and JSR 286 standards (also known as "portlet 1.0" and "portlet 2.0" standards)
• The OASIS Security Assertion Markup Language (SAML) standard and its Shibboleth and SimpleSAMLphp implementations
• The Lightweight Directory Access Protocol (LDAP) and its OpenLDAP implementation
• The Cryptographic Token Interface Standard (PKCS#11) and its Cryptoki implementation
• The Open Grid Forum (OGF) Simple API for Grid Applications (SAGA) standard and its JSAGA implementation

Page 31: Outline

Science Gateways in action: e-Culture Science Gateway @ INDICATE

INDICATE Review, Roberto Barbera, Lyon, 20/09/2011

http://www.indicate-project.eu

http://indicate-gw.consorzio-cometa.it

Page 32: Outline

Science Gateways in action: e-Culture Science Gateway @ INDICATE

• Use the HTTPS interface of Storage Elements
• Important for large-size files

Page 33: Outline

Science Gateways in action: e-Culture Science Gateway @ INDICATE

Page 34: Outline

Thanks to the collaboration with

Science Gateways in action: e-Culture Science Gateway @ INDICATE

Page 35: Outline

Science Gateways in action: GATE @ EUMEDGRID

Page 36: Outline

Science Gateways in action: MrBayes @ GISELA

Page 37: Outline

Science Gateways in action: GridEEG @ DECIDE

Page 38: Outline

The CHAIN Application Database (www.chain-project.eu/applications)

Project-specific Science Gateways can be accessed from the CHAIN AppDB

Page 39: Outline

The forthcoming Cloud Engine

[Figure: architecture of the Cloud Engine]

• Cloud Gateway
• Cloud App 1, Cloud App 2, ..., Cloud App N
• Cloud Engine
• Users Track & Monit.
• Users Tracking DB
• OCCI API
• Cloud MW
• AWS

Page 40: Outline

Virtual execution environment: CLEVER

• Host Management Layer: the Host Manager (HM) performs physical resources monitoring and VEs allocation
• Cluster Management Layer: the Cluster Manager (CM) monitors the overall state of the cluster and “coordinates” the HMs
• External components: XMPP Server and Distributed Database
• XMPP advantages: host presence, open standard
• No central point of failure: fault tolerance mechanism with multiple CM instances

Page 41: Outline

Summary and conclusions

• e-Infrastructures can be very beneficial platforms (especially for cultural heritage), provided they are really «easy to use»
• Science Gateways with support for Identity Federations and Social Networks can revolutionize the way Grid infrastructures are used, hugely widening their potential user base, especially non-IT experts and the “citizen scientist”
• The adoption of standards (JSR 286, SAGA, SAML, etc.) represents a concrete investment towards sustainability
• By design, the components (the “portlets” – our “Lego bricks”) of our Science Gateways have maximum re-usability and, indeed, they have already been adopted in/by several projects (CHAIN, DECIDE, EarthServer, EUMEDGRID-Support, GISELA, INDICATE, etc.)
• If you want to integrate your applications in our Science Gateways, or simply enable your websites with our authentication tools, please contact me at [email protected]

Page 42: Outline

Thank you
