WP3 R-GMA: A Relational Grid information and monitoring system Steve Fisher / RAL 13/12/2002.

WP3

R-GMA: A Relational Grid information and monitoring system

Steve Fisher / RAL, 13/12/2002

<s.m.fisher@rl.ac.uk>

Representing

• Heriot-Watt, Edinburgh – Andrew W Cooke, Werner Nutt
• IBM-UK – James Magowan, Manfred Oevers, Paul Taylor
• Queen Mary, University of London – Ari Datta
• PPARC – Rob Byrom, Steve Hicks, Laurence Field, Manish Soni, Antony J. Wilson, Xiaomei Zhu
• Rutherford Appleton Laboratory – Linda Cornwall, Abdeslem Djaoui, Steve Fisher
• SZTAKI, Hungary – Norbert Podhorszki
• Trinity College Dublin – Brian Coghlan, Stuart Kenny, David O’Callaghan, John Ryan

What is wrong with LDAP-based solutions?

• Monitoring and information systems should be integrated
  – Information may be historical
    • For monitoring
    • With time stamps
• You may want information streamed to you
  – Subscribe to information
• There are very few hierarchies in the real world
  – Name a commercial HDBMS
• System should allow you to publish what you want
  – Current systems do not allow users to define and publish their own information
• System should allow you to find out what you want
  – LDAP tree has to be carefully designed to answer preconceived questions only

GMA

• From GGF
• Very simple model
• Does not define:
  – What registry looks like
  – How data are moved from Producer to Consumer
  – etc.

[Diagram: Producer, Consumer and Registry, with arrows labelled “Store location”, “Lookup location” and “execute or stream”]

R-GMA

• Use the GMA from GGF
• A relational implementation
• Applied to both information and monitoring
• Creates impression that you have one RDBMS per VO

[Diagram: Producer, Consumer and Registry, with arrows labelled “Store location”, “Lookup location” and “execute or stream”]

Relational Approach

• Not a general distributed RDBMS system, but a way to use the relational model in a distributed environment where global consistency is not important.
• Producers
  – announce: SQL “CREATE TABLE”
  – publish: SQL “INSERT”
• Consumers
  – collect: SQL “SELECT”
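
The three verbs above map onto ordinary SQL statements. The following is a minimal sketch only: it uses Python's built-in sqlite3 module as a stand-in for the virtual database (R-GMA itself is not a single RDBMS and this is not its API), and the table and column names are borrowed from the CPULoad example later in the talk.

```python
import sqlite3

# An in-memory SQLite database stands in for the "virtual database".
# This illustrates the SQL verbs only; it is not R-GMA's API.
db = sqlite3.connect(":memory:")

# Producer announces a table: SQL CREATE TABLE.
db.execute(
    "CREATE TABLE cpuLoad (country TEXT, site TEXT, facility TEXT, "
    "load REAL, timestamp TEXT)"
)

# Producer publishes a tuple: SQL INSERT.
db.execute(
    "INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)",
    ("UK", "RAL", "CDF", 0.3, "19055711022002"),
)

# Consumer collects tuples: SQL SELECT.
rows = db.execute(
    "SELECT site, facility, load FROM cpuLoad WHERE country = 'UK'"
).fetchall()
print(rows)
```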

Not just one Producer

• DataBaseProducer
  – Relatively slow
  – Information not lost
  – Clean up strategy needed
  – No streaming (though could be defined in principle)
  – Supports joins
• StreamProducer
  – Fast
  – Uses an SQL parser (no RDBMS involved)
  – Holds data in memory
  – Does not support joins
  – Can define minimum retention period

Not just one Producer

• ResilientProducer
  – Like the StreamProducer but won’t lose data if system crashes
  – So slightly slower
• LatestProducer
  – Just holds the latest information for any “primaryish” key
  – Supports joins
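
The difference between "holds the latest information per key" and "holds data in memory with a minimum retention period" can be sketched as follows. The class names LatestStore and StreamStore, and all of their methods, are hypothetical stand-ins invented for this illustration; they are not the LatestProducer or StreamProducer implementations.

```python
import time


class LatestStore:
    """Keeps only the newest row per key (illustrative stand-in)."""

    def __init__(self, key_columns):
        self.key_columns = key_columns
        self.rows = {}

    def insert(self, row):
        key = tuple(row[c] for c in self.key_columns)
        self.rows[key] = row  # overwrite any older row with the same key

    def select(self):
        return list(self.rows.values())


class StreamStore:
    """Keeps every row in memory for `retention` seconds (illustrative stand-in)."""

    def __init__(self, retention):
        self.retention = retention
        self.rows = []  # list of (arrival_time, row)

    def insert(self, row, now=None):
        self.rows.append((time.time() if now is None else now, row))

    def clean(self, now=None):
        # Drop rows whose age has reached the retention period.
        now = time.time() if now is None else now
        self.rows = [(t, r) for t, r in self.rows if now - t < self.retention]

    def select(self):
        return [r for _, r in self.rows]


latest = LatestStore(["site", "facility"])
latest.insert({"site": "RAL", "facility": "CDF", "load": 0.3})
latest.insert({"site": "RAL", "facility": "CDF", "load": 0.7})

stream = StreamStore(retention=60)
stream.insert({"job": 1, "state": "RUNNING"}, now=0)
stream.insert({"job": 1, "state": "DONE"}, now=50)
stream.clean(now=70)
```

After these calls the LatestStore holds one row (load 0.7), while the StreamStore has dropped the row that arrived at time 0 and kept the one from time 50.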

Canonical Producer

• Allows user defined code to be invoked to respond to SQL query
• Developed in collaboration with CrossGrid

[Diagram: User Code, CPAPI, CanonicalProducerServlet and Files. Labels: CreateTable, Port, Protocol, Security, SQL Support, Multiple Query Support; Security, Insert, Query, Port, Register]

Producer Inheritance

• This is not visible to the user

Inheritance chain, with each class inheriting from the one listed below it:

DataBaseProducer – Concrete Class (an example; some classes inherit from lower down)
Cleanable – Supports clean up mechanism
Insertable – Allows rows to be inserted
Declarable – Allows tables to be declared
APIBase – Methods shared by all our APIs, e.g. disconnect()
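
The chain above could be expressed as a plain class hierarchy. This is a sketch under assumptions: only the class names and disconnect() come from the slide; declare_table(), insert(), clean() and the in-memory storage are hypothetical names chosen for illustration.

```python
class APIBase:
    """Methods shared by all APIs, e.g. disconnect()."""

    def __init__(self):
        self.connected = True

    def disconnect(self):
        self.connected = False


class Declarable(APIBase):
    """Allows tables to be declared (method name is an assumption)."""

    def __init__(self):
        super().__init__()
        self.tables = {}

    def declare_table(self, name, columns):
        self.tables[name] = {"columns": list(columns), "rows": []}


class Insertable(Declarable):
    """Allows rows to be inserted (method name is an assumption)."""

    def insert(self, table, row):
        self.tables[table]["rows"].append(dict(row))


class Cleanable(Insertable):
    """Supports a clean-up mechanism (method name is an assumption)."""

    def clean(self, table, keep):
        rows = self.tables[table]["rows"]
        self.tables[table]["rows"] = [r for r in rows if keep(r)]


class DataBaseProducer(Cleanable):
    """Concrete class at the top of the chain."""


producer = DataBaseProducer()
producer.declare_table("cpuLoad", ["site", "load"])
producer.insert("cpuLoad", {"site": "RAL", "load": 0.3})
producer.insert("cpuLoad", {"site": "GLA", "load": 0.4})
producer.clean("cpuLoad", keep=lambda r: r["site"] == "RAL")
producer.disconnect()
```

A producer that needs no clean-up could inherit directly from Insertable, which is one reading of "some classes inherit from lower down".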

Archiver (Re-publisher)

• It is a combined Consumer-Producer
• You just have to tell it what to collect and it does so on your behalf
• It will re-publish to any kind of “Insertable”
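
A combined Consumer-Producer can be pictured as a filter that forwards matching rows to anything with an insert operation. The sketch below is illustrative only: Archiver here is a toy class, and ListStore, the `where` callable and the row layout are all assumptions, not R-GMA's API.

```python
class ListStore:
    """Any object with insert() counts as an "Insertable" in this sketch."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)


class Archiver:
    """Collects rows matching `where` and re-publishes them to `target`."""

    def __init__(self, where, target):
        self.where = where
        self.target = target

    def consume(self, rows):
        # Consumer side: receive rows; Producer side: re-insert the matches.
        for row in rows:
            if self.where(row):
                self.target.insert(row)


store = ListStore()
archiver = Archiver(where=lambda r: r["owner"] == "smith", target=store)
archiver.consume([
    {"job": 1, "owner": "smith", "state": "RUNNING"},
    {"job": 2, "owner": "jones", "state": "QUEUED"},
    {"job": 1, "owner": "smith", "state": "DONE"},
])
```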

R-GMA: use of components

[Diagram: two Bookkeeping Servers (BS), each with a StreamProducer; Archivers of A-H, I-N and O-Z, each with a StreamProducer; Archivers of A-H, I-N and O-Z, each with a LatestProducer; two Consumers: Smith (wants to be told of each change of state of his job) and Fitzwilliam (wants to look at current state of all his jobs)]

• Each Bookkeeping Server publishes to a StreamProducer
• Archiver has a where clause to collect jobs belonging to a subset of users
• Most queries will be satisfied by one Archiver
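
Why most queries are satisfied by one Archiver can be seen from a small sketch of the partitioning. It assumes the A-H, I-N and O-Z labels in the diagram refer to the first letter of the user's name; the function names are hypothetical.

```python
# Ranges taken from the diagram labels; partitioning by the first letter
# of the user name is an assumption made for this sketch.
RANGES = [("A", "H"), ("I", "N"), ("O", "Z")]


def in_range(user, low, high):
    """True if the user's initial falls inside the Archiver's range."""
    return low <= user[0].upper() <= high


def archiver_for(user):
    """Return the label of the single Archiver holding this user's jobs."""
    for low, high in RANGES:
        if in_range(user, low, high):
            return f"Archiver of {low}-{high}"
    raise ValueError(f"no Archiver covers user {user!r}")


smith_archiver = archiver_for("Smith")
fitzwilliam_archiver = archiver_for("Fitzwilliam")
print(smith_archiver, "/", fitzwilliam_archiver)
```

A query about one user's jobs touches exactly one range, so only one Archiver has to answer it.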

A user application: CMS

• BOSS for job tracking on local farm
  – It currently forks the executable and parses stdout to publish info directly to an SQL DB
  – They publish to one table per job type and one table which is common to all job types
• They will try publishing via R-GMA instead
  – Providing a scaleable Grid solution

R-GMA for parallel applications

• GRM is used to write information on parallel applications to a local file
• PROVE displays this information
• GRM is being modified to use R-GMA for transport

Displays - Pulse

A simple Java client

Command Line

• Looks rather like mysql command line
• Interactive or one command and quit

BrowserServlet

• JSP application
• Some fixed common queries
• Or compose your own

Displays - Nagios

• Looking to include Nagios as a presentation tool.
• Will write a Nagios plug-in to instantiate an Archiver and use that information to populate Nagios displays
• Can also benefit from Nagios alert mechanism
• Will have different configurations for Site, Country, whole Grid etc.

R-GMA from user perspective

• APIs in “all” languages
  – Java, C++, C, Python and Perl
• Easy installation and configuration
  – For developers
  – Installers
  – Users
• Highly portable (mostly Java)
• No dependence on other EDG software currently, but EDG security module is being integrated

R-GMA – How?

• Currently based on servlet technology
  – Tomcat
  – Multiple hand crafted APIs
    • Java, C++, C, Python and Perl
  – Soft state registration
  – Uniform exception handling
    • To ensure that useful messages and stack traces are preserved.
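
Soft state registration, listed above, generally means that a registration expires unless it is renewed, so entries for dead producers disappear without explicit de-registration. A minimal sketch of that general idea follows; the class, its methods and the 60-second lifetime are assumptions for illustration, not R-GMA's registry code.

```python
class SoftStateRegistry:
    """Registrations expire after `ttl` seconds unless renewed (illustrative)."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.expiry = {}  # producer name -> time at which the entry lapses

    def register(self, producer, now):
        # Registering again simply pushes the expiry time forward.
        self.expiry[producer] = now + self.ttl

    def lookup(self, now):
        # Purge lapsed entries, then return the producers still registered.
        self.expiry = {p: t for p, t in self.expiry.items() if t > now}
        return sorted(self.expiry)


registry = SoftStateRegistry(ttl=60)
registry.register("producer-1", now=0)
registry.register("producer-2", now=0)
registry.register("producer-1", now=50)  # renewal keeps producer-1 alive
alive = registry.lookup(now=90)
print(alive)
```

Here producer-2 never renews, so it has vanished from the registry by time 90 while producer-1 remains.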

R-GMA

• API – Servlet communication
  – http(s) in
  – XML back

[Diagram components: Sensor Code, ProducerAPI, ProducerServlet; Application Code, ConsumerAPI, Consumer Servlet; RegistryAPI (twice), Registry Servlet; SchemaAPI, Schema Servlet]

Schema & Contributions

CPULoad (Global Schema)

Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002

CPULoad (Producer 1)

UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 2)

UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 3)

CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002

Contributions are Views

CPULoad (Producer 1)

UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

SELECT * FROM cpuLoad
WHERE country = 'UK' AND site = 'RAL'

CPULoad (Producer 2)

UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002

SELECT * FROM cpuLoad
WHERE country = 'UK' AND site = 'GLA'
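
The idea that each producer's contribution is a view on the global table can be tried out with Python's built-in sqlite3 module. This is a sketch only: in R-GMA the global table is virtual and the data live with the producers, whereas here a real table is filled first and the views are derived from it purely to show the relationship. The view names producer1 and producer2 are invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE cpuLoad (country TEXT, site TEXT, facility TEXT, "
    "load REAL, timestamp TEXT)"
)
db.executemany(
    "INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)",
    [
        ("UK", "RAL", "CDF", 0.3, "19055711022002"),
        ("UK", "RAL", "ATLAS", 1.6, "19055611022002"),
        ("UK", "GLA", "CDF", 0.4, "19055811022002"),
        ("UK", "GLA", "ALICE", 0.5, "19055611022002"),
    ],
)

# Each contribution is described by a predicate over the global table.
db.execute(
    "CREATE VIEW producer1 AS "
    "SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'"
)
db.execute(
    "CREATE VIEW producer2 AS "
    "SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'"
)

ral = db.execute(
    "SELECT facility, load FROM producer1 ORDER BY facility"
).fetchall()
gla = db.execute(
    "SELECT facility, load FROM producer2 ORDER BY facility"
).fetchall()
print(ral, gla)
```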

The mediator

• Producers are associated with views on a virtual database. Currently views have the form:
  – SELECT * FROM <table> WHERE <predicate>
• Queries are posed against the virtual database
• The Mediator must:
  – find the right Producers
  – combine information from them
• Can now merge information from several producers
• The final mediator will take “any” SQL statement and do the right thing
• The mediator is hidden inside the ConsumerServlet but is the component which makes R-GMA easy to use
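
The two mediator tasks, finding the right producers and combining their information, can be sketched for the simplest case. This is a toy under stated assumptions: view predicates and queries are restricted to conjunctions of column = value tests, represented as dictionaries, and the functions relevant() and mediate() are invented here. It says nothing about how the real mediator is implemented.

```python
def relevant(view, query):
    """A producer is relevant unless its view fixes a column to a value
    that contradicts the query."""
    return all(query.get(col, val) == val for col, val in view.items())


def mediate(producers, query):
    """Contact only the relevant producers and merge their matching rows."""
    contacted, results = [], []
    for index, (view, rows) in enumerate(producers):
        if relevant(view, query):
            contacted.append(index)
            results.extend(
                r for r in rows if all(r[c] == v for c, v in query.items())
            )
    return contacted, results


# (view predicate, rows held by that producer)
producers = [
    ({"country": "UK", "site": "RAL"}, [
        {"country": "UK", "site": "RAL", "facility": "CDF", "load": 0.3},
        {"country": "UK", "site": "RAL", "facility": "ATLAS", "load": 1.6},
    ]),
    ({"country": "UK", "site": "GLA"}, [
        {"country": "UK", "site": "GLA", "facility": "CDF", "load": 0.4},
        {"country": "UK", "site": "GLA", "facility": "ALICE", "load": 0.5},
    ]),
    ({"country": "CH", "site": "CERN"}, [
        {"country": "CH", "site": "CERN", "facility": "CDF", "load": 0.6},
    ]),
]

contacted, rows = mediate(producers, {"country": "UK", "facility": "CDF"})
print(contacted, [r["load"] for r in rows])
```

The CERN producer is never contacted for this query because its view predicate (country = 'CH') contradicts the query.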

Registry & schema distribution

• Will have one logical registry and schema per VO
• Each logical registry will have multiple physical “copies”

[Diagram: Producer1 paired with Registry1, which holds the info mastered by Registry1 and a copy of the info from Registry2; Producer2 paired with Registry2, which holds the info mastered by Registry2 and a copy of the info from Registry1]
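
One way to picture "one logical registry, multiple physical copies" is that each physical registry masters the entries registered with it and keeps a copy of what its peers master. The sketch below assumes a simple pull-style copy; the class, the method names and the synchronisation scheme are all hypothetical.

```python
class Registry:
    """A physical registry: masters its own entries, copies its peers' entries."""

    def __init__(self, name):
        self.name = name
        self.mastered = {}  # entries registered directly with this registry
        self.copies = {}    # peer name -> snapshot of that peer's mastered entries

    def register(self, producer, view):
        self.mastered[producer] = view

    def sync_from(self, peer):
        # Replace our snapshot of the peer's mastered information.
        self.copies[peer.name] = dict(peer.mastered)

    def lookup(self):
        # The logical registry is the union of mastered and copied entries.
        merged = dict(self.mastered)
        for snapshot in self.copies.values():
            merged.update(snapshot)
        return merged


registry1 = Registry("Registry1")
registry2 = Registry("Registry2")
registry1.register("Producer1", "site = 'RAL'")
registry2.register("Producer2", "site = 'GLA'")
registry1.sync_from(registry2)
registry2.sync_from(registry1)
```

After the exchange, a lookup against either physical registry returns both producers.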

Security

• Adding edg-security for authorisation
  – Gives secure socket factory for https
• Plan to use VOMS

OGSIfication

• Have recently started the migration to web and grid services
  – Apache Axis
  – WSDL generated APIs
  – Will provide a wrapper for backwards compatibility

R-GMA - OGSIfication

• API – Servlet communication
  – http(s) in
  – XML back

[Diagram components: Sensor Code, ProducerAPI, ProducerServlet; Application Code, ConsumerAPI, Consumer Servlet; RegistryAPI (twice), Registry Servlet; SchemaAPI, Schema Servlet]

Step 1 - Isolate Servlets

• API – Servlet communication
  – http(s) in
  – XML back

[Diagram components: Sensor Code, ProducerAPI, ProducerInstance; Application Code, ConsumerAPI, Consumer Instance; RegistryAPI (twice), Registry; SchemaAPI, Schema]

Step 2 - Web Services

• API – derived from WSDL
• Use SOAP
• Issue: HTTP Streaming

[Diagram components: Sensor, ProducerAPI, Producer “Factory”, ProducerInstance; Application, ConsumerAPI, Consumer “Factory”, Consumer Instance; Registry; Schema; each service labelled with PortTypes]

Step 3 - OGSA

• All Grid Services
• OGSA Factories, GSH, GSR
• Registry includes HandleMapper
• SQL as Service Data Element Query Language

[Diagram components: Sensor, ProducerAPI, ProducerFactory, ProducerInstance; Application, ConsumerAPI, ConsumerFactory, ConsumerInstance; Registry; Schema]

OGSIfication issues

• Consider XML as internal representation of service data elements
  – Depends on other developments
• Consider XQuery as service data elements query language
  – Depends on how XQuery develops
• X-GMA ??

When?

• In the 24th month of the project R-GMA is still not deployed!
• Only bug fixes are being made to the EDG testbed
• To provide exposure and field testing, we are starting to deploy widely in the UK and in Italy

…and finally

• It is hard to make an efficient reliable distributed system without single points of failure and bottlenecks
  – This is probably not going to surprise anyone
• Most of next year will be spent on trying to achieve reliability and performance
  – rather than adding much new functionality
• Code is being developed under the EDG open source (BSD style) software license
  – http://www.edg.org/license.html
  – All contributions are most welcome