Operational Tools M1 Update


Page 1: Operational Tools M1 Update

EGEE-III INFSO-RI-222667

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

Operational Tools M1 Update

James Casey, SA1 Management Meeting, CERN

Page 2: Operational Tools M1 Update

Summary of milestone timeline

We are here…

Page 3: Operational Tools M1 Update

M1 Features - April 2009

• Regional Dashboard
– ‘Regionalized’ dashboards at IN2P3 using current SAM tests for alarms

• GOCDB
– Programmatic Interface (XML over HTTP) available
– GOCDB 4 schema deployed with current data inserted for validation

• Configuration repositories
– Aggregate Topology Provider (ATP): What resources should I test?
– Metric Description Database: What tests should I use?

• Gstat
– First prototype of new monitoring (based on Nagios) done

Status markers (as extracted): DONE, DONE, DONE, DONE, DONE, IN PROGRESS
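
The "XML over HTTP" programmatic interface mentioned for GOCDB lends itself to a short client-side illustration. The following is only a minimal sketch under stated assumptions: the element and attribute names (`results`, `SITE`, `NAME`, `ROC`, `STATUS`), the sample site names, and the helper functions are all hypothetical and are not taken from the actual GOCDB schema. `fetch_xml` would need a real endpoint URL, which the slides do not give.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Hypothetical response body. The real GOCDB output format may differ;
# every element and attribute name here is an assumption.
SAMPLE_XML = """<results>
  <SITE NAME="EXAMPLE-SITE-A" ROC="ExampleROC" STATUS="Certified"/>
  <SITE NAME="EXAMPLE-SITE-B" ROC="ExampleROC" STATUS="Uncertified"/>
  <SITE NAME="EXAMPLE-SITE-C" ROC="OtherROC" STATUS="Certified"/>
</results>
"""


def fetch_xml(url, timeout=30):
    """Fetch an XML document over HTTP and return it as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_sites(xml_text):
    """Turn each (assumed) SITE element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return [
        {
            "name": site.get("NAME"),
            "roc": site.get("ROC"),
            "status": site.get("STATUS"),
        }
        for site in root.findall("SITE")
    ]


def certified_sites_by_roc(sites):
    """Group the names of certified sites by their ROC."""
    grouped = {}
    for site in sites:
        if site["status"] == "Certified":
            grouped.setdefault(site["roc"], []).append(site["name"])
    return grouped


if __name__ == "__main__":
    # Uses the embedded sample instead of a live endpoint.
    print(certified_sites_by_roc(parse_sites(SAMPLE_XML)))
```

In this sketch the parsing is kept separate from the HTTP fetch so that the same functions can be exercised against a saved response when no server is reachable.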

Page 4: Operational Tools M1 Update

M1 Features – April 09

• ROC-level Nagios-based monitoring available
– Configured from Metric Description Database and ATP
– ‘SAM Portal’ level of visualization complete

• Full Nagios testing of all resources in grid running
– At CERN: central system, simulating 11 ROCs
– Used to validate equivalence to SAM
– Availability calculation using current algorithm but with new metrics

• QR Reporting Portal (MSA1.3)
– Initial version with metrics for job usage implemented

• Accounting
– Central infrastructure for ActiveMQ-based accounting deployed
– Consumer and Producer developed

Status markers (as extracted): DONE, PENDING, IN PROGRESS, DONE, DONE, DONE, DONE, DONE, IN PROGRESS
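
The idea of configuring Nagios from the Metric Description Database and ATP can be illustrated by joining "what resources should I test?" with "what tests should I use?" and rendering the result as Nagios service definitions. This is a sketch only: the record layouts, host names, metric names, check commands, and the `generic-service` template are hypothetical placeholders, not the actual ATP or Metric Description Database formats or the OAT configuration generator.

```python
# Hypothetical topology records, as an ATP-like source might supply them.
TOPOLOGY = [
    {"host": "ce01.example.org", "service_type": "CE", "site": "EXAMPLE-SITE-A"},
    {"host": "se01.example.org", "service_type": "SRM", "site": "EXAMPLE-SITE-A"},
]

# Hypothetical metric descriptions, keyed by the service type they apply to.
METRICS = {
    "CE": [
        {"name": "example.CE-JobSubmit",
         "command": "check_example_job_submit",
         "interval_min": 60},
    ],
    "SRM": [
        {"name": "example.SRM-Put",
         "command": "check_example_srm_put",
         "interval_min": 30},
        {"name": "example.SRM-Get",
         "command": "check_example_srm_get",
         "interval_min": 30},
    ],
}


def build_checks(topology, metrics):
    """Pair every resource with the metrics defined for its service type."""
    checks = []
    for resource in topology:
        for metric in metrics.get(resource["service_type"], []):
            checks.append((resource["host"], metric))
    return checks


def render_service(host, metric, template="generic-service"):
    """Render one Nagios service object definition as text."""
    lines = [
        "define service {",
        f"    use                  {template}",
        f"    host_name            {host}",
        f"    service_description  {metric['name']}",
        f"    check_command        {metric['command']}",
        f"    check_interval       {metric['interval_min']}",
        "}",
    ]
    return "\n".join(lines)


def render_config(topology, metrics):
    """Render all service definitions, separated by blank lines."""
    blocks = [render_service(h, m) for h, m in build_checks(topology, metrics)]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(render_config(TOPOLOGY, METRICS))
```

Resources whose service type has no metric description simply produce no checks in this sketch; a real generator would probably want to report them instead.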

Page 5: Operational Tools M1 Update

M1 Objectives summary

• Mostly completed
– Issues understood and, where appropriate, new timelines in place

• Nagios probe equivalence took longer than expected and delayed some other components
– ‘SAM Portal’, SLA Calculation

• Reduced effort at CERN due to re-hiring interviews and hardware provisioning delays slowed delivery of some components
– ATP, Metric Store

• All details in ‘Operations Automation Team Milestone 1 Summary’
– https://espace.cern.ch/sa1-share/oat/Shared%20Documents/Milestones%20and%20Deliverables/EGEE-III-SA1-TEC-OAT-M1-Summary-v1_1.pdf

Page 6: Operational Tools M1 Update

ROC Results in Site Nagios

Page 7: Operational Tools M1 Update

Gridview summarisation of Nagios results

Page 8: Operational Tools M1 Update

MyEGEE Portal

Page 9: Operational Tools M1 Update

MyEGEE Portal

Page 10: Operational Tools M1 Update

GStat

Page 11: Operational Tools M1 Update

Support for Messaging

• FUSE Message Broker is the same as Apache ActiveMQ. All ActiveMQ developers are employed by Progress, Inc.
– FUSE is an open-source rebundling of ActiveMQ. It’s the same code inside. There is no cost to ‘just use FUSE’.
– Currently bugs are being fixed sooner on the FUSE release than on the Apache release.

• Issue we are trying to solve: little expertise in SA1 and no resources for support

• FUSE (http://fusesource.com/) offers support for the ActiveMQ distribution we are using

• CERN is interested in getting a support contract for a set of core brokers (4-5?), paying for all of them until the end of EGEE III. This will solve the present problem of support and will increase expertise in the team
– If somebody else wants to set up a broker later on with the Apache version of the software, there’s no problem: ActiveMQ and FUSE versions interoperate

• The involved teams consulted through the OAT are happy with this

Page 12: Operational Tools M1 Update

Resources

https://twiki.cern.ch/twiki/bin/view/EGEE/OAT_EGEE_III

• Architecture and components
https://twiki.cern.ch/twiki/bin/view/EGEE/MultiLevelMonitoringOverview

• Milestone tracking
https://twiki.cern.ch/twiki/bin/view/EGEE/MultiLevelMonitoringMilestones