OSG Operations
All Hands Meeting
Rob Quick (Ops Coordinator)
Slides by: Scott Teige and Kyle Gross
March 2011 2
Support Overview
• Communications Hub
• Coordinate Ticketing & Exchanges
• End-user Support
• OSG RA
• Documentation
Communications Hub
• 24x7 Telephone – 1-317-278-9699
• 24x7 Email – [email protected]
• 24x7 Ticket Creation
Leverage the 24 hour coverage of the GRNOC at IU
• Community Notification Tools
• Blogspot postings, Twitter and RSS feed
    http://osggoc.blogspot.com/
    Twitter: OSGGOC (test)
• Weekly Operations Meeting Mondays
Ticketing & Ticket Exchange
• Central OSG Ticket System
• GOCTicket interface
    http://ticket.grid.iu.edu
• Ticket Exchange – SC, GGUS, GOC-TX
• 10,000 ticket milestone – 2/22/2011
End User Support
• OIM Registration http://oim.grid.iu.edu/
• VOMS (MIS, OSGEDU, CSIU)
• Certificate Requests
• Twiki Support
OSG RA
• Alain Deximo as new OSG RA
• Updating Procedures/Docs for effective backup
• Other than new POC (Alain), transparent to users
Documentation
• Work with OSG Documentation Team
    Help them with Twiki setup
    https://twiki.grid.iu.edu/twiki
• Cleaning up Operations Docs
Service Overview
• Information Services
    Information to people
    Information to machines
• Accounting Services
• Monitoring Services
• Collaborative Services
MyOSG
• http://myosg.grid.iu.edu
Display
• http://display.grid.iu.edu/
OIM
• Open Science Grid Information Management
• https://oim.grid.iu.edu
• Semi-static information to people and machines
• Find contacts, VO information, resources, much more
BDII
• Berkeley Database Information Index
• Mostly provides information to machines
• Most critical service for GOC
• Dynamic information, ~2 minute period
• Many services depend on BDII
• http://is.grid.iu.edu
Some information to people
Ticket
• https://ticket.grid.iu.edu/goc/open
• Don’t get stuck, cut a ticket
• Ticket Exchange
GOC ticketing system interacts with other support organization ticket systems via the ticket exchange.
This allows seamless interaction of multiple ticket systems, which appear to behave as one system.
RSV
• Resource and Service Validation
WLCG Comparison
• An accounting service
• Some OSG resources are also WLCG resources
• Separate accounting systems
Software Cache
• http://software.grid.iu.edu
• Pointers to VDT software
• Certificate Authority Distribution
    http://software.grid.iu.edu/pacman/cadist/
• VO package
• Certificate requests
xxx-ITB
• Ditto above but for testing
• 1st and 3rd Tuesdays updates to ITB
    You are encouraged to test services, particularly those of interest to you
• 2nd and 4th Tuesdays updates to Prod.
• 5th Tuesday, the GOC rests.
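The Tuesday release rhythm above can be sketched in a few lines of Python. This is only an illustration of the rule as stated on the slide; the function name `tuesday_update_target` and its return labels are made up for this example and are not part of any GOC tooling.

```python
import datetime


def tuesday_update_target(day: datetime.date) -> str:
    """Return which environment is updated on `day` under the slide's rule.

    Labels ("ITB", "production", "rest", "none") are hypothetical names
    chosen for this sketch.
    """
    if day.weekday() != 1:  # Monday is 0, so Tuesday is 1
        return "none"
    # Which Tuesday of the month this is: days 1-7 -> 1st, 8-14 -> 2nd, ...
    ordinal = (day.day - 1) // 7 + 1
    if ordinal in (1, 3):
        return "ITB"
    if ordinal in (2, 4):
        return "production"
    return "rest"  # a 5th Tuesday: the GOC rests


if __name__ == "__main__":
    for d in (1, 8, 15, 22, 29):
        date = datetime.date(2011, 3, d)
        print(date.isoformat(), tuesday_update_target(date))
```

March 2011 happens to have five Tuesdays, so it exercises every branch of the rule.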
Change Management and Ops Meetings
• Change Management Review Tuesdays
https://twiki.grid.iu.edu/bin/view/Operations/ChangeMgmtMeetingMinutes
Recap from the Ops Coordinator
• 15 Minutes
• Sustainability
• “Yet, in spite of these spectacular strides in science and technology, and still unlimited ones to come, something basic is missing… We have learned to fly the air like birds and swim the sea like fish, but we have not learned the simple art of living together as brothers.” -MLK
Three things you’ve just gotta know about the VDT (And Frank)
Alain Roy
Open Science Grid Software Coordinator
But first a poem
I have a flower on my head
By Andrea Roy
I have a
Flower on my head
What should I do?
Should I water it? I think so.
The three things you just gotta know about the VDT
1. RSV is way cooler
2. RPMs for the VDT are on the way
3. CREAM is coming to the VDT soon
1. RSV is way cooler
As of February 7th, OSG 1.2.17, RSV is just so much cooler for three main reasons:
1. Common RSV tasks are made simple with the new rsv-control command.
2. It is really easy to extend RSV with new probes
    If you can write a script to test something, you can put it into RSV.
    Is there something else you’d like to test?
3. Standalone installations are much easier (with config.ini)
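To make the "if you can write a script" point concrete, here is a rough Python sketch of a test script that prints a status record. The field names are copied from the sample `rsv-control --run` output in this talk; everything else is an assumption for illustration: the metric name `org.example.tmp-writable`, the check itself, and the `CRITICAL` failure label. This is not the RSV probe interface, so consult the RSV documentation before writing a real probe.

```python
import datetime
import os
import tempfile


def format_status_record(metric_name, service_type, host, ok, details, timestamp=None):
    """Format one status record using the field names seen in the sample output."""
    if timestamp is None:
        timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    status = "OK" if ok else "CRITICAL"  # "CRITICAL" is an assumed failure label
    lines = [
        f"metricName: {metric_name}",
        "metricType: status",
        f"timestamp: {timestamp}",
        f"metricStatus: {status}",
        f"serviceType: {service_type}",
        f"serviceURI: {host}",
        f"gatheredAt: {host}",
        f"summaryData: {status}",
        f"detailsData: {details}",
        "EOT",  # end-of-record marker, as in the sample output
    ]
    return "\n".join(lines)


def check_tmp_writable():
    """Hypothetical check: can this process write to the temporary directory?"""
    tmp_dir = tempfile.gettempdir()
    ok = os.access(tmp_dir, os.W_OK)
    details = f"{tmp_dir} is {'writable' if ok else 'not writable'}"
    return ok, details


if __name__ == "__main__":
    ok, details = check_tmp_writable()
    print(format_status_record("org.example.tmp-writable", "OSG-CE", "localhost", ok, details))
```

Keeping the check and the record formatting in separate functions means a new test only needs a new check function.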
Easy to list your RSV probes!

% rsv-control --list

Metrics enabled for host: osg-edu.cs.wisc.edu:10443       | Service
----------------------------------------------------------+--------
org.osg.srm.srmcp-readwrite                               | OSG-SRM
org.osg.srm.srmping                                       | OSG-SRM

Metrics enabled for host: osg-edu.cs.wisc.edu             | Service
----------------------------------------------------------+--------
org.osg.batch.jobmanager-default-status                   | OSG-CE
org.osg.batch.jobmanagers-available                       | OSG-CE
org.osg.certificates.cacert-expiry                        | OSG-CE
...
Easy to see the RSV jobs!
% rsv-control --job-list
Hostname: osg-edu.cs.wisc.edu
  ID     OWNER  ST  NEXT RUN TIME  METRIC
  659.0  rsv    I   03-06 10:08    org.osg.globus.gridftp-simple
  660.0  rsv    I   03-06 09:32    org.osg.gip.lastrun
  661.0  rsv    R   03-06 18:47    org.osg.general.vdt-version
  ...

Hostname: osg-edu.cs.wisc.edu:10443
  ID     OWNER  ST  NEXT RUN TIME  METRIC
  655.0  rsv    I   03-06 09:33    org.osg.srm.srmping
  656.0  rsv    R   03-06 09:28    org.osg.srm.srmcp-readwrite

  ID     OWNER  ST  CONSUMER
  679.0  rsv    R   html-consumer
  680.0  rsv    R   gratia-consumer
Easy to enable/disable RSV probes!
% rsv-control --enable --host osg-edu.cs.wisc.edu \
      org.osg.ress.ress-classad-exists
Enabling metric 'classad-exists' for host 'osg-edu.cs.wisc.edu'
One or more metrics have been enabled and will be started the next time RSV is started. To turn them on immediately run 'rsv-control --on'.
Easy to run a probe right now!
% rsv-control --run --host osg-edu.cs.wisc.edu org.osg.general.osg-version
Running metric org.osg.general.osg-version:
metricName: org.osg.general.osg-version
metricType: status
timestamp: 2011-03-06 09:24:42 CST
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: osg-edu.cs.wisc.edu
summaryData: OK
detailsData: OSG 1.2.18
EOT
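If you want to post-process this kind of output, the record is easy to read back. Below is a minimal Python sketch, assuming only what the samples in this talk show: `key: value` lines, `detailsData` as the last field (possibly spanning several lines), and `EOT` ending the record. The function name `parse_metric_record` is hypothetical, and other RSV output may not fit these assumptions.

```python
SAMPLE = """\
Running metric org.osg.general.osg-version:

metricName: org.osg.general.osg-version
metricType: status
timestamp: 2011-03-06 09:24:42 CST
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: osg-edu.cs.wisc.edu
summaryData: OK
detailsData: OSG 1.2.18
EOT
"""


def parse_metric_record(text):
    """Parse one 'key: value' record terminated by EOT into a dict."""
    record = {}
    in_details = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "EOT":
            break  # end of this record
        if in_details:
            # Assumption: everything after detailsData belongs to it.
            record["detailsData"] += "\n" + line
            continue
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # skip blank lines and the "Running metric ...:" banner
        record[key] = value
        if key == "detailsData":
            in_details = True
    return record


if __name__ == "__main__":
    for key, value in parse_metric_record(SAMPLE).items():
        print(f"{key} = {value}")
```

Splitting on the first `": "` only keeps values such as the timestamp intact even though they contain colons.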
Easy to run all probes to refresh
% rsv-control --run --all-enabled
Running metric org.osg.certificates.cacert-expiry (1 of 24)
metricName: org.osg.certificates.cacert-expiry
metricType: status
timestamp: 2011-03-07 07:40:40 CST
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: osg-edu.cs.wisc.edu
summaryData: OK
detailsData: Security Probe Version: 1.1
OK: CAs are in sync with OSG distribution
EOT
Running metric org.osg.general.osg-directories-CE-permissions (2 of 24)...
Straightforward to get debugging info
% rsv-control --verify
Testing if Condor-Cron is running...OK
Testing if metrics are running...OK (24 running metrics)
Testing if consumers are running...OK (2 running consumers)
Checking which consumers are configured...
The following consumers are enabled: html-consumer gratia-consumer

% rsv-control --profile
Running the rsv-profiler...
OSG-RSV Profiler
Analyzing...
Making tarball (rsv-profiler.tar.gz)
And now a slight detour: Frank
• Frank [last-name removed]
• Wrote some code for Condor that “worked”.
• But he meant:
    Works == Compiles
• A common mistake for beginners, so we won’t hold it against him.
• But it’s a useful indication of progress:
    A lot has been done, but it requires more before you can test it.
2. RPMs for the VDT are on the way
• We have franked binary RPMs without configuration for:
    gLexec
      (Actually, they’ve been tested pretty well)
    Xrootd
    95% of the worker node (56/59 RPMs)
      Currently missing: FTS client
• They are in a yum repo and will be available for testing soon.
3. CREAM is coming to the VDT soon
• Basic CREAM install via Pacman
    Currently franks, but known problems
    End of March
• CREAM install via RPMs
    End of April
    And then a period of testing/finalizing
• Ready for production by September
• Timeline driven by ATLAS needs
I’m happy if you leave with those three things
1. RSV is way cooler
2. RPMs for the VDT are on the way
3. CREAM is coming to the VDT soon
But I’ll say two more things:
Two More Things
• Plan for next round of OSG:
    Do RPMs right: source packages, intermix with external dependencies neatly…
    Community-oriented distributions
• We are getting better about collecting accurate requirements and reporting work plans/time lines
But wait! There’s more!
• The Second Annual OSG Summer School!
    June 26-30, 2011
    Learn about high-throughput computing, OSG, and more!
    Tell anyone that would be interested, spread the word!
    https://twiki.grid.iu.edu/bin/view/Education/OSGSummerSchool2011
Any Questions?
• I’m here until Thursday—please come and talk to me.
• Or email me: [email protected]