Status of Task Forces
Josep Flix (PIC/CIEMAT), on behalf of the WLCG Operations Coordination Team


WLCG Operations Coordination F2F – CERN [11th February 2014]


Ongoing Task Forces


Today’s detailed talks

‣ I am providing here a brief summary for some other TFs


gLExec deployment

‣ Multi-user pilot jobs should use gLExec to change user identity. The TF aims to coordinate the deployment of gLExec without interfering with current experiment workflows

‣ Each site has its gLExec infrastructure regularly tested through SAM tests (at some point to become critical)

‣ # of closed tickets is 75
‣ # of open tickets is 20

‣ A serious bug in gLExec has been discovered: https://twiki.cern.ch/twiki/bin/view/LCG/GlexecDeployment#Known_issues

‣ CMS will make the gLExec SAM test critical soon

https://twiki.cern.ch/twiki/bin/view/LCG/GlexecDeploymentTracking


perfSONAR deployment

‣ Goal is to encourage all WLCG sites to deploy, configure and register perfSONAR-PS instances gathering network metrics on the network paths for all of the WLCG sites

‣ A new release (3.3.2) is available:
  ‣ sites should upgrade. The procedure is straightforward and requires no re-configuration

‣ A campaign to get remaining sites installed and out-of-date installations upgraded is still ongoing (tickets)

‣ Several sites are lagging behind in this deployment:
  ‣ WLCG OpsCoord is raising the issue with the WLCG management and experiments [I. Bird @ LHCONE WS]

https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment
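
As a rough illustration of how out-of-date installations could be flagged against the 3.3.2 release mentioned above, the sketch below compares dotted version strings. The site names and the way installed versions are collected are hypothetical and not part of the TF procedure; only the target version comes from the slide.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '3.3.2' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(installed: str, target: str = "3.3.2") -> bool:
    """Return True when the installed version is older than the target release."""
    return parse_version(installed) < parse_version(target)


def out_of_date(sites: Dict[str, str], target: str = "3.3.2") -> List[str]:
    """Return the sorted names of sites whose installed version is older than target.

    The mapping of site name to version is assumed to be gathered elsewhere.
    """
    return sorted(name for name, version in sites.items() if needs_upgrade(version, target))


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = {"SITE-A": "3.3.2", "SITE-B": "3.3.1", "SITE-C": "3.2"}
    print(out_of_date(inventory))
```

Tuple comparison is used so that multi-digit components are ordered numerically rather than as text.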


Tracking Tools Evolution I

‣ Developers, deployers, experts of GGUS, SNOW, Savannah, JIRA and the experiments discuss development options for each tool and interfaces between them, when required

‣ GGUS releases:
  ‣ Last release done on 29th Jan. 2014
  ‣ Includes several minor bug fixes, and new WLCG Monitoring SU
  ‣ Next release: 26th of February
  ‣ Prototype of multiple site notification expected before the end of the month in a test instance (hopefully in Prod. by March)


Tracking Tools Evolution II

‣ Developers, deployers, experts of GGUS, SNOW, Savannah, JIRA and the experiments discuss development options for each tool and interfaces between them, when required

‣ Savannah to JIRA migration:
  ‣ Very slow progress in this area
  ‣ Main issue for the 'GGUS Shopping list' tracker (cross-references between tickets) still not solved after more than one year
  ‣ Other trackers do not depend on this functionality, so it might be the moment to accept that these references will be lost during the migration

https://twiki.cern.ch/twiki/bin/view/LCG/TrackingToolsEvolution


XrootD deployment

‣ The aim of this task force is to help the deployment at the WLCG sites of the Xrootd federated data storage for the FAX (ATLAS) and AAA (CMS) projects.

‣ Campaign for publishing xrootd endpoints in GOC/OIM is about to start (tickets!!)

‣ this will ease the operations and monitoring effort

https://twiki.cern.ch/twiki/bin/view/LCG/XrootdDeployment


SHA-2 Migration

‣ How services used by WLCG VOs (ALICE, ATLAS, CMS, DTEAM, LHCb, ops) can be tested for SHA-2 readiness

‣ The EOS SRM for LHCb is not OK yet
  ‣ patch needed to support the "root" protocol expected by LHCb jobs

‣ voms-proxy-init on lxplus crashes when creating SHA-2 RFC proxies (discovered by CMS)
  ‣ works OK with Java-based version provided by voms-clients3

‣ VOMRS:
  ‣ VOMS-Admin test cluster will soon be available
  ‣ host certs of future VOMS service from new SHA-2 CERN CA
  ‣ campaign to get the new servers recognized in LSC files across the Grid (also provide such files in rpms)

https://twiki.cern.ch/twiki/bin/view/LCG/SHA2readinessTesting
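
One small ingredient of such testing is knowing whether a given certificate is itself signed with a SHA-2 digest. The sketch below is not the TF's test procedure; it only inspects the text that `openssl x509 -noout -text` prints for a certificate and looks at the `Signature Algorithm` field. The function names are illustrative, and whether a service accepts such certificates still has to be tested separately.

```python
import re
from typing import Optional

# Digest names belonging to the SHA-2 family, as they appear in algorithm names.
SHA2_DIGESTS = ("sha224", "sha256", "sha384", "sha512")


def signature_algorithm(openssl_text: str) -> Optional[str]:
    """Extract the first 'Signature Algorithm' value from openssl text output."""
    match = re.search(r"Signature Algorithm:\s*(\S+)", openssl_text)
    return match.group(1) if match else None


def uses_sha2(openssl_text: str) -> bool:
    """Return True when the certificate's signature algorithm names a SHA-2 digest."""
    algorithm = signature_algorithm(openssl_text)
    if algorithm is None:
        return False
    name = algorithm.lower()
    return any(digest in name for digest in SHA2_DIGESTS)


if __name__ == "__main__":
    sample = "    Signature Algorithm: sha256WithRSAEncryption\n"
    print(signature_algorithm(sample), uses_sha2(sample))
```

The text to inspect could be produced with `openssl x509 -in hostcert.pem -noout -text`, where the file name is a placeholder.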


Machine/Job Features

‣ Machine/Job features to provide information from a resource provider (batch system, IaaS) to the payload:
  ‣ static (e.g. power of the machine, number of cores, local scratch space)
  ‣ dynamic (e.g. shutdown time of a VM)

‣ Current prototype at CERN lxbatch (bare metal / vWNs):
  ‣ received feedback from ALICE, who were testing mjf on the CERN batch nodes (waiting for feedback from ATLAS/CMS)

‣ For cloud-like installations the TF has decided to look into alternative ways of communicating the features:
  ‣ investigating NoSQL key/value stores as a viable alternative. A test instance has been set up and is being validated right now

https://twiki.cern.ch/twiki/bin/view/LCG/MachineJobFeatures
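
To make the static/dynamic key/value idea concrete, here is a minimal sketch of how a payload might read such features. It assumes, as an illustration only, that the resource provider exposes one small file per key in a directory named by an environment variable; the variable name `MACHINEFEATURES` and the keys `hs06` and `shutdowntime` are assumptions, not taken from this talk.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def read_features(env_var: str, environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read key/value features from the directory named by env_var.

    Assumption: one text file per key, file name = key, file content = value.
    Returns an empty dict when the variable is unset or is not a directory.
    """
    env = os.environ if environ is None else environ
    directory = env.get(env_var)
    if not directory:
        return {}
    root = Path(directory)
    if not root.is_dir():
        return {}
    features: Dict[str, str] = {}
    for entry in sorted(root.iterdir()):
        if entry.is_file():
            features[entry.name] = entry.read_text().strip()
    return features


def seconds_until_shutdown(features: Mapping[str, str], now: float) -> Optional[int]:
    """Return the seconds left before an announced shutdown, or None if none is announced.

    Assumption: the dynamic key 'shutdowntime' holds a Unix timestamp in seconds.
    """
    raw = features.get("shutdowntime")
    if raw is None:
        return None
    return int(raw) - int(now)
```

A payload could call `read_features("MACHINEFEATURES")` at start-up and poll `seconds_until_shutdown` to decide whether another unit of work still fits. With a key/value store, only the body of `read_features` would change.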


IPv6 validation and deployment

‣ The imminent exhaustion of the IPv4 address space will eventually require migrating the WLCG services to an IPv6 infrastructure. The TF works in close relation with the HEPiX IPv6 Working Group

‣ Agreed at the last F2F that it would be beneficial to progress with volunteering sites moving to dual stack
  ‣ trying to understand how to make sure the instability this would cause does not have a negative impact on the site (?)

‣ A document for the MB is being prepared, covering also the case of the MW readiness WG

https://twiki.cern.ch/twiki/bin/view/LCG/WlcgIpv6
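
A first, very rough check of whether a service host is dual stack is to see if its name resolves to both IPv4 and IPv6 addresses. The sketch below uses only the standard `socket.getaddrinfo` call. It says nothing about whether the service actually listens or behaves correctly over IPv6, and the `resolver` parameter is an illustrative hook added here for testing.

```python
import socket
from typing import Callable, Set


def address_families(host: str, port: int = 443,
                     resolver: Callable = socket.getaddrinfo) -> Set[str]:
    """Return the set of IP families ('IPv4', 'IPv6') that the host name resolves to."""
    try:
        results = resolver(host, port, 0, socket.SOCK_STREAM)
    except socket.gaierror:
        # Name resolution failed: report no address families.
        return set()
    families: Set[str] = set()
    for family, *_rest in results:
        if family == socket.AF_INET:
            families.add("IPv4")
        elif family == socket.AF_INET6:
            families.add("IPv6")
    return families


def is_dual_stack(host: str, port: int = 443,
                  resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True when the host name resolves to both IPv4 and IPv6 addresses."""
    return address_families(host, port, resolver) == {"IPv4", "IPv6"}
```

Results depend on the resolver configuration of the machine running the check, so a negative answer from one location is not conclusive.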


Conclusions

‣ All of the TFs are progressing well!

‣ Sites and experiments are encouraged to actively participate in the discussions and the TFs!

‣ WLCG Operations coordination twiki: https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsCoordination

‣ Mailing list: [email protected]