Distributed Computing Operations
Stefan Roiser
LHCb Computing Operations Workshop
27 Jan ‘15
Dist Comp Operations - StR 2
Content
• Roles
  – Shifters, GEOC, “LHCb 3rd line”
• Some ideas for Operations in Run2
  – LHCb Computing Operations Meeting
  – “SCRUM” in Operations?
• DIRAC/SAM jobs
Shifters
• What we did in the last ½ year
  – Shifters only assigned during “heavy campaigns”
    • e.g. Stripping 21
  – Was this useful? Shall we continue like this?
    • Prompt reco is certainly also a “heavy campaign”
• Some ideas on how to improve the situation
  – Use the new web portal as central piece of info
    • Shall contain ALL information relevant for shifters
    • To be maintained by the WLCG Comp Ops Coordinator
  – Changes done centrally will propagate to “subscribers” of the plots
  – Scratch everything else
    • www.lhcb-shifters.cern.ch, twiki,
Example of new shifters page
• How to import it
  – In “Settings” change role to “lhcb_shifter”
  – “Applications” -> “Public State Manager” -> “Desktops” -> “Shared Desktops” -> “Shifter_Overview” -> “Load”
Every page includes help -> “?”
Every “help” page has sections:
• “Introduction”
  – How this page is organized
• “Plots explained”
  – Detailed explanation of each (group of) plots
• “What to look for”
  – Hints on possible errors to check
• “Additional Info”
The GEOC role
• Central point of info for Distributed Computing Operations
  – Is the first contact point for everything that concerns LHCb Distributed Computing Operations
    • Manages and possibly solves all issues
    • May involve others: relay to “LHCb 3rd line” support or ask shifters for help
  – Receives info from Shifters, Sites, Production team, LHCb/DIRAC developers
  – Provides info to Sites, WLCG Services
  – Organizes and participates in LHCb/WLCG meetings
More ideas for GEOCs
• Shall we have a “handover time”, e.g. a shift of 8 days, so that on Mondays we have 2 GEOCs?
• More involvement in WLCG Ops, e.g. attending the “WLCG Operations Coordination meeting”?
• Do we need an “Organogram” of LHCb Computing?
  – Who is responsible for what in the “3rd line”?
• More involvement of the GEOC in other operational tasks?
  – E.g. production closing
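The “handover time” idea from this slide can be illustrated with a small schedule sketch. It assumes that every shift starts on a Monday and runs 8 days, i.e. up to and including the following Monday, so that two GEOCs overlap on each handover Monday. The names `geoc_a`, `geoc_b`, `geoc_c`, the start date and both helper functions are hypothetical and only serve the illustration.

```python
from datetime import date, timedelta

SHIFT_DAYS = 8  # Monday to the following Monday, inclusive (assumption)


def geoc_shifts(first_monday, geocs):
    """Return (name, start, end) tuples; consecutive shifts share one Monday."""
    if first_monday.weekday() != 0:
        raise ValueError("shifts are assumed to start on a Monday")
    shifts = []
    for week, name in enumerate(geocs):
        start = first_monday + timedelta(weeks=week)
        end = start + timedelta(days=SHIFT_DAYS - 1)
        shifts.append((name, start, end))
    return shifts


def on_duty(day, shifts):
    """Names of all GEOCs whose shift covers the given day."""
    return [name for name, start, end in shifts if start <= day <= end]


# Hypothetical roster starting on Monday, 2 Feb 2015
shifts = geoc_shifts(date(2015, 2, 2), ["geoc_a", "geoc_b", "geoc_c"])
print(on_duty(date(2015, 2, 9), shifts))   # handover Monday: two GEOCs
print(on_duty(date(2015, 2, 11), shifts))  # ordinary Wednesday: one GEOC
```

With 7-day shifts there would be no day on which the outgoing and incoming GEOC are both on duty; the eighth day is what creates the overlap.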
The “LHCb Computing Operations” meeting
• Currently held
  – Mo, We, Thu @ 11.30 in CERN/2-R-14
  – Organized by GEOC
• Do we need changes in Run2?
  – Come back to a daily meeting?
  – Meeting time, can we have it earlier?
  – Do we need video in the room?
Trello
• Lightweight tool to
  – Follow up on daily operational tasks
  – All operations people to be members of a “shared board”
  – We can assign people, give deadlines, categorize, track progress
  – Proposal: The GEOC of the week keeps the overview of the board, e.g. during/after “Ops meeting” creates new assignments
    • Everybody assigned is responsible for moving “his task” through the different states
How to go on from here …
• We now have a good starting point for LHCb Distributed Computing Operations
• Create repository of slides and link it from DIRAC web portal
• Keep the slides up to date -> responsibility of everybody
  – Will ask to go through / update slides approx. every ½ year, to be organized by “Computing Operations Coordinator”
  – Do we need to have things spelled out? E.g. written-down documentation, twiki?
Thanks to everybody for contributing to the meeting with slides, ideas, discussions, … !!!!