Quarterly report ScotGrid Quarter 02 2005 Fraser Speirs.
Quarterly report
ScotGrid Quarter 02 2005
Fraser Speirs
2005 Q1 Quarterly report: ScotGrid
Current site status data

Site       Service nodes  Worker nodes   Local network  Site          SRM                          Days SFT  Days in      Security incidents
                                         connectivity   connectivity                               failed    scheduled    this quarter which
                                                                                                             maintenance  impact on Grid
Durham     SL3 LCG 2.4.0  SL3 LCG 2.4.0  100Mb/s        1Gb/s         No                           8         7            0
Edinburgh  SL3 LCG 2.5.0  SL3 LCG 2.5.0  1Gb/s          1Gb/s         dCache (not fully deployed)  34        6            0
Glasgow    SL3 LCG 2.4.0  SL3 LCG 2.4.0  1Gb/s          1Gb/s         No                           27        35           0

1) Local network connectivity is that to the site SE.
2) It is understood that SFT failures do not always result from site problems, but it is the best measure currently available.
All GridPP Resources

           Promised                              Actual
Site       Average kSI2K  CPU      Storage      Average kSI2K  CPU   Storage
           available in   (kSI2K)  (TB)         available in         (TB)
           this quarter                         this quarter
Durham     86             86       5            18             22    2
Edinburgh  7              7        39           2.7            5     19
Glasgow    245            245      12           75             240   1.8
Total      338            338      56           95.7           267   22.8

1) The GridPP-Tier-2 MoUs made reference to integrated CPU over the 3 years of GridPP2. Under “Promised – integrated kSI2K hours until this quarter”, an estimate is provided of what the Tier-2 would have expected to provide to this quarter on the basis of planned installations. “Static kSI2K” shows what would currently be expected if all purchases planned to this quarter had been made and implemented. The actual columns show what has been delivered.
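As a rough illustration of how the promised and actual figures above compare, the sketch below computes the fraction of the promised average kSI2K that each site delivered this quarter. The numbers are copied from the table; the dictionary layout and the function names `delivered_fraction` and `summarise` are assumptions made for this example and are not part of any GridPP tooling.

```python
# Average kSI2K available in this quarter, copied from the table above:
# site -> (promised, actual). The data layout is an assumption for illustration.
AVERAGE_KSI2K = {
    "Durham": (86, 18),
    "Edinburgh": (7, 2.7),
    "Glasgow": (245, 75),
}


def delivered_fraction(promised, actual):
    """Return actual/promised, or None when nothing was promised."""
    if promised == 0:
        return None
    return actual / promised


def summarise(table):
    """Per-site delivered fraction plus a Tier-2 'Total' row."""
    rows = {site: delivered_fraction(p, a) for site, (p, a) in table.items()}
    total_promised = sum(p for p, _ in table.values())
    total_actual = sum(a for _, a in table.values())
    rows["Total"] = delivered_fraction(total_promised, total_actual)
    return rows


if __name__ == "__main__":
    for site, frac in summarise(AVERAGE_KSI2K).items():
        print(f"{site:10s} {frac:.0%}")
```

The same function could be pointed at the CPU or storage columns by swapping in those pairs of figures.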
LCG resources

           Estimated for LCG                Currently delivering to LCG
Site       Total job  CPU      Storage      Total job  CPU      Storage
           slots      (kSI2K)  (TB)         slots      (kSI2K)  (TB)
Durham     87         9        0.5          22         22       2
Edinburgh  5          7        25           5          4        19
Glasgow    249        101      4.8          230        52       1.8
Total      341        115      30           257        31       22.8

1) The estimated figures are those that were projected for LCG planning purposes: http://lcg-computing-fabric.web.cern.ch/LCG-Computing-Fabric/GDB_resource_infos/Summary_Institutes_2004_2005_v11.htm
2) Current total job slots are those reported by the EGEE/LCG gstat page.
VOs supported by site

Site       ALICE  ATLAS  Biomed  CMS  Sixt  dTeam  Zeus  LHCb  Total
Durham     1      1      0       1    1     1      0     1     6
Edinburgh  1      1      1       1    1     1      0     1     7
Glasgow    1      1      1       1    1     1      1     1     8
Total      3      3      2       3    3     3      1     3

0 => not supported, 1 => supported
CPU used per VO over quarter (KSI2K hours)
Site ALICE ATLAS BABAR CMS LHCB ZEUS abc Total
Durham 3 3
Edinburgh 113 2 162 276
Glasgow 3 13 16
Total
1) Information currently available from APEL: http://goc.grid-support.ac.uk/gridsite/accounting/tree/gridpp_view.php - please note these pages are still under development!
NB: This could be automated with an SQL/R-GMA query.
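A minimal sketch of what such an automated query might look like is given below, using an in-memory SQLite database as a stand-in. The real APEL and R-GMA table layouts are not described in this report, so the table name `job_records`, its columns and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per accounted job. The real APEL/R-GMA
# tables are not described in this report, so every name here is an assumption.
SCHEMA = """
CREATE TABLE job_records (
    site TEXT NOT NULL,
    vo TEXT NOT NULL,
    ksi2k_hours REAL NOT NULL
)
"""

# Sum the accounted CPU for each (site, VO) pair.
QUERY = """
SELECT site, vo, SUM(ksi2k_hours) AS total_ksi2k_hours
FROM job_records
GROUP BY site, vo
ORDER BY site, vo
"""


def cpu_per_vo(rows):
    """Load (site, vo, ksi2k_hours) rows into an in-memory DB and aggregate."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.executemany("INSERT INTO job_records VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Placeholder records, not figures from this report.
    sample = [
        ("SiteA", "atlas", 1.5),
        ("SiteA", "atlas", 2.5),
        ("SiteA", "lhcb", 4.0),
        ("SiteB", "cms", 0.5),
    ]
    for site, vo, total in cpu_per_vo(sample):
        print(site, vo, total)
```

The `GROUP BY site, vo` clause is what produces one cell of the per-site, per-VO table; a date filter on a timestamp column would be needed to restrict the result to a single quarter.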
Usage by VO for Tier-2
Jobs Apr 2005 May 2005 Jun 2005
alice
atlas 516 235 192
cms 2 1 20
dteam
lhcb 453 1141 257
abc
CPU(KSI2K hours)
Apr 2005 May 2005 Jun 2005
alice
atlas 25 16 75
cms 3
dteam
lhcb 17 86 71
abc
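The monthly job counts above can be rolled up into quarterly totals and per-VO shares with a few lines of code. The sketch below copies the job figures from the table and leaves out VOs with no entries; the names `JOBS_BY_MONTH`, `quarterly_totals` and `shares` are assumptions made for this example.

```python
# Monthly job counts (Apr, May, Jun 2005) copied from the table above.
# VOs with no figures in the table are omitted.
JOBS_BY_MONTH = {
    "atlas": [516, 235, 192],
    "cms": [2, 1, 20],
    "lhcb": [453, 1141, 257],
}


def quarterly_totals(monthly):
    """Sum each VO's monthly counts into one figure for the quarter."""
    return {vo: sum(counts) for vo, counts in monthly.items()}


def shares(totals):
    """Fraction of all jobs in the quarter that each VO accounts for."""
    grand_total = sum(totals.values())
    return {vo: count / grand_total for vo, count in totals.items()}


if __name__ == "__main__":
    totals = quarterly_totals(JOBS_BY_MONTH)
    for vo, share in shares(totals).items():
        print(f"{vo:6s} {totals[vo]:5d} jobs  {share:.1%}")
```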
Usage by VO (jobs)
Nb: This can be extracted from APEL
Storage resources in use per VO (TB)
Site Storage ALICE ATLAS CMS dTeam
Lhcb Sixt Zeus Total
Durham 0 0.001 0 12 12.001
Edinburgh 0 0.003 0 0.003
Glasgow 0 0.98 0 0 0 0 0.3 1.28
Total 0.984 12 0.3
It is difficult to provide this for the whole period, but we can at least show *current* usage. If we can get the information, the average and maximum per VO over the period would be useful parameters to record.
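If storage usage were sampled periodically, the average and maximum per VO could be derived along the following lines. This is only a sketch: the report does not say how such samples would be collected, so the input format (a sequence of `(vo, used_tb)` readings) and the function name `storage_stats` are assumptions.

```python
from collections import defaultdict
from statistics import mean


def storage_stats(samples):
    """Average and maximum storage in use (TB) per VO.

    `samples` is an iterable of (vo, used_tb) readings taken over the
    reporting period; this input format is hypothetical.
    """
    by_vo = defaultdict(list)
    for vo, used_tb in samples:
        by_vo[vo].append(used_tb)
    return {
        vo: {"average": mean(values), "maximum": max(values)}
        for vo, values in by_vo.items()
    }


if __name__ == "__main__":
    # Placeholder readings, not figures from this report.
    readings = [("atlas", 1.0), ("atlas", 3.0), ("zeus", 0.5)]
    for vo, stats in storage_stats(readings).items():
        print(vo, stats)
```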
CPU Usage by VO (KSI2K hours)
Nb: This can be extracted from APEL – http://goc.grid-support.ac.uk/gridsite/accounting/custom.php
Progress over last quarter

Durham
Successes
• Upgrade to LCG 2.4
• Significant reliability improvement
Problems/Issues
• Documentation
• Late release of LCG 2.6 meant problems

Edinburgh
Successes
• Upgrade to LCG 2.5
• Participation in SC3
• Deployment of dCache SRM
• Trials of DPM
Problems/Issues
• Documentation
• Lots of difficulty with dCache - mostly solved but very time-consuming

Glasgow
Successes
• Deployment of all CPU resources
• Pre-production deployment of DPM SRM
Problems/Issues
• Documentation
• Availability of staff to troubleshoot
• Late release of 2.6 caused problems
Tier-2 risks

General risks
• Lack of documentation for middleware threatens meeting MoU commitments.
• Concern over migration strategy from Classic SE to dCache
• Feel that scheduled LCG release plan has been abandoned
• Possible resistance to Scientific Linux for future shared cluster at Glasgow

Mitigating actions
• Using GOC Wiki entries as substitute
• Writing ‘experience’ documentation
• None known. Tool support and migration strategy is required.
• Inevitably lowers priority of doing upgrades - can’t just drop everything and upgrade.
• Virtualisation?

Institute specific risks
• Some concern over reinvestment at Durham

Mitigating actions
• Attempts to secure further funding are ongoing.
Tier-2 planning for next quarter
• Maintaining presence on grid• Complete DPM SRM deployment at Glasgow, Durham• Reliability/metrics a focus• Focus on team communication and coordination at Glasgow
(see: http://www.scotgrid.ac.uk/wiki)• Better internal monitoring of cluster performance and uptime
(Ganglia/Nagios)
Objectives and deliverables for last quarter

• All sites to 2.4.0
  Due date: April 1st (or release date) + 3 weeks
  Status: Done
• dCache deployed at Glasgow
  Due date: April 1st + 3 weeks
  Status: Not done - choosing DPM
• dCache deployed at Edinburgh
  Due date: April 1st + 3 weeks
  Status: Done, although dCache issues caused delays
• Increase disk space at Glasgow
  Due date: End Q2
  Status: Not done - awaiting SRM deployment
• Refurbishment of server room at Glasgow
  Due date: End Q2
  Status: Done
• dCache deployed at Durham
  Due date: None set
  Status: Not done - waiting on results of Glasgow DPM evaluation
• Continue planning for network upgrades in respect of service challenges
  Due date: None set
  Status: Ongoing. Using experience of SC3 at Edinburgh to guide.
Objectives and deliverables for next quarter

• All sites to 2.6.0
  Due date: Set by ROC
• DPM deployed at Glasgow
  Due date: End Q2
• SRM deployed at Durham
  Due date: End Q2
• Full Ganglia implementation across T2
  Due date: End August
• Continue planning for network upgrades in respect of service challenges
  Due date: None set
  Metric/output: SC4-capable network connectivity
Meetings, papers & effort
Tier-2 coordinator effort Comments
3.0
Area Description
Talks Scotgrid status - GridPP13
Conferences
• GridPP13
• LCG Ops Workshop - Bologna
• EGEE 3 - Athens
Publications
For Tier-2 coordinator:
Summary & outlook
• Good progress this quarter on resource deployment, especially Glasgow CPU and Edinburgh disk.
• Progress on SRM deployment promising, although we still need a story about migration from Classic SE.
• Improvement in team coordination at Glasgow
• Outlook is good for hardware refresh at Glasgow, Edinburgh
• Lack of enthusiasm for Scientific Linux across Glasgow’s local userbase leads us to believe that there is a pressing need to research and solve the problem of LCG co-existence inside a shared cluster. (Portability/Xen?)
• Need to find ways to make SFT results match reality. Currently, they make the situation look worse than it is because of full queues.