GridPP23 – Final Steps to Data David Britton, 8/Sep/09.
GridPP23 – Final Steps to Data
David Britton, 8/Sep/09
2
Since GridPP22 in April…
• Validated the UK infrastructure with STEP09.
• Moved the Tier-1 to R89.
• Procured new hardware.
• Exercised our disaster management process (several times!)
… and before GridPP24 at RHUL we will have data.
3
WLCG Growth
March 2009 September 2009
>315,000 KSI2K
4
UK CPU Contribution
Same picture if non-LHC VOs included
5
UK Site Contributions
Region        2007   2008   2009
NorthGrid      34%    22%    15%
London         28%    25%    32%
ScotGrid       18%    17%    22%
Tier-1         13%    15%    13%
SouthGrid       7%    16%    13%
GridIreland     0%     6%     5%
6
UK Site Contributions: Non LHC VOs
7
CPU Efficiencies (CPU/Wall Time)
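As a reminder of how the metric in this slide title is formed, CPU efficiency is the CPU time a job consumed divided by the wall-clock time it occupied. A minimal Python sketch; the function name and the job figures are hypothetical and are not taken from the talk:

```python
def cpu_efficiency(cpu_seconds: float, wall_seconds: float) -> float:
    """Return CPU time / wall time as a fraction."""
    if wall_seconds <= 0:
        raise ValueError("wall time must be positive")
    return cpu_seconds / wall_seconds


# Hypothetical job: 9 hours of CPU used during 10 hours on a worker node.
example = cpu_efficiency(9 * 3600, 10 * 3600)
print(f"{example:.0%}")
```

A low ratio usually means the job spent much of its slot waiting (for example on data access) rather than computing.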
8
Storage
Gstat gives:
September 2008 March 2009 September 2009
… the last set could actually be sensible!
9
Data Transfers
10
OPN Resilience
11
STEP09
Operations Report at WLCG MB; 16/Jun
The lack of “hero-mode” is a direct consequence of all the (heroic) effort that has been put in over the last year to make the UK Grid more resilient.
12
More STEP09 Highlights
• I won’t preempt (too much) the upcoming talks…
– RAL was the best ATLAS Tier-1 after the BNL ATLAS-only Tier-1
– Glasgow ran more jobs than any of the 50-60 ATLAS Tier-2 sites throughout the world.
– Most Tier-2 sites made good contributions and many gained valuable insight into tuning issues during STEP09 and subsequent testing.
– “The responsiveness of RAL to CMS during STEP09 was in stark-contrast to many other Tier-1s.”
– CMS noted the tape performance at RAL was very good as was the CPU efficiency.
– Many (if not all) of the metrics for the experiments were met and, in some cases, significantly exceeded at RAL during STEP09.
13
(GridPP22) Current Issues: R89
In the end, hand-over was delayed from Dec to Apr 09. Hardware was delayed but we were (almost) rescued by the LHC schedule change. Minor (?) issues remain with R89 (aircon trips; waterproof membrane?)
14
Tier-1 Hardware
• The FY2008 hardware procurement had to await the acceptance of R89.
• The CPU is tested, accepted, and being deployed (14,000 HEPSPEC06)
• The disk procurement (2.2 PB) was split into two halves (different disks and controllers, to mitigate acceptance risk). This has proved sensible, as one batch has demonstrated ejection issues.
• One half of the disk is being deployed; progress is being made on the other half and best guess is deployment by end of November.
• A second SL8500 tape robot is available.
• The FY09 hardware procurement is underway.
15
Disaster Management
• A four-stage disaster management process was established at the Tier-1 earlier this year as part of our focus on resilience and disaster management.
• Designed to be used regularly so that the process is familiar. This means a low threshold to trigger Stage-1 “disasters”.
• At Stage-3, the process formally involves stake-holders outside the Tier-1, including GridPP management. This has now happened several times, including:
– R89 aircon trip
– R89 water leak
– Disk procurement problem
– Swine flu planning.
• The process is still being honed, but I believe it is very useful.
16
Tier-2 Performance
Resource-weighted averages
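A resource-weighted average lets larger sites count for more than smaller ones when a single Tier-2 figure is quoted. The sketch below shows the arithmetic only; the function name, the metric values, and the site capacities are hypothetical and are not taken from the talk:

```python
def resource_weighted_average(values, weights):
    """Average `values`, weighting each by the matching entry in `weights`."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical sites: metric values 0.90 and 0.98, capacities 1000 and 3000.
avg = resource_weighted_average([0.90, 0.98], [1000, 3000])
print(f"{avg:.2f}")
```

With these made-up numbers the larger site dominates, so the weighted figure sits closer to 0.98 than the plain mean of 0.94 would.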
17
Tier-2 Resources
1/Apr/09
18
EGI/NGI
[Diagram labels: EGI (coordinating body in Amsterdam); NGIs (national initiatives in member countries); UK-NGI; GridPP; NGS]
Involves STFC, EPSRC and JISC (at least) in the UK.
EGI is vital to GridPP but it is not GridPP’s core business to run an e-science infrastructure for the whole of the UK: seek a middle ground.
19
Jigsaw Puzzle
[Diagram pieces: SSC, EMI, EGI, Heavy Users, SSC, SSC, (Roscoe), Unicore, ARC, gLite]
UK involvement with Ganga?
UK involvement via the UK NGI with global tasks such as GOCDB, security, dissemination, training....
UK involvement with APEL, GridSite? …
20
Next Steps
• Oversight Committee meeting next week.
– Approval for OPN resilient link
– Confirmation of remaining GridPP3 spending profile
– Some guidance on GridPP4?
• The LHC start-up, round-2 (Roger Bailey’s talk next!)
• Moving towards a UK NGI in the perspective of EGI, SSCs, EMI, etc. (Monologue by John Cleese: “There will be a certain degree of uncertainty, of that we can be quite (long pause) … sure.”)
• Shaking down R89; settling down for a long run.
• Tier-2 hardware allocations.
• GridPP4
• … and data!
21
Summary and the Future
LHC Data
Oh god!
• STEP09 validated the UK infrastructure for LHC data-taking and proved that we are in good shape.
• We are building on this with careful tuning and further improvements to resilience and management processes.
• Great care must be taken not to invalidate the validation (but we cannot sit still either).
Hand of god?
Thoroughly deserved team effort which did not require (much) divine intervention.