
Page 1: Data management demonstrators

Data management demonstrators

Ian Bird

WLCG MB, 18th January 2011

Page 2: Data management demonstrators

Reminder

• March 2010
  – Experiments express concern over data management and access
  – First brainstorming … agree on Jamboree
• June 2010
  – Amsterdam Jamboree
  – ~15 demonstrator projects proposed
  – Process:
    • Follow up at WLCG meeting – ensure projects had effort and interest
    • Follow up in GDBs
    • By end of year – decide which would continue, based on demonstrated usefulness/feasibility
• 2nd half 2010
  – Initial follow-up at WLCG workshop
  – GDB status report
• Jan 2011
  – GDB status report (last week)
  – Close of process started in Amsterdam (today)


Page 3: Data management demonstrators

My Summary of demonstrators

• 12 projects scheduled at GDB
  – 2 had no progress reported (CDN + Cassandra/Fuse)
  – 10 either driven by, or with interest expressed from, the experiments
• Assume these 10 will progress and be regularly reported on in GDB
• Scope for collaboration between several
  – To be encouraged/pushed
• Several using xrootd technology
  – Must ensure we arrange adequate support
• Which (and how) are to be wrapped into WLCG sw distributions?
• Process and initiatives have been very useful
  – MB endorses continued progress on these 10 projects


Page 4: Data management demonstrators

Summary … 1

• ATLAS PD2P
  – In use; linked to the LST demonstrator
  – Implementation is ATLAS-specific, but the ideas can be re-used with other central task queues (see the sketch at the end of this page)

• ARC Caching
  – Used to improve ATLAS use of ARC sites; could also help others use ARC
  – More general use of the cache – needs input from developers (and interest/need from elsewhere)
• Speed-up of SRM getturl (make it synchronous)
  – Essentially done, but important mainly if there are lots of small files (then other measures should be taken too)

• Cat/SE sync + ACL propagation with MSG
  – Prototype exists
  – Interest in testing from ATLAS
  – Ideas for other uses (a consistency-check sketch is shown at the end of this page)

• CHIRP
  – Seems to work well – use case of personal SE (grid home dir)
  – Used by ATLAS, tested by CMS
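The PD2P entry above rests on a simple idea: let the central task queue trigger extra replication of datasets that many waiting jobs are asking for. Below is a minimal sketch of that decision logic; the function name, inputs and thresholds are hypothetical illustrations, not the actual PanDA/PD2P implementation.

```python
# Hypothetical sketch of task-queue-driven data placement (the PD2P idea):
# if many queued jobs wait for a dataset that still has few replicas,
# request one more replica at the least-loaded candidate site.
# All names and thresholds are invented for illustration.

POPULARITY_THRESHOLD = 50   # queued jobs needed before adding a replica
MAX_REPLICAS = 3            # never replicate beyond this many disk copies

def plan_extra_replicas(waiting_jobs_per_dataset, replica_count, site_load):
    """Return a list of (dataset, target_site) replication requests."""
    requests = []
    for dataset, n_waiting in waiting_jobs_per_dataset.items():
        if n_waiting < POPULARITY_THRESHOLD:
            continue                                  # not popular enough yet
        if replica_count.get(dataset, 0) >= MAX_REPLICAS:
            continue                                  # already widely replicated
        target = min(site_load, key=site_load.get)    # least-loaded site wins
        requests.append((dataset, target))
    return requests

if __name__ == "__main__":
    print(plan_extra_replicas(
        waiting_jobs_per_dataset={"data10_7TeV.A": 120, "mc10.B": 8},
        replica_count={"data10_7TeV.A": 1, "mc10.B": 2},
        site_load={"SITE_1": 40, "SITE_2": 5},
    ))  # -> [('data10_7TeV.A', 'SITE_2')]
```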

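For the Cat/SE synchronisation entry, the core of any such tool is comparing the catalogue's view of a site with a dump of the SE namespace; the MSG messaging layer of the actual prototype then propagates the differences. A minimal sketch of just the comparison step, with invented inputs:

```python
# Minimal sketch of a catalogue vs. storage-element consistency check.
# The real demonstrator exchanges updates over the MSG messaging system;
# only the underlying set comparison is shown here.

def compare_catalogue_and_se(catalogue_entries, se_namespace_dump):
    """Both arguments are iterables of file paths known for one site."""
    catalogue = set(catalogue_entries)
    storage = set(se_namespace_dump)
    return {
        "lost_files": sorted(catalogue - storage),  # catalogued but missing on disk
        "dark_data": sorted(storage - catalogue),   # on disk but not catalogued
    }

if __name__ == "__main__":
    print(compare_catalogue_and_se(
        catalogue_entries=["/grid/atlas/file1", "/grid/atlas/file2"],
        se_namespace_dump=["/grid/atlas/file2", "/grid/atlas/file3"],
    ))  # {'lost_files': ['/grid/atlas/file1'], 'dark_data': ['/grid/atlas/file3']}
```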

Page 5: Data management demonstrators

Summary … 2

• Xrootd-related
  – Xrootd (EOS, LST)
    • Well advanced, tested by ATLAS and CMS
    • Strategy for Castor evolution at CERN
  – Xrootd – ATLAS
    • Augments DDM
    • Commonality with CMS
  – Xrootd-global – CMS
    • Global xrootd federation, integrates with local SEs and FSs (see the access sketch at the end of this page)
  – Many commonalities – can we converge on a common set of tools?
• Proxy-caches in ROOT
  – Requires validation before production
  – Continue to study file caching in experiment frameworks
• NFS4.1
  – Lot of progress in implementations and testing
  – Needs some xrootd support (CMSD)
  – Should the MB push for a pNFS kernel in SL?
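To make the xrootd-federation and proxy-cache points more concrete: from PyROOT a file can be opened through a federation redirector with a root:// URL, and ROOT's CACHEREAD option keeps a locally cached copy. The redirector host and file path below are placeholders, and ROOT must be built with xrootd support; this is only an illustration, not one of the demonstrator implementations.

```python
# Sketch: read a file through a global xrootd federation from PyROOT and let
# ROOT keep a locally cached copy ("proxy-cache" style).  The URL is a
# placeholder, not a real endpoint.
import os
import ROOT

cache_dir = "/tmp/xrd-cache"
os.makedirs(cache_dir, exist_ok=True)    # ROOT wants an existing, writable directory
ROOT.TFile.SetCacheFileDir(cache_dir)    # files opened with CACHEREAD land here

url = "root://federation-redirector.example.org//store/user/some/file.root"
f = ROOT.TFile.Open(url, "CACHEREAD")    # first access copies the file locally

if f and not f.IsZombie():
    f.ls()                               # quick sanity check of the contents
    f.Close()
else:
    print("could not open {0} via the federation".format(url))
```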


Page 6: Data management demonstrators

Post – Amsterdam

• Amsterdam was not only about the demonstrators
• It was a recognition that the network is a resource
  – Could use remote access
  – Should not rely on 100% accuracy of catalogues, etc.
  – Can use the network to access remote services (a fallback sketch is shown at the end of this page)
  – Network planning group set up
  – Work with NRENs etc. is ongoing (they got our message)
• Also understood where the data management model should change
  – Separate tape and disk caches (logically at least)
  – Access to disk caches does not need SRM
  – SRM for “tape” can be minimal functionality
  – Disk to be treated as a cache – move away from data placement, for analysis at least
  – Re-think “FTS”
  – Etc.
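One concrete reading of "the network is a resource" and "do not rely on 100% catalogue accuracy" is to try the locally catalogued copy first and fall back to remote access over the network if it is missing. The paths and redirector names below are placeholders (again assuming PyROOT with xrootd support), so this is a sketch of the pattern, not a recipe.

```python
# Sketch: prefer the local copy of a file, but fall back to remote xrootd
# access if the catalogue turned out to be wrong.  All paths/URLs are
# placeholders for illustration.
import ROOT

def open_with_fallback(local_path, remote_urls):
    """Return an open TFile, preferring the local copy but not relying on it."""
    for candidate in [local_path] + list(remote_urls):
        f = ROOT.TFile.Open(candidate)
        if f and not f.IsZombie():
            return f                     # first copy that actually opens wins
    raise IOError("no readable copy found")

try:
    f = open_with_fallback(
        "/data/local/copy/file.root",                               # where the catalogue says it is
        ["root://site-redirector.example.org//atlas/file.root",     # nearby federation entry point
         "root://global-redirector.example.org//atlas/file.root"],  # wide-area fallback
    )
    f.Close()
except IOError as err:
    print("all copies failed: {0}".format(err))
```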


Page 7: Data management demonstrators

Conclusions

• We have changed direction
• There are a number of very active efforts – driven by experiment needs/interests for the future
  – Not just what was in the demonstrators

• Should continue to monitor and support – and look for commonalities: opportunity to reduce duplication and improve support efforts
