The French Data Access Center - Large Synoptic Survey Telescope
1
Dominique Boutigny
The French Data Access Center
LSST@Europe2 – Belgrade, June 20th - 24th
Thanks to Fabio Hernandez for his help in preparing this talk
2
Scientific Committee - LAPP, 8 June 2016 - Dominique Boutigny
MOA signed between CNRS / IN2P3 and LSST in 2015:
• CC-IN2P3 will process 50% of the level-2 data: Satellite Data Release Processing
• A complete copy of the data (raw + catalogs) will be available at CC-IN2P3
Dedicated 20 Gb/s network link already deployed
The IN2P3 Computing Center – CC-IN2P3
4
The IN2P3 Computing Center
CC-IN2P3 is the main French computing center for HEP – Nuclear Physics and Astroparticles
• ~65 computing engineers - ~11 M€ / year (7 M€ equipment & operation – 4 M€ salaries)
• Supporting ~70 science groups
• Tier-1 center for the 4 LHC experiments
– ~10 % of LHC Tier-1 computing needs
• 13,000 cores – 20 PB disk storage – Hierarchical storage system HPSS – 340 PB tape capacity
• 2 computer rooms – The most recent one (2011) : state of the art design
• Redundancy - Highly modular / expandable
Spare room was foreseen for LSST computing when designing the new computer room in 2011…
Corresponding spare room in technical facilities (electricity & cooling)
==> Can increase power up to 6-7 MW without major problems
7
CC-IN2P3 – NCSA collaboration
• Joint Coordination Committee is in place and running (IN2P3 – NCSA – LSST-DM)
– LSST infrastructure @CC-IN2P3 should be fully integrated into the LSST DRP
• Not necessarily the same hardware
• But full compatibility is required
• Explore standard transport protocols for bulk file transfer over high-latency network links
– Goal: understand what can be used to transfer data between CC-IN2P3 and NCSA in the 2020-2030 era
– Test HTTP/2 as an underlying transport protocol
• Test Object Store technologies to determine if they are suitable for LSST use cases
– OpenStack Swift and Ceph
– Considering a partnership with a vendor
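To give a feel for why high-latency links matter for bulk transfer, here is a minimal sketch computing the ideal transfer time of a dataset over the 20 Gb/s link mentioned earlier, together with the bandwidth-delay product (the amount of data that must be in flight to keep the link full). The 100 TB dataset size and the 120 ms round-trip time are illustrative assumptions, not measured values for the CC-IN2P3 - NCSA path.

```python
def transfer_time_hours(size_tb: float, link_gbps: float) -> float:
    """Ideal time (hours) to move size_tb decimal terabytes over a link_gbps link,
    ignoring protocol overhead, retransmissions and storage bottlenecks."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600.0


def bandwidth_delay_product_mb(link_gbps: float, rtt_ms: float) -> float:
    """Data in flight (decimal megabytes) needed to keep the link fully used."""
    bits_in_flight = link_gbps * 1e9 * (rtt_ms / 1000.0)
    return bits_in_flight / 8 / 1e6


LINK_GBPS = 20.0    # link capacity quoted in the talk
DATASET_TB = 100.0  # assumption: illustrative dataset size
RTT_MS = 120.0      # assumption: illustrative transatlantic round-trip time

hours = transfer_time_hours(DATASET_TB, LINK_GBPS)
bdp_mb = bandwidth_delay_product_mb(LINK_GBPS, RTT_MS)
print(f"Ideal transfer time: {hours:.1f} h")
print(f"Bandwidth-delay product: {bdp_mb:.0f} MB")
```

Under these assumptions a single stream needs roughly 300 MB in flight to fill the link, which is one reason the choice of transport protocol is worth testing.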
8
Qserv test platform
Thanks to a partnership with Dell we have deployed a Qserv test bench
The only test bench currently available in LSST for large scale tests
● 50 nodes - 400 cores
● 800 GB memory
● 500 TB disk storage
Will add non-volatile memory devices to front-end machines
Current tests: 35 TB
Next step: 120 TB (Pan-STARRS)
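The aggregate figures above can be broken down per node with a few lines of arithmetic. This is only a sketch: it assumes the resources are spread evenly across the 50 nodes, which the slide does not state, and it compares dataset sizes with raw disk capacity without accounting for replication or index overhead.

```python
# Aggregate figures quoted on the slide.
NODES = 50
CORES = 400
MEMORY_GB = 800
DISK_TB = 500

# Assumption: resources are distributed evenly across nodes.
cores_per_node = CORES / NODES
memory_per_node_gb = MEMORY_GB / NODES
disk_per_node_tb = DISK_TB / NODES
memory_per_core_gb = MEMORY_GB / CORES


def disk_fraction(dataset_tb: float) -> float:
    """Fraction of the raw disk capacity taken by a dataset of dataset_tb terabytes."""
    return dataset_tb / DISK_TB


print(f"{cores_per_node:.0f} cores, {memory_per_node_gb:.0f} GB RAM, "
      f"{disk_per_node_tb:.0f} TB disk per node")
print(f"{memory_per_core_gb:.0f} GB RAM per core")
print(f"Current tests (35 TB): {disk_fraction(35):.0%} of raw disk")
print(f"Next step (120 TB): {disk_fraction(120):.0%} of raw disk")
```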
9
Other ongoing R&D topics
• LSST binary software distribution in the cloud – CernVM-FS
– https://github.com/airnandez/lsst-cvmfs
• Explore how to run LSST standard tasks using Docker containers and how to build workflows
– Goal: understand how to build portable containers runnable everywhere
• Study I/O patterns induced by the LSST software framework
– https://github.com/airnandez/cluefs
• Start some exploratory work with Apache Spark
– Scan 9M FITS images and extract the HDUs
– Store metadata in a database (MySQL, MongoDB)
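The per-file step of the Spark exploration (read a FITS header, store its metadata) can be sketched without Spark itself. The example below is not the code used at CC-IN2P3: it is a minimal pure-Python illustration that parses only the primary header of a FITS byte string (80-character cards, padded to 2880-byte blocks) and stores the keywords in an in-memory SQLite database, standing in for MySQL or MongoDB. The helper names (`make_header`, `read_primary_header`, `store_metadata`) and the `fits_meta` table are hypothetical, and the parser is deliberately naive (it ignores extension HDUs and would mis-split string values containing a slash).

```python
import sqlite3

CARD = 80     # a FITS header card is 80 ASCII characters
BLOCK = 2880  # FITS headers are padded to multiples of 2880 bytes


def make_card(keyword: str, value: str) -> bytes:
    """Build one fixed-format header card (hypothetical helper for the demo)."""
    return f"{keyword:<8}= {value:>20}".ljust(CARD).encode("ascii")


def make_header(cards) -> bytes:
    """Build a synthetic primary header, terminated by END and block-padded."""
    raw = b"".join(make_card(k, v) for k, v in cards) + b"END".ljust(CARD)
    return raw + b" " * ((-len(raw)) % BLOCK)


def read_primary_header(data: bytes) -> dict:
    """Parse the primary header cards into a dict of raw string values."""
    header = {}
    for offset in range(0, len(data), CARD):
        card = data[offset:offset + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":
            continue  # COMMENT, HISTORY and blank cards carry no value
        value = card[10:].split("/", 1)[0].strip()  # naive comment removal
        header[keyword] = value.strip("'").strip()
    return header


def store_metadata(conn: sqlite3.Connection, filename: str, header: dict) -> None:
    """Store one row per (file, keyword); SQLite stands in for MySQL/MongoDB."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fits_meta (filename TEXT, keyword TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO fits_meta VALUES (?, ?, ?)",
        [(filename, k, v) for k, v in header.items()],
    )
    conn.commit()


sample = make_header([("SIMPLE", "T"), ("BITPIX", "16"), ("NAXIS", "0")])
header = read_primary_header(sample)
conn = sqlite3.connect(":memory:")
store_metadata(conn, "example.fits", header)
rows = conn.execute(
    "SELECT keyword, value FROM fits_meta ORDER BY keyword"
).fetchall()
print(rows)
```

In a distributed setting, a function like `read_primary_header` would be the part applied independently to each file, with the database insert done per batch of results.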
10
Image processing @CC-IN2P3
• Strong participation in 2013 in the SDSS Stripe 82 reprocessing together with NCSA
– 1 M HS06.hours – 100 TB disk storage – 4.4 million input files
– A total of 19 million files at the end of the DC
– Validated the principle of the Satellite DRP
• DESC Reprocessing Task Force
– CFHT / Megacam
• Galaxy clusters
• CFHTLS Deep fields / SN Ia image differencing
– Now running (almost) routinely
– Plan to ingest catalogs in the Qserv instance running @CC-IN2P3
11
MACSJ2243.3-0935 (i – r – g )
12
Toward a Data Access Center
• Need to think about the next steps
– We will have the data and we need to provide access to them
– We want to set up an infrastructure adequate for science / analysis?
• Such an infrastructure will have a cost
– Investment
– Running cost / year
• Need to define the exact scope of the French DAC
– Combination of the LSST DAC as defined in LDM-230 and an Analysis Facility(ies)
– Some components and use cases of the analysis facility have been defined in the DESC Computing Infrastructure working groups
13
Scope of the DAC
2 options:
• DAC serving the French community
– Minimum option to guarantee the scientific return
– Analysis Facility will be focused on the relevant scientific topics
• IN2P3 --> DESC (for sure!)
• ...
• Extend to serve the broader European LSST community
– Requires coordination at the European level
• H2020 European funding requires at least 3 participating countries
14
DAC Components
● Data repository
➢ Store and serve level-2 / level-3 data + user data
● Catalog of Astronomical Objects a.k.a. Qserv
● Data distribution system
➢ To download datasets to local facilities / laptop
● User Interface and visualization
➢ Will mainly rely on the services developed at IPAC
● HTC farm
● Parallel / large memory farm
➢ Large cosmological fits (for DESC)
=> Also need to specify the communication channels to be set up between all the components (this is the tricky part...)
“Standard” LSST DAC
Analysis Facility
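One way to see why specifying the communication channels is the tricky part: with the six components listed on this slide, there are already 15 possible pairwise links to consider. The sketch below only enumerates candidate pairs; which of them actually need a channel is not stated in the talk, and the short component names are shorthand chosen for this example.

```python
from itertools import combinations

# Components as listed on the slide (names shortened for this example).
COMPONENTS = [
    "Data repository",
    "Qserv catalog",
    "Data distribution system",
    "User interface and visualization",
    "HTC farm",
    "Parallel / large memory farm",
]

# Every unordered pair is a candidate channel to specify (or rule out).
candidate_channels = list(combinations(COMPONENTS, 2))

for a, b in candidate_channels:
    print(f"{a} <-> {b}")
print(f"{len(candidate_channels)} candidate channels")
```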
15
• Everything is open
• We want to be as flexible as possible
• The only constraint is that we want to stick as much as possible to the LSST DM choices
• We need input in order to specify and size the facility
• The design of the final platform(s) will depend a lot on the communities that it will serve
16
Next steps ?
• Having a full copy of the LSST data in Europe (Level-2 + pixels) is a real opportunity to set up something ambitious
• We are of course fully open to collaborating with our European LSST colleagues
– Note: a European DAC can be multisite
• With a coherent project we can certainly apply for and get European funding
– EINFRA-12-2017 « Data and Distributed Computing e-infrastructures for Open Science »
– 8–10 M€ - Single step call – closes on March 29th, 2017
Note: LSST is already part of the ASTERICS EU project
17
Workshop announcement
Getting ready for doing science with LSST data
• Sponsored by the LSSTC Enabling Science Committee
• Organized together with the LSST DM
• Financial support for young scientists will be provided
To be held in Lyon (France) in May or June 2017