The French Data Access Center - Large Synoptic Survey Telescope

Dominique Boutigny
LSST@Europe2 – Belgrade, June 20th - 24th
Thanks to Fabio Hernandez for helping in preparing this talk



Scientific Committee - LAPP, 8 June 2016 - Dominique Boutigny

MOA signed between CNRS / IN2P3 and LSST in 2015:
• CC-IN2P3 will process 50% of the level-2 data: Satellite Data Release Processing
• A complete copy of the data (raw + catalogs) will be available at CC-IN2P3

Dedicated 20 Gb/s network link already deployed
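To give a feeling for what a 20 Gb/s figure means for bulk data movement, here is a small back-of-the-envelope sketch. The 1 PB volume and the `efficiency` factor are illustrative assumptions, not numbers from the talk, and the function name is hypothetical.

```python
def transfer_time_seconds(volume_bytes, link_gbps=20.0, efficiency=1.0):
    """Time needed to move volume_bytes over a link of link_gbps Gb/s.

    efficiency is the assumed fraction of the nominal bandwidth that is
    actually achieved (1.0 = ideal, an assumption for illustration).
    """
    usable_bps = link_gbps * 1e9 * efficiency
    return volume_bytes * 8 / usable_bps


ONE_PB = 1e15  # hypothetical volume (decimal petabyte), not from the slides

ideal_days = transfer_time_seconds(ONE_PB) / 86400
half_days = transfer_time_seconds(ONE_PB, efficiency=0.5) / 86400
print(f"1 PB at 20 Gb/s, ideal link:    {ideal_days:.1f} days")
print(f"1 PB at 20 Gb/s, 50% efficient: {half_days:.1f} days")
```

Under these assumptions, one petabyte takes several days even on an ideal link, so sustained efficiency matters as much as nominal bandwidth.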


The IN2P3 Computing Center – CC-IN2P3


The IN2P3 Computing Center

CC-IN2P3 is the main French computing center for HEP – Nuclear Physics and Astroparticles

• ~65 computing engineers - ~11 M€ / year (7 M€ equipment & operation – 4 M€ salaries)

• Supporting ~70 science groups

• Tier-1 center for the 4 LHC experiments
  – ~10% of LHC Tier-1 computing needs

• 13,000 cores – 20 PB disk storage – Hierarchical storage system HPSS – 340 PB tape capacity

• 2 computer rooms
  – The most recent one (2011): state-of-the-art design

• Redundancy - Highly modular / expandable


Spare room was foreseen for LSST computing when designing the new computer room in 2011…

Corresponding spare room in technical facilities (electricity & cooling)
==> Can increase power up to 6-7 MW without major problems


CC-IN2P3 – NCSA collaboration

• Joint Coordination Committee is in place and running (IN2P3 – NCSA – LSST-DM)
  – LSST infrastructure @CC-IN2P3 should be fully integrated into the LSST DRP
    • Not necessarily the same hardware
    • But full compatibility is required
• Explore standard transport protocols for bulk file transfer over high-latency network links
  – Goal: understand what will be possible to use for transferring data between CC-IN2P3 and NCSA for the 2020-2030 era
  – Test HTTP/2 as an underlying transport protocol
• Test Object Store technologies to determine if they are suitable for LSST use cases
  – OpenStack Swift and Ceph
  – Considering a partnership with a vendor
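The bulk-transfer item above hinges on the bandwidth-delay product: on a high-latency link, a single stream can only fill the pipe if enough data is in flight. The sketch below illustrates this. The 120 ms round-trip time is a purely hypothetical value, and applying the 20 Gb/s figure to the CC-IN2P3 – NCSA path is also an assumption; the function names are illustrative.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bps * rtt_s / 8


def window_limited_throughput_bps(window_bytes, rtt_s):
    """Upper bound on one stream's throughput for a given window size."""
    return window_bytes * 8 / rtt_s


LINK_BPS = 20e9   # assumed link capacity (20 Gb/s)
RTT_S = 0.120     # hypothetical transatlantic round-trip time

needed = bdp_bytes(LINK_BPS, RTT_S)
small = window_limited_throughput_bps(64 * 1024, RTT_S)
print(f"In-flight data needed to fill the link: {needed / 1e6:.0f} MB")
print(f"Single stream with a 64 KiB window:     {small / 1e6:.1f} Mb/s")
```

With these assumed numbers, a stream limited to a small window uses only a tiny fraction of the link, which is why the choice of transport protocol, window tuning, and number of parallel streams is worth testing.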


Qserv test platform

Thanks to a partnership with Dell we have deployed a Qserv test bench

The only test bench currently available in LSST for large scale tests

● 50 nodes - 400 cores
● 800 GB memory
● 500 TB disk storage

Will add non-volatile memory devices to front-end machines

Current tests: 35 TB
Next step: 120 TB (Pan-STARRS)
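The test bench figures above can be turned into per-node averages and disk occupancy with a few lines of arithmetic. This sketch assumes identical nodes and ignores replication, indexes and scratch space, none of which are specified in the slides.

```python
NODES = 50
CORES = 400
MEMORY_GB = 800
DISK_TB = 500

# Average resources per node, assuming a homogeneous cluster (an assumption).
per_node = {
    "cores": CORES / NODES,
    "memory_gb": MEMORY_GB / NODES,
    "disk_tb": DISK_TB / NODES,
}


def disk_fraction(dataset_tb, disk_tb=DISK_TB):
    """Fraction of raw disk taken by a dataset (no replication or indexes)."""
    return dataset_tb / disk_tb


print(per_node)
print(f"Current tests (35 TB): {disk_fraction(35):.0%} of raw disk")
print(f"Next step (120 TB):    {disk_fraction(120):.0%} of raw disk")
```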


Other ongoing R&D topics

• LSST binary software distribution in the cloud – CernVM-FS
  – https://github.com/airnandez/lsst-cvmfs
• Explore how to run LSST standard tasks using Docker containers and how to build workflows
  – Goal: understand how to build portable containers runnable everywhere
• Study I/O patterns induced by LSST software framework
  – https://github.com/airnandez/cluefs
• Start some exploratory work with Apache Spark
  – Scan 9M FITS images and extract HDUs
  – Store metadata in a database (MySQL, MongoDB)
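The slides do not show the Spark job itself, so the following is only a sketch of the per-file step described in the last item: read the primary header of a FITS file and store its keywords in a database. Several things here are assumptions made to keep the example self-contained: the standard-library `sqlite3` module stands in for MySQL / MongoDB, the header is a toy one built in memory, only the primary HDU header is parsed, and all function and table names are hypothetical.

```python
import sqlite3

CARD = 80      # a FITS header card is 80 ASCII characters
BLOCK = 2880   # headers are padded to a multiple of 2880 bytes


def parse_primary_header(raw: bytes) -> dict:
    """Return keyword -> value (as text) for the primary HDU header.

    Simplified: comments are cut at the first " /", which would break
    string values that themselves contain " /".
    """
    header = {}
    for start in range(0, len(raw), CARD):
        card = raw[start:start + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] == "= ":
            value = card[10:].split(" /")[0].strip()
            header[keyword] = value.strip("'").strip()
    return header


def make_header(cards: dict) -> bytes:
    """Build a toy primary header (demo data only, not a full FITS writer)."""
    lines = [f"{key:<8}= {value:>20}" for key, value in cards.items()] + ["END"]
    text = "".join(line.ljust(CARD) for line in lines)
    padded_len = -(-len(text) // BLOCK) * BLOCK  # round up to a full block
    return text.ljust(padded_len).encode("ascii")


def store_metadata(conn, path, header):
    """Insert one row per header keyword into a (hypothetical) table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fits_meta (path TEXT, keyword TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO fits_meta VALUES (?, ?, ?)",
        [(path, key, value) for key, value in header.items()],
    )
    conn.commit()


raw = make_header({"SIMPLE": "T", "BITPIX": "16", "NAXIS": "2",
                   "NAXIS1": "2048", "NAXIS2": "4096"})
header = parse_primary_header(raw)
conn = sqlite3.connect(":memory:")
store_metadata(conn, "example.fits", header)
print(conn.execute("SELECT keyword, value FROM fits_meta").fetchall())
```

In a Spark setting, a function like `parse_primary_header` would typically be mapped over the collection of file paths so that the 9M files are scanned in parallel; that part is left out here.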


Image processing @CC-IN2P3

• Strong participation in 2013 in the SDSS Stripe 82 reprocessing together with NCSA
  – 1 M HS06.hours – 100 TB disk storage – 4.4 million input files
  – A total of 19 million files at the end of the DC
  – Validated the principle of the Satellite DRP
• DESC Reprocessing Task Force
  – CFHT / Megacam
    • Galaxy clusters
    • CFHTLS Deep fields / SN1a Image differencing
  – Now running (almost) routinely
  – Plan to ingest catalogs in the Qserv instance running @CC-IN2P3


MACSJ2243.3-0935 (i – r – g)


Toward a Data Access Center

• Need to think about the next steps
  – We will have the data and we need to provide access to them
  – We want to set up an infrastructure adequate for science / analysis?
• Such an infrastructure will have a cost
  – Investment
  – Running cost / year
• Need to define the exact scope of the French DAC
  – Combination between the LSST DAC as defined in LDM-230 and an Analysis Facility(ies)
  – Some components and use cases of the analysis facility have been defined in the DESC Computing Infrastructure working groups


Scope of the DAC

2 options:
• DAC serving the French community
  – Minimum option to guarantee the scientific return
  – Analysis Facility will be focused on the relevant scientific topics
    • IN2P3 --> DESC (for sure!)
    • ...
• Extend to serve the broader European LSST community
  – Requires coordination at the European level
    • H2020 European funding requires at least 3 participating countries


DAC Components

● Data repository
  ➢ Store and serve level-2 / level-3 data + user data
● Catalog of Astronomical Objects a.k.a. Qserv
● Data distribution system
  ➢ To download datasets to local facilities / laptop
● User Interface and visualization
  ➢ Will mainly rely on the services developed at IPAC
● HTC farm
● Parallel / large memory farm
  ➢ Large cosmological fits (for DESC)

=> Also need to specify the communication channels to be set up between all the components (this is the tricky part...)

“Standard” LSST DAC

Analysis Facility
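To illustrate why the communication channels are "the tricky part", the sketch below lists the components from this slide and enumerates the channels that could, in principle, exist between them. The split of components between the “Standard” LSST DAC and the Analysis Facility is an assumption read off the slide layout, and treating every pair as a potential channel is a worst-case simplification.

```python
from itertools import combinations

# Grouping inferred from the slide layout (an assumption, not stated in text).
STANDARD_DAC = [
    "Data repository",
    "Qserv catalog",
    "Data distribution system",
    "User interface and visualization",
]
ANALYSIS_FACILITY = [
    "HTC farm",
    "Parallel / large memory farm",
]

components = STANDARD_DAC + ANALYSIS_FACILITY

# Every unordered pair of components is a potential channel to specify.
channels = list(combinations(components, 2))

# Channels that cross the boundary between the two groups.
cross_channels = [
    (a, b) for a, b in channels
    if (a in STANDARD_DAC) != (b in STANDARD_DAC)
]

print(f"{len(components)} components, {len(channels)} potential channels")
print(f"{len(cross_channels)} of them cross the DAC / Analysis Facility boundary")
```

Even with only six components the number of pairings grows quickly, so deciding which channels are actually needed is a design task in its own right.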


• Everything is open
• We want to be as flexible as possible
• The only constraint is that we want to stick as much as possible to the LSST DM choices
• We need input in order to specify and size the facility
• The design of the final platform(s) will depend a lot on the communities that it will serve


Next steps ?

• Having a full copy of the LSST data in Europe (Level-2 + pixels) is a real opportunity to set up something ambitious
• We are of course fully open to collaborate with our European LSST colleagues
  – Note: a European DAC can be multisite
• With a coherent project we can certainly apply for and get European funding
  – EINFRA-12-2017 "Data and Distributed Computing e-infrastructures for Open Science"
  – 8–10 M€ - Single step call – closes on March 29th, 2017

Note: LSST is already part of the ASTERICS EU project


Workshop announcement

Getting ready for doing science with LSST data

• Sponsored by the LSSTC Enabling Science Committee
• Organized together with the LSST DM
• Financial support for young scientists will be provided

To be held in Lyon (France) in May or June 2017