Gretchen Greene & Perry Greenfield


Page 1: Gretchen Greene & Perry Greenfield

Data Management Subsystem: Data Processing, Calibration and Archive Systems for JWST

with implications for HST

Gretchen Greene & Perry Greenfield

Page 2: Gretchen Greene & Perry Greenfield

Outline

• Overview of general JWST approach to processing and archiving data

• Overview of Archive interface approach

• How this will affect HST

Page 3: Gretchen Greene & Perry Greenfield

JWST Pipelines

• New approach, new infrastructure.
• OWL (Open Workflow Layer) replaces OPUS as processing framework
  – Built upon Condor
  – Enables much better management of pipelines and facilitates reprocessing, but is not generally visible to end users
• BAR (Background Automated Reprocessing)
  – Replaces the On-The-Fly Reprocessing (OTFR) system used for HST
  – Significant effect on how users see and get data.
• Calibration Reference Data System (CRDS) replaces CDBS

Page 4: Gretchen Greene & Perry Greenfield

JWST Archive

• Archive databases centered on Common Archive Object Model (CAOM) developed by CADC.
  – Allows MAST to present more consistent views of data between different telescopes, instruments, and missions
  – Centered around consistent terminology for fields and shared user interface
• Will use new Single Sign-On systems
  – Users will need only one account and password for all STScI systems that require one.

Page 5: Gretchen Greene & Perry Greenfield

Background Automated Reprocessing

• New model being adopted
• Outline of general approach:
  – Repeated batch processing
  – Keep reprocessed data online for quick retrieval
  – If a request comes in for data slated for reprocessing, the data are moved up in the queue to make the reprocessed version available more quickly (available in 2015)
    • User has the option of immediately retrieving the stale version, or waiting for the reprocessed version
  – Offers the benefits of both quick retrieval and the existing OTFR system
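The queue behaviour outlined on this slide can be sketched in a few lines of Python. This is only an illustration, not the BAR implementation: the class name `ReprocessingQueue`, its methods, and the two priority levels are all hypothetical, and the sketch assumes that a user request promotes the dataset whether or not the user chooses to wait.

```python
import heapq
import itertools


class ReprocessingQueue:
    """Toy model of a batch reprocessing queue (hypothetical, not BAR itself).

    Lower priority numbers are processed first.
    """

    ROUTINE = 1    # ordinary background reprocessing
    REQUESTED = 0  # a user has asked for this dataset

    def __init__(self):
        self._heap = []
        self._entries = {}  # dataset id -> live heap entry
        self._counter = itertools.count()  # keeps first-in, first-out order

    def schedule(self, dataset_id, priority=ROUTINE):
        """Add a dataset, or re-add it with a new priority."""
        if dataset_id in self._entries:
            self._entries[dataset_id][2] = None  # invalidate the old entry
        entry = [priority, next(self._counter), dataset_id]
        self._entries[dataset_id] = entry
        heapq.heappush(self._heap, entry)

    def request(self, dataset_id, wait_for_fresh):
        """Handle a user request; return 'current', 'stale' or 'queued'."""
        if dataset_id not in self._entries:
            return "current"  # nothing pending: the online copy is up to date
        self.schedule(dataset_id, self.REQUESTED)  # move up in the queue
        return "queued" if wait_for_fresh else "stale"

    def pop(self):
        """Return the next dataset to reprocess, or None if the queue is empty."""
        while self._heap:
            _, _, dataset_id = heapq.heappop(self._heap)
            if dataset_id is not None:  # skip invalidated entries
                del self._entries[dataset_id]
                return dataset_id
        return None
```

A request for a dataset that is not awaiting reprocessing returns `"current"`, which corresponds to the quick-retrieval case; a request for a pending dataset jumps it ahead of routine work.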

Page 6: Gretchen Greene & Perry Greenfield

BAR (cont.)

• Criteria for reprocessing:
  – Reprocess only data that are affected by changes in:
    • Input data (e.g., changed keyword values in raw data)
    • Reference files used (as recommended by CRDS)
    • Calibration software
• Reprocessing timing is quasi-automatic
  – All reference file updates trigger reprocessing of affected files
  – But reprocessing may be held off if more reference file updates are expected soon (within a week or two),
    • aside from daily darks, biases, etc., where reprocessing only affects a small number of datasets.
  – Calibration software changes trigger broader reprocessing for a given instrument
    • May only reprocess a subset of data if the affected subset is clear (e.g., only affects a specific mode).
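As a rough illustration of the three selection criteria above, the following Python sketch decides whether a single dataset is affected by a change. The `Dataset` record, the `needs_reprocessing` function, and the way a software change is described are assumptions made for this example; the hold-off timing rules are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical minimal record of one archived dataset."""
    name: str
    instrument: str
    mode: str
    reference_files: frozenset


def needs_reprocessing(ds, changed_inputs=(), updated_references=(),
                       software_change=None):
    """Return the list of reasons this dataset should be reprocessed.

    software_change is None or an (instrument, modes) pair, where modes is
    None (whole instrument affected) or a set of affected mode names.
    """
    reasons = []
    if ds.name in changed_inputs:
        reasons.append("input data")
    if ds.reference_files & frozenset(updated_references):
        reasons.append("reference files")
    if software_change is not None:
        instrument, modes = software_change
        if ds.instrument == instrument and (modes is None or ds.mode in modes):
            reasons.append("calibration software")
    return reasons
```

An empty list means the dataset is untouched by the change and can be left alone, which is the point of reprocessing only affected data.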

Page 7: Gretchen Greene & Perry Greenfield

Calibration Reference Data System

• New architecture for recommending reference files
• Encapsulates whole history of rules in different configurations
  – I.e., can recreate previous recommendations even after updates to system
• Benefits:
  – Different pipelines can run with different rules configurations simultaneously; makes testing much easier
  – Reference file machinery can be run on off-site computers (e.g., at observer’s home institution) as part of calibration pipeline itself
    • Checks to see if it has all needed rules files for making recommendation; if not, it downloads them.
    • Checks to see if it has all needed reference files after recommendations are obtained; if not, automatically downloads reference files to user’s computer
    • Should make running and rerunning calibration pipelines at home institution much easier.
    • Users can customize rules to use their own reference files
    • Users can control the configuration of the rules (e.g., to ensure calibration of all data is done consistently)
    • Scripts will be available so that users can update local datafile headers with reference file recommendations for off-net reprocessing.
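The two "check, then download if missing" steps described above can be sketched as follows. This is not the CRDS API: `ensure_cached`, `prepare_calibration`, and the `fetch` and `recommend` callables are hypothetical stand-ins, supplied by the caller so the sketch needs no network access.

```python
from pathlib import Path


def ensure_cached(names, cache_dir, fetch):
    """Download any of `names` missing from cache_dir; return the names fetched.

    `fetch(name)` is a caller-supplied function returning the file contents
    as bytes (an assumption of this sketch).
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in names:
        target = cache_dir / name
        if not target.exists():
            target.write_bytes(fetch(name))
            fetched.append(name)
    return fetched


def prepare_calibration(header, rules_files, cache_dir, fetch, recommend):
    """Make sure rules and recommended reference files are available locally.

    `recommend(header)` is a caller-supplied function mapping a data header
    to {reference type: file name}; it stands in for the rules machinery.
    """
    # Step 1: the rules files are needed before a recommendation can be made.
    ensure_cached(rules_files, Path(cache_dir) / "rules", fetch)
    # Step 2: obtain the recommendation, then fetch any missing reference files.
    recommended = recommend(header)
    ensure_cached(recommended.values(), Path(cache_dir) / "references", fetch)
    return recommended
```

Running the function a second time with the same cache directory downloads nothing, which is what makes rerunning a pipeline at a home institution cheap.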

Page 8: Gretchen Greene & Perry Greenfield

What does this mean for HST?

• NASA has funded upgrading the HST Data Management System to use OWL, BAR and CRDS.

• CRDS is already being used by HST
  – Though for HST, no obvious impact to users yet.
  – With future scripts, it will be much easier to update reference file recommendations off-site, and automatically download needed reference files.
• OWL and BAR to start being used by fall 2014.
  – Some functionality will be added in 2015.
  – Online cache phased in, instrument by instrument.
  – HST users will get all the benefits of automatic reprocessing that JWST will!

Page 9: Gretchen Greene & Perry Greenfield

MAST

• Discovery portal is to be used for both missions.
  – First deployed Nov 2013 and updated July 2014
  – Search by coordinates, singly or from an uploaded list of targets, along with selection filters and sorting capability.
  – Multiple download options (wget, curl, sftp)
• Migrating to use the on-line cache of calibrated data that BAR will produce.
  – Initially, proprietary data continue to be accessed through the DADS interface
• Developing search capability on any field that is part of the CAOM model of HST data
  – (initially searches all of MAST; later will have HST/JWST restricted searches)
• MAST integrating Single Sign-On capabilities over the next 6 months
  – Allows authorized users to grant specific users access to proprietary data and to customize portal features (to come later)
• Hubble Source Catalog planned for release early 2015 along with portal interface and plotting updates to facilitate use of the HSC.
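To make the coordinate search concrete, here is a small Python sketch of a cone search over a list of observations, for a single position or a list of targets, with an optional filter and results sorted by distance. It does not use the MAST portal or any MAST interface; the function names and the dictionary keys (`ra`, `dec`, `instrument`) are assumptions for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(observations, ra, dec, radius_deg, instrument=None):
    """Return observations within radius_deg of (ra, dec), nearest first.

    Each observation is a dict with 'ra', 'dec' and 'instrument' keys
    (a hypothetical layout chosen for this sketch).
    """
    hits = []
    for obs in observations:
        if instrument is not None and obs["instrument"] != instrument:
            continue  # selection filter
        sep = angular_separation_deg(ra, dec, obs["ra"], obs["dec"])
        if sep <= radius_deg:
            hits.append(dict(obs, separation=sep))
    return sorted(hits, key=lambda hit: hit["separation"])


def search_target_list(observations, targets, radius_deg, **filters):
    """Run cone_search for every entry of {target name: (ra, dec)}."""
    return {name: cone_search(observations, ra, dec, radius_deg, **filters)
            for name, (ra, dec) in targets.items()}
```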

Page 10: Gretchen Greene & Perry Greenfield

MAST Data Discovery Tool entry page

S&OC System Design Review #1

• Select collection
• Select target
• Upload target list

Page 11: Gretchen Greene & Perry Greenfield

MAST Portal – Multi-mission Data Discovery Interface

• Initial components: collection selector, search box, filter options, search results, sky viewer

Page 12: Gretchen Greene & Perry Greenfield

Using MAST AstroView Sky Viewer

• Sky visualization
• Selectable overlay graphics
  – Observed footprints
  – Source catalogs
• Multiple background image surveys (Digitized Sky Survey, GALEX, Sloan)
• Full sky zoom and pan