Data Ingest Automation GHRC Status and Plans Helen Conover GHRC DAAC Operations Manager hconover@itsc.uah.edu Presented at ESIP Summer Meeting 2015


Page 1: Data Ingest Automation GHRC Status and Plans Helen Conover GHRC DAAC Operations Manager hconover@itsc.uah.edu Presented at ESIP Summer Meeting 2015.

Data Ingest Automation
GHRC Status and Plans

Helen Conover
GHRC DAAC Operations Manager
hconover@itsc.uah.edu

Presented at ESIP Summer Meeting 2015

Page 2

Data Ingest and Publication Issues – Field Campaigns

• Many small and varied datasets delivered within a short time period

• Data formats and metadata vocabularies often specific to user community

• PIs may not have funding to provide full documentation, ATBD, etc.

• Real-time acquisition and distribution of ancillary datasets may be required to support the field phase of a campaign

15 June 2015, ESIP Summer Meeting 2015

Page 3

Field Campaign Data Categories and Levels of Service


Publication Priority | Data Category | Skinny Catalog (metrics) | Full Catalog (search, landing page) | DOI and Citation | README | Guide

1 | NASA research instruments (airborne or ground, with NASA-sponsored PI) | b b b b

2 | Affiliated researcher instruments (e.g., PI from partner university) | b b b b

3 | Other agency research instruments (e.g., PI sponsored by NOAA, DOE) | b ? b

4 | Ancillary research data (e.g., PERSIANN model, TRMM flood maps) | b b ? b

5 | Other agency operational data (e.g., GOES imagery, NWS radar, USGS stream gauges) | b b ?

Page 4

Current Data Ingest and Publication Information Flow


Science PIs

• Proposed dataset title

• Brief description

• Basic metadata

• Citation author list

• Data, imagery, documents, software (optional)

GHRC Staff

• Final dataset title

• GCMD keywords

• Full metadata

• DOI

• Guide document text

Master Info Repository

Data and Docs

Manual text entry

FTP

Dataset Info

Inventory info

CMR (ECHO, GCMD)

Dataset Landing Pages

Guide Documents

Metadata exports

Dynamic web docs

Downloadable documents

Email exchange

Page 5

GHRC Approach

• Plan to adapt ORNL DAAC process, workflow and software

• Gathering requirements from DBA and Data Management Group
  o Review current manual processes
  o Evaluate proposed database schema changes
  o Define automated metadata publication paths

• Review design with DMG and User Working Group

• Agile development approach


Page 6

Data Ingest and Publication Information Flow


Online Metadata Editor

Science PIs

• Proposed dataset title

• Brief description

• Basic metadata

• Citation author list

• Data, imagery, documents, software (optional)

GHRC Staff

• Final dataset title

• GCMD keywords

• Full metadata

• DOI

• Guide document text

Master Info Repository

Data and Docs

Manual text entry

FTP

Dataset Info

Inventory info

CMR (ECHO, GCMD)

Local Elasticsearch

Dataset Landing Pages

THREDDS Catalog

Guide Documents

Metadata exports

Dynamic web docs

Downloadable documents
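
One way to picture the planned flow is a single master record that is rendered once per publication target, instead of being re-entered by hand for each. The sketch below is only an illustration under stated assumptions: the `DatasetRecord` fields are named after the labels on this slide, and every function, key, and sample value is hypothetical rather than part of any GHRC, CMR, Elasticsearch, or THREDDS interface. Nothing is sent over the network; each renderer just returns the text or document it would publish.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Callable, Dict, List, Optional


@dataclass
class DatasetRecord:
    # Hypothetical collection-level fields, named after the slide's labels.
    title: str
    description: str
    citation_authors: List[str]
    gcmd_keywords: List[str] = field(default_factory=list)
    doi: Optional[str] = None


def metadata_export(rec: DatasetRecord) -> str:
    """Serialize the whole record as JSON (stand-in for a metadata export)."""
    return json.dumps(asdict(rec), sort_keys=True)


def search_document(rec: DatasetRecord) -> dict:
    """Build a flat document of the kind a local search index might hold."""
    return {"title": rec.title, "keywords": rec.gcmd_keywords, "doi": rec.doi}


def landing_page(rec: DatasetRecord) -> str:
    """Render a plain-text landing page ending in a citation line."""
    authors = ", ".join(rec.citation_authors)
    doi = rec.doi or "DOI pending"
    return f"{rec.title}\n{rec.description}\nCitation: {authors}. {rec.title}. {doi}"


# Hypothetical registry: one renderer per publication path.
PUBLISHERS: Dict[str, Callable[[DatasetRecord], object]] = {
    "metadata_export": metadata_export,
    "search_index": search_document,
    "landing_page": landing_page,
}


def publish_all(rec: DatasetRecord) -> Dict[str, object]:
    """Render the same record for every registered publication path."""
    return {name: render(rec) for name, render in PUBLISHERS.items()}


# Placeholder values for illustration only, not a real dataset.
record = DatasetRecord(
    title="Example Field Campaign Dataset",
    description="Placeholder description supplied by the science PI.",
    citation_authors=["A. Author", "B. Author"],
    gcmd_keywords=["EXAMPLE KEYWORD"],
)
outputs = publish_all(record)
```

With a registry like this, adding a new publication path would mean adding one renderer rather than another round of manual text entry.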

Page 7

Notional Schedule


Task | Timeframe

Initial requirements / design phase | July - August

Early working prototype collection level metadata editor | September

Preview data provider screens at UWG meeting | October 7-8

Implementation and test phase | October - November

Initial version of complete system | December