NOAA’s 2nd Annual EDMC Workshop, 22 June 11. NATIONAL CLIMATIC DATA CENTER PURPOSE AND O...

Ed Kearns, Remote Sensing & Applications Division, National Climatic Data Center. DEFINING AN EFFICIENT CONCEPT OF OPERATIONS FOR NOAA’S ARCHIVE AND DATA STEWARDSHIP ACTIVITIES. NOAA’s 2nd Annual EDMC Workshop, 22 June 11

Page 1:

Ed Kearns, Remote Sensing & Applications Division, National Climatic Data Center

DEFINING AN EFFICIENT CONCEPT OF OPERATIONS FOR NOAA’S ARCHIVE AND DATA STEWARDSHIP ACTIVITIES

NOAA’s 2nd Annual EDMC Workshop, 22 June 11

Page 2:

NATIONAL CLIMATIC DATA CENTER

2ND ANNUAL EDMC WORKSHOP

PURPOSE AND OVERVIEW

Purpose

Discuss how to best utilize NOAA’s Data Centers and the CLASS system to preserve and steward the information in NOAA’s archive system.

Propose options for functional organization to accomplish data ingest, archive storage, access, data management, and stewardship.

Overview

Current Status of a Concept of Operations for CLASS & the Data Centers

Options for the Evolution of the Archive System: Tiered Services, Federated Systems

How to Pay for it? Can we include Cloud Resources?

Recent example: NCEP climate model data activity

Page 3:

ARCHIVE GUIDANCE

Management of NOAA environmental data will be based upon an end-to-end data management lifecycle that includes:

Determining what environmental data are required to be preserved for the long term and how preservation will be accomplished

Developing and maintaining metadata throughout the environmental data lifecycle that comply with standards

Obtaining user requirements and feedback

Developing and following data management plans that are coordinated with the appropriate NOAA archive for all observing and data management systems

Conducting scientific data stewardship to address data content, access, and user understanding

Providing for delivery to the archive and secure storage

Providing for data access and dissemination

Enabling integration and/or interoperability with other information and products

NOAA ADMIN ORDER 212-15, SECTION 3. POLICY (4 NOV 10)


We’ve heard a consistent approach being followed by the Data Centers…

What to Archive

How to Archive

Page 4:

ARCHIVE PROCESS

[Diagram: archive process flow. Labels grouped as they appear on the slide.]

Agreements: Inter-Agency Memorandum of Agreement (e.g., NWS-CS MOA, in draft); Producer-Archive Service Level Agreement (e.g., NCEP-NCDC SLA, in draft)

Phases: Preliminary Phase; Formal Definition, Transfer and Validation Phases

What to Archive Process: Requirements Documents; Initiate Request to Archive; Recommendation Package; Preliminary Agreement; Submission Agreement (SA); Final Decision

How to Archive Process: Design Documents; Initial Decision; Design Approval; Test & Implement Archive

• NAO 212-15, Management of Environmental Data and Information

• Procedure for Scientific Records Appraisal and Archive Approval (What to Archive)

• Open Archival Information System Reference Model (OAIS-RM)

• Producer-Archive Interface Methodology Abstract Standard (PAIMAS)

Page 5:

ARCHIVE PROJECTED DATA VOLUMES

[Figure: CLASS Archive CUMULATIVE Total Volume in Petabytes (PB). x-axis: Fiscal Year, 9 through 30. y-axis: Volume in Petabytes (PB), 0.00 to 300.00. Legend: TOTAL; GOES - R, S, T, U; Other NODC, NGDC, NCDC; Model Data (Climate and Weather); NEXRAD Wx Radar (plus DP & PA); JPSS Series; NPP; METOP (Current + new launches); DMSP (Current + new launches); POES (Current to end of life 18 and 19); GOES (Current to end of life of 14 and P)]

Note not just the volume, but the disparate types of data here…

Must figure out economies for stewardship as well as archival storage

Page 6:

1362-1998 - IEEE SYSTEM DEFINITION: A CONOPS IS A USER-ORIENTED DOCUMENT THAT DESCRIBES SYSTEM CHARACTERISTICS FOR A PROPOSED SYSTEM FROM THE USERS' VIEWPOINT.

THE CONOPS DOCUMENT IS USED TO COMMUNICATE THE OVERALL QUANTITATIVE AND QUALITATIVE SYSTEM CHARACTERISTICS TO THE USER, BUYER, DEVELOPER AND OTHER ORGANIZATIONAL ELEMENTS.

IT IS USED TO DESCRIBE THE USER ORGANIZATION(S), MISSION(S) AND ORGANIZATIONAL OBJECTIVES FROM AN INTEGRATED SYSTEMS POINT OF VIEW.

CONCEPT OF OPERATIONS

Page 7:

CLASS INGEST CONOPS

Page 8:

CLASS CONCEPT

Scope. Enterprise-wide IT system supporting long-term, secure storage of and common access to environmental datasets and information stewarded by NOAA’s Archives.

1. Large Data “Campaigns”: Satellites (NPP/JPSS, GOES, POES, DMSP, MetOp, EOS), Radar (NEXRAD), and NCEP Models (including reanalysis)

2. Enterprise Approach

Providing common services for development and operation of IT systems supporting NOAA Archives

Consolidating legacy archival storage systems to reduce acquisition costs

Relieving data producers of responsibility for archival development

LEVEL 1 REQUIREMENTS DOCUMENT (L1RD), 6 NOV 08


The CLASS Program is working on a complete ConOps now…

Vague

Specific

Page 9:

CLASS-DATA CENTER SCHEME: CURRENT “ENTERPRISE” IMPLEMENTATION

[Diagram labels, in slide order: Wider NOAA and other Users/Consumers; Ingest/Subscriptions; CLASS Node (NSOF); CLASS Node; NCDC; CLASS Node; NGDC; Replication; Archive Data Services; NODC]

Page 10:

UTILIZATION OF CLASS

A complete archive solution? All pieces of the OAIS-RM tailored for all NOAA. Too expensive.

Corporate Cloud storage service: Any NOAA entity can use via API? Data Centers to map their individual archive structures into CLASS? Too unwieldy if to be a real archive.

A managed Archive Storage service: Limited ingest and access capabilities. Sustainable? Maybe…

IF NOT A COMPLETE ARCHIVE SOLUTION – WHAT PIECE TO PROVIDE?

Page 11:

VERTICAL EVOLUTION: TIERED SERVICES

NOAA FRAMEWORK FOR DATA SERVICES

[Pyramid diagram, top tier to bottom tier; some labels are cut off on the slide]

Locals, Regionals, Resellers: Many customers; many diverse, tailored products

Minor Centers of Data: More customers; specific services;

Major Centers of Data: Limited customers; directed services; medium

Data Centers: Few customers; basic services; large orders

Long Term Archive: Handful of customers; extremely limited services; very large quantities

Page 12:

CLASS-DATA CENTER SCHEME: CURRENT “ENTERPRISE” IMPLEMENTATION

[Diagram labels, in slide order: Wider NOAA and other Users/Consumers; Ingest/Subscriptions; CLASS Node (NSOF); CLASS Node; NCDC; CLASS Node; NGDC; Replication; Archive Data Services; NODC]

Page 13:

CLASS-DATA CENTER SCHEME: ALTERNATE IMPLEMENTATION – SHIFT LOADS

[Diagram labels: Other Users/Consumers; CLASS Receipt Node; CLASS Node(s); NCDC; NGDC; NODC; Replication; Archive Data Services; Big Systems; DAC (four boxes); Prog (six boxes); $$ (two markers)]

Page 14:

LATERAL EVOLUTION: FEDERATED SYSTEMS

Standards- and services-based framework which exploits multiple authoritative data sources that are separately administered

Each element can securely access data and metadata throughout the federation

Often no need to move data around, just move the descriptive information

Define workflows to move/distribute/change data and utilize federated resources

Example: Earth System Grid (ESG)
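The federation points above can be illustrated with a minimal sketch, assuming a hypothetical in-memory catalog: each separately administered node keeps its own data, and only descriptive records (metadata plus a location) are exchanged. The class names, node names, dataset identifiers and URLs are illustrative assumptions, not actual NOAA or ESG interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Descriptive information only; the data stay at the owning node."""
    dataset_id: str
    title: str
    node: str       # separately administered data source (hypothetical name)
    location: str   # where a client would fetch the data (hypothetical URL)


class FederatedCatalog:
    """Federation-wide index built by exchanging metadata, not data."""

    def __init__(self):
        self._records = {}

    def publish(self, record):
        # A node advertises a dataset by sending its descriptive record.
        self._records[record.dataset_id] = record

    def search(self, keyword):
        # Any member can search metadata across the whole federation.
        keyword = keyword.lower()
        return sorted(
            (r for r in self._records.values() if keyword in r.title.lower()),
            key=lambda r: r.dataset_id,
        )

    def resolve(self, dataset_id):
        # Returns the location at the owning node; no data are copied.
        return self._records[dataset_id].location


catalog = FederatedCatalog()
catalog.publish(DatasetRecord("model-001", "Climate model reanalysis",
                              "node-a", "https://node-a.example/data/model-001"))
catalog.publish(DatasetRecord("radar-001", "Weather radar level 2",
                              "node-b", "https://node-b.example/data/radar-001"))

print([r.dataset_id for r in catalog.search("climate")])
print(catalog.resolve("radar-001"))
```

A real federation would add authentication and workflow definitions on top of this; the sketch only shows why moving descriptive records can be enough for discovery.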

WHAT’S A “FEDERATION”?

Page 15:

CLASS-DATA CENTER SCHEME: ALTERNATE IMPLEMENTATION W/FEDERATION

[Diagram labels: Other Users/Consumers; CLASS Receipt Node; CLASS Node(s); NCDC; NGDC; NODC; Replication; Archive Data Services; Big Systems; DAC (four boxes); Prog (six boxes)]

Page 16:

CLASS-DATA CENTER SCHEME: CURRENT “ENTERPRISE” IMPLEMENTATION

[Diagram labels: Other Users/Consumers; CLASS Receipt Node; CLASS Node(s); NCDC; NGDC; NODC; Replication; Archive Data Services; Big Systems; DAC (four boxes); Prog (six boxes)]

Page 17:

CHALLENGE: INGEST, ARCHIVE, AND PROVIDE ACCESS TO 100s OF TB OF NCEP’S CLIMATE MODEL RESULTS WITH ESSENTIALLY NO NEW RESOURCES… IN A FEW MONTHS’ TIME.

STORAGE AND DISTRIBUTION? OR ARCHIVE?

CASE STUDY: CLIMATE MODEL ARCHIVE

Page 18:

CLIMATE MODEL DATA SCHEME

[Diagram: climate model data flow in two stages, Stage 1 (FY10) and Stage 2 (FY11). Labels follow, roughly in slide order.]

Functions: Data Producer; Verification, QA/QC; Ingest; Archive; Stewardship; Dedicated Access; Public Access; Users

NCMP/NOMADS

NCEP Model Data (tar format & manifest)

Ingest TAR files and inventory contents

Model Data at NCDC

Add CLASS manifest and push to CLASS ingest

TAR’d Model Data on CLASS system

Prioritized Model Data on CLASS Disk at NCDC

Archive to CLASS tape

Model Data on CLASS tape at NGDC

Model Data on CLASS tape at NCDC

unTAR and move to CLASS disk (shown twice)

OPeNDAP restricted to NCMP, NOMADS
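The "tar format & manifest" and "ingest TAR files and inventory contents" steps in this scheme could be sketched roughly as follows. This is only an illustration using Python's standard library: the manifest layout (member name, size, SHA-256 checksum), the function names and the sample file names are assumptions, not the actual NCEP or CLASS manifest format.

```python
import hashlib
import io
import tarfile


def build_tar_with_manifest(files):
    """Pack {name: bytes} into an in-memory tar and return (tar_bytes, manifest).

    The manifest is a hypothetical format: one dict per member with
    its name, size in bytes, and SHA-256 checksum.
    """
    buffer = io.BytesIO()
    manifest = []
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, payload in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
            manifest.append({
                "name": name,
                "size": len(payload),
                "sha256": hashlib.sha256(payload).hexdigest(),
            })
    return buffer.getvalue(), manifest


def verify_on_ingest(tar_bytes, manifest):
    """Inventory the tar contents and return names that fail the manifest check."""
    expected = {entry["name"]: entry for entry in manifest}
    problems = []
    seen = set()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            seen.add(member.name)
            payload = tar.extractfile(member).read()
            entry = expected.get(member.name)
            if (entry is None
                    or entry["size"] != len(payload)
                    or entry["sha256"] != hashlib.sha256(payload).hexdigest()):
                problems.append(member.name)
    # Members listed in the manifest but missing from the tar are also problems.
    problems.extend(sorted(set(expected) - seen))
    return problems


# Hypothetical model output files, used only to exercise the sketch.
tar_bytes, manifest = build_tar_with_manifest(
    {"run1/t2m.grb": b"abc", "run1/prcp.grb": b"defg"}
)
print(verify_on_ingest(tar_bytes, manifest))
```

Checking the inventory against the producer's manifest at ingest is one way the Verification, QA/QC function could confirm that nothing was lost in transfer before the data go to tape.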

Page 19:

SERVICE VIA NOMADS

[Diagram: NOMADS service architecture. Labels follow.]

Direct Client Access: GrADS, Ferret, MatLab, IDL, IDV, Web browsers or any OPeNDAP enabled client

NOMADS Interface: NCMP Portal / GIS On-line Diagnostics

HTTP/FTP/GridFTP Access

Services: GDS/TDS; Portals; OPeNDAP; NOAA; LAS

TDS – THREDDS Data Server; GDS – GrADS Data Server; LAS – Live Access Server

Tiered Levels of Storage: Tier 1 or 2 (CLASS Disk); Tier 3 Archive (CLASS Tape)

Federated Storage: Earth System Grid (ESG); Other Archives (CMIP5); TDS
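For the direct client access path, an OPeNDAP-enabled client ultimately requests a subset of a variable by index range. A minimal sketch of building such a request in the DAP2 URL style (dataset URL, response suffix, then a constraint with `[start:stride:stop]` per dimension) is shown below. The server address, dataset path and variable name are hypothetical, and the helper function is an illustration rather than part of any NOMADS client.

```python
def opendap_subset_url(dataset_url, variable, slices, response="ascii"):
    """Build a DAP2-style request URL for a hyperslab of one variable.

    `slices` is a list of (start, stride, stop) index triples, one per
    dimension; DAP2 index ranges are inclusive of `stop`.
    """
    if response not in {"ascii", "dods", "dds"}:
        raise ValueError(f"unsupported response type: {response}")
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


# Hypothetical server, dataset path and variable name.
url = opendap_subset_url(
    "https://nomads.example/dods/model/run1",
    "tmp2m",
    [(0, 1, 3), (10, 1, 20), (30, 1, 40)],
)
print(url)
```

Because only the requested hyperslab crosses the network, this style of access fits the tiered storage picture: large files can stay on CLASS disk while clients pull small pieces.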

Page 20:

DIFFERENT ACCESS PATTERNS NOW

[Figure: time series from Jul-08 through May-11, with x-axis ticks every two months and y-axis from 0 to 3.5. Labels: Proportion of Data In/Out of CLASS; Proportion of Average Ratio of Data In/Out]

Page 21:

CONCLUSIONS

To achieve a sustainable solution for an enterprise system for information preservation across NOAA…

CLASS and the Data Centers are important assets that can be functionally reconfigured

May need to bring more NOAA assets in line: stewards

Economies to be realized by tiers of service

Federation may help level of service

Climate model data experience shows that tiers of service can work

New access patterns emerging

Support by expanding technologies instead of buying more hardware? Cloud?

Page 22:

Ed Kearns, Deputy Chief, Remote Sensing and Applications Division, NOAA’s National Climatic Data Center (NCDC), 151 Patton Avenue, Asheville, NC 28807, 828-350-2410

[email protected]

Page 23:

Page 24:

Ed Kearns, Remote Sensing & Applications Division, National Climatic Data Center

BACKUP SLIDES

Page 25:

TIERS OR AXES OF SERVICE

Ingest and Storage

Bring in data from the producer

Provide secure storage

Costs constrained by technology advances

Access

Provide access to the data in storage

Core capabilities to facilitate a wider range of access functionality

Stewardship

Maintain the data in storage

Work with the data to preserve information content

Labor intensive (expensive)

[Diagram axes: Ingest/Storage; Access; Stewardship]

Page 26:

NOAA’S DATA SYSTEMS FUNCTION IN A WIDER INFORMATION LANDSCAPE: A NESDIS VIEW

[Diagram labels: ORNL, ESG; NSF DataNet; DAPs; Data Mgmt; IPCC; International Sources; WMO; GEO; NEAAT; OSSEs; Google]

National Climatic Data Center, May 26, 2010, EDMC Workshop

Page 27:

NEAAT/SNAAP

Page 28:

TIERED SERVICES: FOOD

[Pyramid diagram, top tier to bottom tier]

Restaurants: Many customers; diverse, tailored products

Supermarkets: More customers, small carts; small quantities

Big Box Clubs: Limited customers, big carts, sizable quantities

Warehouses: Very few customers, big trucks; very large quantities

Processing

Farmers, Fishermen, etc.

Page 29:

TIERED SERVICES: DATA

[Pyramid diagram, top tier to bottom tier, with Stewardship Functions shown alongside; some labels are cut off on the slide]

Locals, Regionals, Resellers: Many customers; many diverse, tailored products

Minor Centers of Data: More customers; specific services;

Major Centers of Data: Limited customers; directed services; medium

Data Centers: Few customers; basic services; large orders

Long Term Archive: Handful of customers; extremely limited services; very large quantities

NOAA data producers

Page 30:

HOW DOES A FEDERATION HELP WITH THE PROBLEMS?

Can leave the “active” and most easily accessible copy of the data with the Data Steward, in a specific service Tier, or out on the Cloud

Simplifies stewardship functions

Reduces storage and bandwidth requirements

Reduces load on Tier 0 resources (deep archive)

Workflows can be defined to process or otherwise change the data on-the-fly to meet specific member needs

Page 31:

HOW TO EMPLOY FEDERATION IN NOAA

National Climatic Data Center NCDC CLASS Strategic Vision

Deploy rules-based distributed data management software to the authoritative data sources

Provide tiered services, adopt common standards

If secure storage is desired, enable coordinated ingest and archive of data from the data steward to lower tier resources

Establish workflows to/from data sources

Plug into cloud processing and storage resources

Provide virtual data collections to users for search+

Page 32:

CLASS AND THE DATA CENTERS IN THE NOAA FEDERATION

Full integration of CLASS into Data Center operations

Storage, Ingest, Access Functions

IOC to FOC transition, parallel testing

CLASS as a component of a NOAA Federated Archive

Embrace distributed data management and services

Adoption of technologies and standards for CLASS to be interoperable with ESG, GEO-IDE, EOSDIS, etc.

netCDF, LDM, CF conventions, ISO 19115-2

Move out of the Box and into the Cloud

Utilize highly distributed storage and computing
