NOAA’s 2nd Annual EDMC Workshop, 22 June 11. National Climatic Data Center: Purpose and O...
Ed Kearns
Remote Sensing & Applications Division
National Climatic Data Center
DEFINING AN EFFICIENT CONCEPT OF OPERATIONS FOR NOAA’S ARCHIVE AND DATA STEWARDSHIP ACTIVITIES
NOAA’s 2nd Annual EDMC Workshop, 22 June 11
NATIONAL CLIMATIC DATA CENTER
2 2ND ANNUAL EDMC WORKSHOP
PURPOSE AND OVERVIEW
Purpose
Discuss how to best utilize NOAA’s Data Centers and the CLASS system to preserve and steward the information in NOAA’s archive system.
Propose options for functional organization to accomplish data ingest, archive storage, access, data management, and stewardship.
Overview
Current Status of a Concept of Operations for CLASS & the Data Centers
Options for the Evolution of the Archive System
Tiered Services, Federated Systems
How to Pay for it? Can we include Cloud Resources?
Recent example: NCEP climate model data activity
ARCHIVE GUIDANCE
Management of NOAA environmental data will be based upon an end-to-end data management lifecycle that includes:
Determining what environmental data are required to be preserved for the long term and how preservation will be accomplished
Developing and maintaining metadata throughout the environmental data lifecycle that comply with standards
Obtaining user requirements and feedback
Developing and following data management plans that are coordinated with the appropriate NOAA archive for all observing and data management systems
Conducting scientific data stewardship to address data content, access, and user understanding
Providing for delivery to the archive and secure storage
Providing for data access and dissemination
Enabling integration and/or interoperability with other information and products
NOAA ADMIN ORDER 212-15, SECTION 3. POLICY (4 NOV 10)
We’ve heard a consistent approach being followed by the Data Centers…
What to Archive
How to Archive
Inter-Agency (e.g., NWS-CS MOA, in draft) Memorandum of Agreement
Producer-Archive (e.g., NCEP-NCDC SLA, in draft) Service Level Agreement
Formal Definition, Transfer and Validation Phases
Preliminary Phase
ARCHIVE PROCESS
What to Archive Process
Requirements Documents
Initiate Request to Archive
Recommendation Package
Preliminary Agreement
Submission Agreement (SA)
Final Decision
How to Archive Process
Design Documents
Initial Decision
Design Approval
Test & Implement Archive
• NAO 212-15, Management of Environmental Data and Information
• Procedure for Scientific Records Appraisal and Archive Approval (What to Archive)
• Open Archival Information System Reference Model (OAIS-RM)
• Producer-Archive Interface Methodology Abstract Standard (PAIMAS)
ARCHIVE PROJECTED DATA VOLUMES
[Chart: CLASS Archive cumulative total volume in petabytes (PB) by fiscal year. Horizontal axis: Fiscal Year, 09 to 30. Vertical axis: Volume in Petabytes (PB), 0 to 300.]
Series: TOTAL; GOES-R, S, T, U; Other NODC, NGDC, NCDC; Model Data (Climate and Weather); NEXRAD Wx Radar (plus DP & PA); JPSS Series; NPP; METOP (Current + new launches); DMSP (Current + new launches); POES (Current to end of life 18 and 19); GOES (Current to end of life of 14 and P)
Note not just the volume, but the disparate types of data here…
Must figure out economies for stewardship as well as archival storage
1362-1998 - IEEE SYSTEM DEFINITION: A CONOPS IS A USER-ORIENTED DOCUMENT THAT DESCRIBES SYSTEM CHARACTERISTICS FOR A PROPOSED SYSTEM FROM THE USERS' VIEWPOINT.
THE CONOPS DOCUMENT IS USED TO COMMUNICATE THE OVERALL QUANTITATIVE AND QUALITATIVE SYSTEM CHARACTERISTICS TO THE USER, BUYER, DEVELOPER AND OTHER ORGANIZATIONAL ELEMENTS.
IT IS USED TO DESCRIBE THE USER ORGANIZATION(S), MISSION(S) AND ORGANIZATIONAL OBJECTIVES FROM AN INTEGRATED SYSTEMS POINT OF VIEW.
CONCEPT OF OPERATIONS
CLASS INGEST CONOPS
CLASS CONCEPT
Scope. Enterprise-wide IT system supporting long-term, secure storage of and common access to environmental datasets and information stewarded by NOAA’s Archives.
1. Large Data “Campaigns”: Satellites (NPP/JPSS, GOES, POES, DMSP, MetOp, EOS), Radar (NEXRAD), and NCEP Models (including reanalysis)
2. Enterprise Approach
Providing common services for development and operation of IT systems supporting NOAA Archives
Consolidating legacy archival storage systems to reduce acquisition costs
Relieving data producers of responsibility for archival development
LEVEL 1 REQUIREMENTS DOCUMENT (L1RD), 6 NOV 08
The CLASS Program is working on a complete ConOps now…
Vague
Specific
CLASS-DATA CENTER SCHEME
CURRENT “ENTERPRISE” IMPLEMENTATION
[Diagram: Wider NOAA and other Users/Consumers; Ingest/Subscriptions; CLASS Node (NSOF); CLASS Node NCDC; CLASS Node NGDC; Replication; Archive Data Services; NODC]
UTILIZATION OF CLASS
A complete archive solution?
All pieces of the OAIS-RM tailored for all NOAA
Too expensive.
Corporate Cloud storage service
Data Centers to map their individual archive structures into CLASS?
Any NOAA entity can use via API?
Too unwieldy if to be a real archive.
A managed Archive Storage service
Limited ingest and access capabilities
Sustainable? Maybe…
IF NOT A COMPLETE ARCHIVE SOLUTION – WHAT PIECE TO PROVIDE?
VERTICAL EVOLUTION: TIERED SERVICES
NOAA FRAMEWORK FOR DATA SERVICES
Locals, Regionals, Resellers: Many customers; many diverse, tailored products
Minor Centers of Data: More customers; specific services;
Major Centers of Data: Limited customers; directed services; medium
Data Centers: Few customers; basic services; large orders
Long Term Archive: Handful of customers; extremely limited services; very large quantities
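One way to read the tier pyramid is as a lookup order: a request is served by the tier nearest the user that holds the data, and only falls through to the long-term archive when no higher tier has it. The sketch below illustrates that reading in Python. The `Tier` class, the `resolve` function, and the dataset IDs are hypothetical and are not taken from the slides; only the tier names and customer profiles are.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    name: str
    profile: str  # customer/service profile as worded on the slide
    holdings: set = field(default_factory=set)  # hypothetical dataset IDs


# Ordered from the tier closest to end users down to the long-term archive.
TIERS = [
    Tier("Locals, Regionals, Resellers", "Many customers; many diverse, tailored products"),
    Tier("Minor Centers of Data", "More customers; specific services"),
    Tier("Major Centers of Data", "Limited customers; directed services; medium"),
    Tier("Data Centers", "Few customers; basic services; large orders"),
    Tier("Long Term Archive", "Handful of customers; extremely limited services; very large quantities"),
]


def resolve(dataset_id, tiers=TIERS):
    """Return the first tier (nearest the user) that holds dataset_id, else None."""
    for tier in tiers:
        if dataset_id in tier.holdings:
            return tier
    return None


# Illustrative holdings only: the archive holds everything, upper tiers hold subsets.
TIERS[4].holdings.update({"model-run-A", "radar-B", "satellite-C"})
TIERS[3].holdings.update({"model-run-A", "radar-B"})
TIERS[0].holdings.add("model-run-A")

print(resolve("model-run-A").name)  # served by the tier nearest the user
print(resolve("satellite-C").name)  # falls through to the archive
```

Under this assumption, the more data the upper tiers hold, the fewer requests reach the archive, which matches the "handful of customers" profile at the bottom of the pyramid.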
CLASS-DATA CENTER SCHEME
ALTERNATE IMPLEMENTATION – SHIFT LOADS
[Diagram: Other Users/Consumers; CLASS Receipt Node; CLASS Node(s); NCDC; NGDC; NODC; Replication; Archive Data Services; Big Systems; DAC (x4); Prog (x6); $$]
LATERAL EVOLUTION: FEDERATED SYSTEMS
Standards- and services-based framework which exploits multiple authoritative data sources that are separately administered
Each element can securely access data and metadata throughout the federation
Often no need to move data around, just move the descriptive information
Define workflows to move/distribute/change data and utilize federated resources
Example: Earth System Grid (ESG)
WHAT’S A “FEDERATION”?
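The point that a federation often moves only descriptive information, not the data, can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Record`, `Node`, and `Federation` names, the node names, and the example.org URLs are all hypothetical, and a real federation such as ESG adds security, standards-based metadata, and workflows that are not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    dataset_id: str
    title: str
    url: str  # where the data stays; only this description is shared


class Node:
    """One separately administered, authoritative data source."""

    def __init__(self, name, records):
        self.name = name
        self._records = list(records)

    def search(self, keyword):
        kw = keyword.lower()
        return [r for r in self._records if kw in r.title.lower()]


class Federation:
    def __init__(self, nodes):
        self.nodes = list(nodes)

    def search(self, keyword):
        """Query every member; return (node name, metadata record) pairs."""
        return [(n.name, r) for n in self.nodes for r in n.search(keyword)]


# Hypothetical members and holdings, for illustration only.
node_a = Node("node-a", [
    Record("a1", "Climate model reanalysis", "https://node-a.example.org/a1"),
])
node_b = Node("node-b", [
    Record("b7", "Climate model projections", "https://node-b.example.org/b7"),
    Record("b8", "Ocean profiles", "https://node-b.example.org/b8"),
])
fed = Federation([node_a, node_b])

for node_name, rec in fed.search("climate model"):
    print(node_name, rec.dataset_id, rec.url)
```

The search returns pointers to data that remain with their administering node; a client then fetches from the returned URL only what it needs.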
CLASS-DATA CENTER SCHEME
ALTERNATE IMPLEMENTATION W/FEDERATION
[Diagram: Other Users/Consumers; CLASS Receipt Node; CLASS Node(s); NCDC; NGDC; NODC; Replication; Archive Data Services; Big Systems; DAC (x4); Prog (x6)]
CLASS-DATA CENTER SCHEME
CURRENT “ENTERPRISE” IMPLEMENTATION
[Diagram: Other Users/Consumers; CLASS Receipt Node; CLASS Node(s); NCDC; NGDC; NODC; Replication; Archive Data Services; Big Systems; DAC (x4); Prog (x6)]
CHALLENGE: INGEST, ARCHIVE, AND PROVIDE ACCESS TO HUNDREDS OF TB OF NCEP’S CLIMATE MODEL RESULTS WITH ESSENTIALLY NO NEW RESOURCES… IN A FEW MONTHS’ TIME.
STORAGE AND DISTRIBUTION? OR ARCHIVE?
CASE STUDY: CLIMATE MODEL ARCHIVE
CLIMATE MODEL DATA SCHEME
[Diagram: flow of NCEP climate model data into CLASS]
Stage labels: Stage 1 (FY10); Stage 2 (FY11)
Functions: Data Producer; Verification, QA/QC; Ingest; Archive; Stewardship; Dedicated Access; Public Access; Users
NCEP Model Data, tar format & manifest
Ingest TAR files and inventory contents
Model Data at NCDC
Add CLASS manifest and push to CLASS ingest
TAR’d Model Data on CLASS system
Archive to CLASS tape
Model Data on CLASS tape at NGDC
Model Data on CLASS tape at NCDC
unTAR and move to CLASS disk
Prioritized Model Data on CLASS Disk at NCDC
OPeNDAP restricted to NCMP, NOMADS
NCMP/NOMADS
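The "ingest TAR files and inventory contents" step can be illustrated with a short Python sketch that lists the files inside a tar archive and compares them with a producer manifest. The slides do not describe the actual manifest format used by NCEP or CLASS, so the JSON layout, the use of SHA-256, and the function names `inventory_tar` and `verify_against_manifest` are assumptions made for illustration.

```python
import hashlib
import json
import tarfile


def inventory_tar(tar_path):
    """List each regular file in the tar with its size and SHA-256 digest."""
    entries = []
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            digest = hashlib.sha256()
            fobj = tar.extractfile(member)
            # Read in 1 MiB chunks so large model files are never held in memory whole.
            for chunk in iter(lambda: fobj.read(1 << 20), b""):
                digest.update(chunk)
            entries.append(
                {"name": member.name, "size": member.size, "sha256": digest.hexdigest()}
            )
    return entries


def verify_against_manifest(tar_path, manifest_path):
    """Return sorted names that are missing, unexpected, or differ from the manifest.

    The manifest is assumed (hypothetically) to be a JSON list of objects with
    the same keys that inventory_tar produces.
    """
    with open(manifest_path) as fh:
        expected = {e["name"]: e for e in json.load(fh)}
    found = {e["name"]: e for e in inventory_tar(tar_path)}
    return [
        name
        for name in sorted(set(expected) | set(found))
        if expected.get(name) != found.get(name)
    ]
```

An empty list from `verify_against_manifest` means the archive matches the manifest, which is the kind of check that would sit in the Verification, QA/QC function before data are pushed on to archive storage.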
SERVICE VIA NOMADS
Direct Client Access: GrADS, Ferret, MATLAB, IDL, IDV, Web browsers or any OPeNDAP enabled client
NOMADS Interface: NCMP Portal / GIS On-line Diagnostics
HTTP/FTP/GridFTP Access
Services: GDS/TDS; Portals; OPeNDAP; NOAA LAS; TDS
TDS – THREDDS Data Server
GDS – GrADS Data Server
LAS – Live Access Server
Tiered-Levels of Storage: Tier 1 or 2 (CLASS Disk); Tier 3 Archive (CLASS Tape)
Federated Storage: Earth System Grid (ESG); Other Archives (CMIP5)
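OPeNDAP-enabled clients typically ask a server for a subset of a variable rather than a whole file, using a constraint expression appended to the dataset URL. The sketch below builds such a URL with the DAP2 hyperslab syntax `[start:stride:stop]`. The server name, dataset path, and variable name are hypothetical placeholders, not actual NOMADS endpoints.

```python
def opendap_url(dataset_url, variable, slices, response="ascii"):
    """Build a DAP2 request URL with a hyperslab constraint.

    slices is a list of (start, stride, stop) index triples, one per
    dimension; stop is inclusive, as in the DAP2 constraint syntax.
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


# Hypothetical server, dataset path, and variable name, for illustration only.
url = opendap_url(
    "http://nomads.example.org/dods/model/run01",
    "tmp2m",
    [(0, 1, 3), (10, 1, 20), (0, 2, 40)],
)
print(url)
```

Because only the requested slab travels over the network, this style of access fits a service tier where many users each pull small pieces of very large model outputs.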
DIFFERENT ACCESS PATTERNS NOW
[Chart: monthly values from Jul-08 to May-11; vertical axis 0 to 3.5. Series: Proportion of Data In/Out of CLASS; Proportion of Average Ratio of Data In/Out]
CONCLUSIONS
To achieve a sustainable solution for an enterprise system for information preservation across NOAA…
CLASS and the Data Centers are important assets that can be functionally reconfigured
May/must bring more NOAA assets in-line: stewards
Economies to be realized by tiers of service
Federation may help level of service
Climate model data experience shows that tiers of service can work
New access patterns emerging
Support by expanding technologies instead of buying more hardware? Cloud?
Ed Kearns
Deputy Chief, Remote Sensing and Applications Division
NOAA’s National Climatic Data Center (NCDC)
151 Patton Avenue
Asheville, NC 28807
828-350-2410
BACKUP SLIDES
TIERS OR AXES OF SERVICE
Ingest and Storage
Bring in data from the producer
Provide secure storage
Costs constrained by technology advances
Access
Provide access to the data in storage
Core capabilities to facilitate a wider range of access functionality
Stewardship
Maintain the data in storage
Work with the data to preserve information content
Labor intensive (expensive)
[Diagram axes: Ingest/Storage; Access; Stewardship]
NOAA’S DATA SYSTEMS FUNCTION IN A WIDER INFORMATION LANDSCAPE: A NESDIS VIEW
National Climatic Data Center
ORNL, ESG
NSF DataNet
DAPs Data Mgmt
IPCC
International Sources
WMO GEO
NEAAT
OSSEs
May 26, 2010
EDMC Workshop
NEAAT/SNAAP
TIERED SERVICES: FOOD
Restaurants
Supermarkets
Big Box Clubs
Warehouses
Processing
Farmers, Fishermen, etc.
Very few customers, big trucks; very large quantities
Limited customers, big carts, sizable quantities
More customers, small carts; small quantities
Many customers; diverse, tailored products
TIERED SERVICES: DATA
Locals, Regionals, Resellers: Many customers; many diverse, tailored products
Minor Centers of Data: More customers; specific services;
Major Centers of Data: Limited customers; directed services; medium
Data Centers: Few customers; basic services; large orders
Long Term Archive: Handful of customers; extremely limited services; very large quantities
NOAA data producers
Stewardship Functions
HOW DOES A FEDERATION HELP WITH THE PROBLEMS?
Can leave the “active” and most easily accessible copy of the data with the Data Steward, in a specific service Tier, or out on the Cloud
Simplifies stewardship functions
Reduces storage and bandwidth requirements
Reduces load on Tier 0 resources (deep archive)
Workflows can be defined to process or otherwise change the data on-the-fly to meet specific member needs
HOW TO EMPLOY FEDERATION IN NOAA
National Climatic Data Center NCDC CLASS Strategic Vision
Deploy rules-based distributed data management software to the authoritative data sources
Provide tiered services, adopt common standards
If secure storage is desired, enable coordinated ingest and archive of data from the data steward to lower tier resources
Establish workflows to/from data sources
Plug into cloud processing and storage resources
Provide virtual data collections to users for search+
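"Rules-based distributed data management" generally means that placement and replication decisions are written as declarative rules rather than hard-coded per dataset. The Python sketch below shows the idea with a first-match rule list. The slides do not name a specific software package or any rules, so the `Rule` structure, the action labels, and the thresholds (one year, 1 TB) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataObject:
    name: str
    size_bytes: int
    days_since_access: int


@dataclass
class Rule:
    description: str
    applies: Callable[[DataObject], bool]
    action: str  # hypothetical action label, not a real system command


# Hypothetical rules; the first rule whose condition holds decides the action.
RULES = [
    Rule("cold data goes to the deep archive",
         lambda o: o.days_since_access > 365, "archive_to_tape"),
    Rule("very large objects are replicated to cloud storage",
         lambda o: o.size_bytes > 10**12, "replicate_to_cloud"),
    Rule("everything else stays with the data steward",
         lambda o: True, "keep_on_steward_disk"),
]


def decide(obj, rules=RULES):
    """Return the action of the first rule that applies to obj."""
    for rule in rules:
        if rule.applies(obj):
            return rule.action
    raise ValueError(f"no rule matched {obj.name}")


print(decide(DataObject("old-run.tar", 5 * 10**9, 400)))
print(decide(DataObject("new-run.tar", 2 * 10**12, 3)))
```

Keeping the rules as data means a data steward could change placement policy by editing the rule list, without touching the code that moves the data.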
CLASS AND THE DATA CENTERS IN THE NOAA FEDERATION
Full integration of CLASS into Data Center operations
Storage, Ingest, Access Functions
IOC to FOC transition, parallel testing
CLASS as a component of a NOAA Federated Archive
Embrace distributed data management and services
Adoption of technologies and standards for CLASS to be interoperable with ESG, GEO-IDE, EOSDIS, etc.
netCDF, LDM, CF conventions, ISO 19115-2
Move out of the Box and into the Cloud
Utilize highly distributed storage and computing