Towards Automated Model Output Analysis

Towards Automated Model Output Analysis Charles Doutriaux Ispra 2006


Page 1: Towards Automated Model Output Analysis

Towards Automated Model Output Analysis

Charles Doutriaux

Ispra 2006

Page 2: Towards Automated Model Output Analysis

Summary

• The Problem

• PCMDI’s Experience
  – Overview
  – CDAT
  – ESG
  – Collaborations

• The AutoMOD Project

• Portability

Page 3: Towards Automated Model Output Analysis

The Climate Model Data Management Issue

Usually

• Data
  – Different formats
  – netCDF is not standardized
• Different sites require knowledge of different methods of access
• Metadata
  – Painful to produce
  – Most kept in files separate from data
  – Unsearchable unless one is “in the know” (some kept in people’s brains)
• Access control
  – Manual
  – Not formalized
• Data requests/analysis
  – Beginnings of a formal process
  – Beginnings of web portals
  – Far too much done by hand
  – Logging nearly non-existent

Tremendous manual intervention, horribly inefficient by any measure, resource wasting

Today, Tomorrow

• Data
  – netCDF is standardized to the CF model
  – Different sites but standardized access protocol
• Metadata
  – Created via batch or semi-automated processes
  – Kept in databases and readily searchable
• Access control
  – Formalized
  – Highly granularized, down to per-file, per-person level
• Data requests/analysis
  – Completely automated
  – All logging done automatically

Computers do nearly all the work and scientists do more interesting things than shoving bytes around

Page 4: Towards Automated Model Output Analysis

Proposed Solution

AutoMOD: Automated Model Diagnostic Facility

• Web-based interface
• Automated Upload
• Automated Atlases (pre-run/offline diagnosis)
• Online Diagnosis
• Searchable Database for Model, Simulations, Diagnosis
• Leverage from PCMDI’s experience

Page 5: Towards Automated Model Output Analysis

PCMDI

• Goal
  – Serve Climate Community
• Computation Team Goals
  – Provide the scientific community with tools to allow them to focus on science NOT on technical aspects.

Page 6: Towards Automated Model Output Analysis

PCMDI Solutions

• Analysis: Climate Data Analysis Tools

• Data serving: ESG

Page 7: Towards Automated Model Output Analysis

Climate Data Analysis Tools Design/Philosophy

• Leveraged from community’s work, for the community
• Cutting-Edge Technology
  – Tomorrow’s Analysis Tools Today.
• Quality driven but balanced by functional requirements.
• Python based
  – Flexible
  – Efficient
• Open Source, Open Community.
  – Knowledge sharing
  – Time saving
  – Leveraging from others’ work

Page 8: Towards Automated Model Output Analysis

Climate Data Analysis Tools (CDAT)

• Python based system
• Added packages by community
• One environment
• Community Software

Page 9: Towards Automated Model Output Analysis

CDAT’s Modularity

[Diagram: CDAT component layers. Labels recovered from the figure:]

• Scripts / VCDAT
• Python
• XMGRACE, VCS, VTK
• Cdunif.so; NETCDF, PP, HDF4, GrADS, CDMS, DRS, XML
• _vcs.so; Canvas
• Graphics method: BOXFILL, MESHFILL, ISOLINE, ISOFILL, OUTFILL, OUTLINE, YXvsX, XYvsY, XvsY, TAYLOR, SCATTER, VECTOR
• C or Fortran; Community Contributed Packages; f2py, Pyfort
• CDMS; Dataset; Variable, Axis, Grid
• Numeric; MV, MA

Page 10: Towards Automated Model Output Analysis

CDAT Examples

Page 11: Towards Automated Model Output Analysis

Earth System Grid

• Collaboration and data sharing
• Location-independent equal-access to shared resources
  – (data, visualization, supercomputers, experiments, whiteboard, etc.)
• Evolution from centralized data sharing to distributed data-sharing.
• Allow for geographically distributed teams.
• Allow researchers to focus on science not data set manipulation.

Page 12: Towards Automated Model Output Analysis

[Diagram: ESG architecture. Labels recovered from the figure: Federated Metadata Catalog; User Interface Server; ESG Virtual Server; Federated ESG Sites; ESG Product Request Protocol; Publish to Federated Catalog; HTTP; Site 1 to Site 4, each shown as an ESG Node with Catalog and Data.]

Page 13: Towards Automated Model Output Analysis

External collaborations

Portal examples

• GeoSPLAT (British Atmospheric Data Centre)
• Live Access Server (Pacific Marine Environmental Laboratory)

Page 14: Towards Automated Model Output Analysis

AutoMOD: Automated Model Diagnostic Facility

• Freely available

• Automated Upload

• Automated Atlases

• Searchable Database for Model, Simulations, Diagnosis

• System can evolve to incorporate new/future standards

Page 15: Towards Automated Model Output Analysis

Existing Structure

The “manual” approach

Model Output Archive

Analysis scripts (python/CDAT, IDL, Ferret, Tecplot)

Analysis/MIP Guru(s)

User

Page 16: Towards Automated Model Output Analysis

AutoMOD Structure

The “Automated” approach

Model Output Archive

Web Server (powered with CDAT)

User

Simple HTTP

Database

FTP

Page 17: Towards Automated Model Output Analysis

AutoMOD MySQL Database

• Variable information
• Modeling Groups Information
  – Various versions information
• Simulations Information (associated with a model version)
• Working Groups Information
• Unix-like Read/Write Authorization, per Group/User, Model version or simulation level
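
As a rough illustration of the database slide, the sketch below stores model versions, simulations and Unix-like read/write flags, then answers "may this user read or write this simulation?" and "what can this user read?". Everything in it is an assumption made for the example: the table and column names, the helper functions, and the use of Python's built-in sqlite3 in place of MySQL so that it runs without a server. It only covers per-user, simulation-level permissions, not the group or model-version levels listed above.

```python
import sqlite3

# Hypothetical schema: names are illustrative only, and sqlite3 stands in
# for MySQL so the sketch runs without a database server.
SCHEMA = """
CREATE TABLE models (id INTEGER PRIMARY KEY, grp TEXT, version TEXT);
CREATE TABLE simulations (
    id INTEGER PRIMARY KEY,
    model_id INTEGER REFERENCES models(id),
    name TEXT
);
CREATE TABLE permissions (
    username TEXT,
    simulation_id INTEGER REFERENCES simulations(id),
    mode TEXT  -- Unix-like flags: 'r', 'w' or 'rw'
);
"""


def build_demo_db():
    """Create an in-memory database with one model version and two simulations."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute("INSERT INTO models VALUES (1, 'GroupA', 'v1')")
    db.executemany("INSERT INTO simulations VALUES (?, ?, ?)",
                   [(1, 1, "control"), (2, 1, "historical")])
    db.executemany("INSERT INTO permissions VALUES (?, ?, ?)",
                   [("alice", 1, "rw"), ("alice", 2, "r"), ("bob", 1, "r")])
    db.commit()
    return db


def can_access(db, username, simulation_id, flag):
    """Return True if `username` holds `flag` ('r' or 'w') on the simulation."""
    row = db.execute(
        "SELECT mode FROM permissions WHERE username = ? AND simulation_id = ?",
        (username, simulation_id)).fetchone()
    return row is not None and flag in row[0]


def readable_simulations(db, username):
    """List (group, version, simulation) rows that `username` may read."""
    return db.execute(
        """SELECT m.grp, m.version, s.name
           FROM permissions p
           JOIN simulations s ON s.id = p.simulation_id
           JOIN models m ON m.id = s.model_id
           WHERE p.username = ? AND p.mode LIKE '%r%'
           ORDER BY s.name""",
        (username,)).fetchall()
```

A web front end could call `can_access` before serving a file and `readable_simulations` to build the list of simulations shown to a logged-in user.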

Page 18: Towards Automated Model Output Analysis

AutoMOD Web Interface

• Apache Server, with built-in CDAT via mod_python module
• MySQL interface
  – Permissions
  – Archive Content
  – AutoMOD Project Info (users, groups, etc.)
• Online CDAT Diagnosis
  – Allows restricting the resources available to the user (server swamping issues)
  – Use pre-loaded CDAT/Python (faster)
  – System 100% free and Open-Source
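
The slide does not say how resources are restricted. One simple possibility, sketched here purely as an assumption, is to estimate how many array values an online diagnosis would load and to refuse the request before any data is read. The limit and the function names are hypothetical and not part of AutoMOD.

```python
# Hypothetical guard: the limit and function names are assumptions made
# for illustration, not part of AutoMOD.
MAX_ELEMENTS = 5_000_000  # assumed per-request cap on array values


def requested_elements(shape):
    """Number of values a request for an array of this shape would load."""
    count = 1
    for n in shape:
        if n < 0:
            raise ValueError("dimension lengths must be non-negative")
        count *= n
    return count


def check_request(shape, limit=MAX_ELEMENTS):
    """Return (ok, message); meant to run before any data is read."""
    n = requested_elements(shape)
    if n > limit:
        return False, f"request of {n} values exceeds the limit of {limit}"
    return True, "ok"
```

For instance, one year of monthly fields on a 180 x 360 grid, `check_request((12, 180, 360))`, stays under the assumed cap, while a hundred years of the same data would be rejected.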

Page 19: Towards Automated Model Output Analysis

AutoMOD Web Interface

Page 20: Towards Automated Model Output Analysis

AutoMOD Web Interface

Page 21: Towards Automated Model Output Analysis

AutoMOD Web Interface

Page 22: Towards Automated Model Output Analysis

AutoMOD Web Interface

Page 23: Towards Automated Model Output Analysis

AutoMOD Users Requirements/Restrictions

• Data/Metadata must adhere to strict standards
  – NetCDF format, CF compliant
  – http://www.cgd.ucar.edu/cms/eaton/cf-metadata
• Data must be pre-processed by user via “output” subroutine, provided to them.

• No “user-defined” Diagnosis
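
To give a feel for what checking uploaded metadata against the CF conventions might involve, the sketch below inspects attributes that have already been read into plain dictionaries. It is not a CF compliance checker and is not the “output” subroutine mentioned above. The function name and the three rules it applies are assumptions chosen for illustration, and it treats every variable as a dimensional quantity that needs `units`.

```python
def check_metadata(global_attrs, variables):
    """Return a list of problems found; an empty list means none were found.

    `global_attrs` maps global attribute names to values, and `variables`
    maps each variable name to a dict of its attributes. Both are assumed
    to have been read from the file beforehand.
    """
    problems = []
    conventions = str(global_attrs.get("Conventions", ""))
    if not conventions.startswith("CF-"):
        problems.append("global attribute 'Conventions' does not start with 'CF-'")
    for name, attrs in sorted(variables.items()):
        if "units" not in attrs:
            problems.append(f"variable '{name}' has no 'units' attribute")
        if "standard_name" not in attrs and "long_name" not in attrs:
            problems.append(
                f"variable '{name}' has neither 'standard_name' nor 'long_name'")
    return problems
```

An upload step could reject a file whenever the returned list is non-empty and show the messages to the user.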

Page 24: Towards Automated Model Output Analysis

AutoMOD Status and Upcoming Tasks

• Web administrative interface finished, only needs “look and feel” finish.

• OCMIP2 Data in the system

• First basic diagnosis in beta phase

• Extensive “Atlas” and “online” diagnosis will be added through the year.

• Possibility to move to ESG Data serving

Page 25: Towards Automated Model Output Analysis

Portability

• CF Compliant data should plug in without any changes.
• Potential changes could be:
  – Project specific assimilation script
  – Adding/Removing Variable information stored into the database
• Adding project specific diagnosis will obviously be needed

Page 26: Towards Automated Model Output Analysis

Portability

Possible Extensions

• Replace Data archive system with Earth System Grid.

• Not needed by AutoMOD at the moment, but would be for projects such as IPCC (huge datasets, more stringent security requirements)

Page 27: Towards Automated Model Output Analysis

Conclusions

• AutoMOD: Hassle-free environment
  – Leveraged from PCMDI’s experience:
    • Powerful CDAT as diagnosis backend
    • Possibility of use of ESG system
  – Easy upload of dataset
  – Centralized resources
  – Easy User Interface
  – Ever growing list of diagnosis
  – Easily Portable