ALL SLIDES AND MATERIALS WILL BE AVAILABLE ONLINE: www.pik-potsdam.de/~menz/IKI-Oasis/capacity_building

Transcript

  • ALL SLIDES AND MATERIALS WILL BE AVAILABLE ONLINE:

    http://www.pik-potsdam.de/~menz/IKI-Oasis/capacity_building

  • TOOLS FOR CLIMATE DATA

  • OVERVIEW
    Lecture: Introduction to netCDF and Tools for netCDF Data Processing
    Hands-On: Pre-Processing and Visualization of netCDF Data

    TOPICS
    Introduction to netCDF
      Specification of netCDF
      Metadata standard CF and DRS
    Tools for netCDF Data Processing
      View netCDF file content with ncdump
      Pre-process netCDF data with cdo
      Visualize netCDF file with panoply

  • LECTURE: INTRODUCTION TO NETCDF AND TOOLS FOR NETCDF DATA PROCESSING

  • INTRODUCTION TO NETCDF

  • INTRODUCTION TO NETCDF
    Network Common Data Form (netCDF) is a self-describing and machine-independent data format for array-oriented data
    Introduced in 1989 and still the standard used to store gridded climate data (observations and models)
    Core libraries and interfaces are available for C, Fortran, Java, Python, R, MATLAB, etc.
    Core features:
      Portable (works on various platforms: Linux, Windows, Mac, clusters)
      Self-describing (meta information included)
      Large data (file sizes of several GB)
    Various tools to read, write, modify and visualize netCDF files (ncdump, ncview, panoply, nco, cdo, etc.)
    Extension: *.nc or *.nc4
    Supports compression of the data

  • STRUCTURE OF A NETCDF FILE
    A netCDF file comprises two parts:
      Header part: contains all information about the underlying data (dimensions, variables, attributes)
      Data part: contains the actual data described in the header

    HEADER
      Dimensions: definition of name and size of dimensions
      Variables: definition of type, name and attributes of variables
      Global attributes: definition of attributes concerning the whole dataset

    DATA
      Fixed: fields of fixed size
      Record: fields of potentially unlimited size (e.g. time)
      Compact, non-human-readable format

  • STRUCTURE OF A NETCDF FILE
    You can create dimensions and variables without additional information
    Mandatory:
      Dimensions with name and size
      Variables with name, shape and some data
    Optional fields are variable attributes and global attributes
      Additional human-interpretable information about the data (e.g. detailed name of a variable, units, calendar, experiment, institute, author, etc.)
    Names and sizes/shapes of dimensions and variables are arbitrary (since you can explain them in the variable/global attributes)
      This can cause confusion when handling/interpreting the data!
      Common agreement: use the same names and attributes as defined in CF/CMIP5/CORDEX

    Example

    name   standard_name          unit
    tas    air_temperature        [K]
    pr     precipitation_flux     [kg m^-2 s^-1]
    ps     surface_air_pressure   [Pa]
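The split into mandatory parts (dimensions, variables) and optional parts (variable and global attributes) can be sketched in plain Python. This is only an illustration of the header layout, not a netCDF library call: the dictionary layout, the function name render_header and the attribute values are assumptions made for this example.

```python
# Plain-Python model of a netCDF header (no netCDF library involved).
# The dict layout and render_header() are illustrative assumptions.
header = {
    # None marks the unlimited (record) dimension
    "dimensions": {"time": None, "lat": 96, "lon": 192},
    "variables": {
        "time": {"type": "double", "shape": ("time",),
                 "attributes": {"units": "days since 1850-1-1 00:00:00",
                                "calendar": "proleptic_gregorian"}},
        "tas": {"type": "float", "shape": ("time", "lat", "lon"),
                "attributes": {"standard_name": "air_temperature",
                               "units": "K"}},
    },
    # Optional: attributes concerning the whole dataset
    "global_attributes": {"institute": "hypothetical institute"},
}


def render_header(name, header):
    """Return a CDL-like listing, similar in spirit to `ncdump -h`."""
    lines = [f"netcdf {name} {{", "dimensions:"]
    for dim, size in header["dimensions"].items():
        lines.append(f"\t{dim} = {'UNLIMITED' if size is None else size} ;")
    lines.append("variables:")
    for var, spec in header["variables"].items():
        lines.append(f"\t{spec['type']} {var}({', '.join(spec['shape'])}) ;")
        for key, value in spec["attributes"].items():
            lines.append(f'\t\t{var}:{key} = "{value}" ;')
    lines.append("// global attributes:")
    for key, value in header["global_attributes"].items():
        lines.append(f'\t\t:{key} = "{value}" ;')
    lines.append("}")
    return "\n".join(lines)


print(render_header("example", header))
```

Removing the "attributes" entries would still leave a complete description of the data layout, which is why they are optional, but the result would be much harder to interpret.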

  • DATA REFERENCE SYNTAX - DRS
    Reference for filenames, directories and metadata of global and regional climate model data
    Builds on the CF standard names (metadata conventions for netCDF) plus a controlled vocabulary
    Good practice: use as much of the DRS as possible for every dataset
    Examples:

  • DATA REFERENCE SYNTAX - DRS
    Global Climate Model:

    Variable: Variable name (tas, tasmin, tasmax, pr, ps, ...)
    Frequency: Time frequency used (day, mon, year, ...)
    GCM: Name of global model (MPI-ESM-LR, HadGEM2-ES, ...)
    Experiment: E.g. greenhouse gas emission scenario (historical, rcp26, rcp45, rcp85)
    Realization: GCM realization (r1i1p1, r2i1p1, ...) - different runs of the same GCM
    Timeframe: Timeframe covered by file (yyyymmdd-yyyymmdd)
    File extension: .nc or .nc4 for most netCDF files
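As a rough illustration of how these components map onto a filename, the following Python sketch splits the GCM filename used in the ncdump example of this lecture. It assumes that the components are joined by underscores in the order listed above. The class DrsName and the function parse_drs_filename are hypothetical names introduced here. In that file the second field is "Amon" (the CMIP5 table of monthly atmospheric data), which is treated as the frequency field.

```python
from typing import NamedTuple


class DrsName(NamedTuple):
    """DRS components of a GCM filename (field names are illustrative)."""
    variable: str
    frequency: str
    gcm: str
    experiment: str
    realization: str
    timeframe: str
    extension: str


def parse_drs_filename(filename: str) -> DrsName:
    """Split an underscore-separated GCM filename into its DRS components."""
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        raise ValueError(f"no file extension in {filename!r}")
    parts = stem.split("_")
    if len(parts) != 6:
        raise ValueError(
            f"expected 6 underscore-separated fields, got {len(parts)}")
    return DrsName(*parts, extension)


name = parse_drs_filename(
    "tas_Amon_MPI-ESM-LR_historical_r1i1p1_185001-200512.nc")
print(name.variable, name.gcm, name.experiment, name.timeframe)
```

Hyphens inside a component (MPI-ESM-LR, 185001-200512) survive the split because only underscores separate fields.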

  • TOOLS FOR NETCDF DATA PROCESSING

    1. Check netCDF Data: ncdump
    2. Manipulation and Pre-Processing of netCDF Data: cdo
    3. Visualization of netCDF Data: panoply

  • CHECK NETCDF DATA: NCDUMP
    One of the most used netCDF tools
    Generates a human-readable representation of a netCDF file
    ncdump is part of the netCDF library package

    SYNTAX
    Header only: ncdump -h filename

    Single variable: ncdump -v variablename filename

    [menz@login01: tas]> ncdump -t tas_Amon_MPI-ESM-LR_historical_r1i1p1_185001-200512.nc

    netcdf tas_Amon_MPI-ESM-LR_historical_r1i1p1_185001-200512 {

    dimensions:

    time = UNLIMITED ; // (1872 currently)

    lat = 96 ;

    lon = 192 ;

    variables:

    double time(time) ;

    time:units = "days since 1850-1-1 00:00:00" ;

    time:calendar = "proleptic_gregorian" ;

    time:axis = "T" ;

    time:long_name = "time" ;

    time:standard_name = "time" ;

    double lat(lat) ;

    lat:units = "degrees_north" ;
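The time variable in the header above stores plain numbers. The units and calendar attributes say how to turn them into dates. The following Python sketch shows the idea for the "days since ..." case only. The function name decode_time is an assumption of this example, and the standard datetime module is used because it follows the proleptic Gregorian calendar named in the header.

```python
from datetime import datetime, timedelta


def decode_time(value, units):
    """Convert a numeric time value to a datetime.

    Only handles units of the form 'days since Y-M-D h:m:s'.
    """
    unit, since, reference = units.split(" ", 2)
    if unit != "days" or since != "since":
        raise ValueError(f"unsupported units: {units!r}")
    date_part, time_part = reference.split(" ")
    year, month, day = (int(x) for x in date_part.split("-"))
    hour, minute, second = (int(x) for x in time_part.split(":"))
    start = datetime(year, month, day, hour, minute, second)
    return start + timedelta(days=value)


units = "days since 1850-1-1 00:00:00"
print(decode_time(0, units))      # 1850-01-01 00:00:00
print(decode_time(15.5, units))   # 1850-01-16 12:00:00

# Cross-check of the record dimension: monthly data for 185001-200512
print((2005 - 1850 + 1) * 12)     # 1872, the current size of "time"
```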

  • NETCDF TOOLS: NCDUMP

    HEADER
    Contains ONLY the description of the netCDF file
      Dimensions
      Variables
      Global Attributes

    DATA
    Contains the actual data values
      Fixed (time-independent)
      Record (time-dependent)

  • MANIPULATION AND PRE-PROCESSING OF NETCDF DATA: CDO

    Climate Data Operators (CDO) is a collection of operators to analyze and manipulate climate data
    Developed and maintained by the Max Planck Institute for Meteorology, Hamburg
    Works completely on the console
    https://code.mpimet.mpg.de/projects/cdo/

    [menz@login02: ~]> cdo --help

    usage : cdo [Options] Operator1 [-Operator2 [-OperatorN]]

    Options:

    -a Generate an absolute time axis

    -b Set the number of bits for the output precision

    (I8/I16/I32/F32/F64 for nc/nc2/nc4/nc4c; F32/F64 for grb2/srv/ext/ieg; P1 - P24 for g

    Add L or B to set the byteorder to Little or Big endian

    -f, --format

    Format of the output file. (grb/grb2/nc/nc2/nc4/nc4c/srv/ext/ieg)

    -g Set default grid name or file. Available grids:

    n, t, tl, global_, rx, gx, gme, lon=/lat=<

    -h, --help Help information for the operators

    --history Do not append to NetCDF "history" global attribute

    --netcdf_hdr_pad, --hdr_pad, --header_pad


  • Basic syntax: cdo <operator> input-file [output-file]

    Information about an operator: cdo -h <operator>

    User guide: https://code.mpimet.mpg.de/projects/cdo/embedded/cdo.pdf

    Reference card: https://code.mpimet.mpg.de/projects/cdo/embedded/cdo_refcard.pdf

  • OPERATORS OF CDO - FILE INFORMATION
    Short summary of the whole dataset - sinfo:

    cdo sinfo input-file

    [menz@login02: observation]> cdo sinfo tas_day_EWEMBI_19790101-20131231.nc

    File format : NetCDF4 classic

    -1 : Institut Source Steptype Levels Num Points Num Dtype : Parameter ID

    1 : unknown EartH2Observe instant 1 1 259200 1 F32 : -1

    Grid coordinates :

    1 : lonlat : points=259200 (720x360)

    lon : -179.75 to 179.75 by 0.5 degrees_east circular

    lat : -89.75 to 89.75 by 0.5 degrees_north

    Vertical coordinates :

    1 : surface : levels=1

    Time coordinate : 12784 steps

    RefTime = 1979-01-01 00:00:00 Units = days Calendar = standard

    YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss

    1979-01-01 00:00:00 1979-01-02 00:00:00 1979-01-03 00:00:00 1979-01-04 00:00:00

    1979-01-05 00:00:00 1979-01-06 00:00:00 1979-01-07 00:00:00 1979-01-08 00:00:00

    1979-01-09 00:00:00 1979-01-10 00:00:00 1979-01-11 00:00:00 1979-01-12 00:00:00

    1979-01-13 00:00:00 1979-01-14 00:00:00 1979-01-15 00:00:00 1979-01-16 00:00:00

    1979-01-17 00:00:00 1979-01-18 00:00:00 1979-01-19 00:00:00 1979-01-20 00:00:00

    1979-01-21 00:00:00 1979-01-22 00:00:00 1979-01-23 00:00:00 1979-01-24 00:00:00

    1979-01-25 00:00:00 1979-01-26 00:00:00 1979-01-27 00:00:00 1979-01-28 00:00:00

    1979-01-29 00:00:00 1979-01-30 00:00:00 1979-01-31 00:00:00 1979-02-01 00:00:00

    1979-02-02 00:00:00 1979-02-03 00:00:00 1979-02-04 00:00:00 1979-02-05 00:00:00

    1979-02-06 00:00:00 1979-02-07 00:00:00 1979-02-08 00:00:00 1979-02-09 00:00:00

    1979-02-10 00:00:00 1979-02-11 00:00:00 1979-02-12 00:00:00 1979-02-13 00:00:00

    1979-02-14 00:00:00 1979-02-15 00:00:00 1979-02-16 00:00:00 1979-02-17 00:00:00

    1979-02-18 00:00:00 1979-02-19 00:00:00 1979-02-20 00:00:00 1979-02-21 00:00:00

    1979-02-22 00:00:00 1979-02-23 00:00:00 1979-02-24 00:00:00 1979-02-25 00:00:00
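The numbers reported by sinfo can be checked against the filename. The sketch below assumes daily data on a standard calendar (as stated in the "Time coordinate" section) and derives the expected number of time steps from the timeframe in the filename. The function name expected_daily_steps is introduced here for illustration only.

```python
from datetime import datetime


def expected_daily_steps(timeframe):
    """Number of daily steps in a yyyymmdd-yyyymmdd timeframe."""
    start, end = (datetime.strptime(t, "%Y%m%d")
                  for t in timeframe.split("-"))
    return (end - start).days + 1  # both boundary days are included


n_steps = expected_daily_steps("19790101-20131231")
n_points = 720 * 360  # lon x lat from the "Grid coordinates" section
print(n_steps, n_points)  # 12784 259200
```

Both values agree with the sinfo output (12784 steps, points=259200), so the file covers the full period without missing days.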

  • OPERATORS OF CDO - SELECTION
    Select variable: selname, selcode, selstdname, ...
    Select time: seldate, selyear, selseas, selmon, seltimestep, ...
    Select domain: sellonlatbox, selindexbox, selgridcell
    Mask domain: maskregion, masklonlatbox, maskindexbox
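Selection operators can be chained in one cdo call, following the usage line "cdo [Options] Operator1 [-Operator2 [-OperatorN]]". The Python sketch below only composes such a command line and does not run cdo. The helper cdo_command, the box coordinates, the year range and the output filename are hypothetical, and it is assumed that the variable in the file is named tas.

```python
import shlex


def cdo_command(operators, infile, outfile, options=()):
    """Compose a chained cdo call as an argument list.

    cdo evaluates the chain from right to left, so the last operator
    in the list is applied to the input file first.
    """
    chain = [f"-{op}" for op in operators]
    return ["cdo", *options, *chain, infile, outfile]


cmd = cdo_command(
    ["sellonlatbox,5,15,47,55",   # lon1,lon2,lat1,lat2 (example box)
     "selyear,1981/2010",         # range of years
     "selname,tas"],              # variable name (assumed to be tas)
    "tas_day_EWEMBI_19790101-20131231.nc",
    "tas_subset.nc",              # hypothetical output name
)
print(shlex.join(cmd))
```

The argument list can be passed to subprocess.run on a machine where cdo is installed. Building it as a list avoids shell quoting problems with filenames.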

  • OPERATORS OF CDO - GRID MANIPULATION
    Get description of the grid - griddes:

    cdo griddes input-file

    Remapping of a dataset: cdo remap; variants: remapbil, remapbic, remapnn, remapdis, remapcon, remapcon2, ...

    [menz@login02: observation]> cdo griddes tas_day_EWEMBI_19790101-20131231.nc

    #

    # gridID 1

    #

    gridtype = lonlat

    gridsize = 259200

    xname = lon

    xlongname = longitude

    xunits = degrees_east

    yname = lat

    ylongname = latitude

    yunits = degrees_north

    xsize = 720

    ysize = 360

    xfirst = -179.75

    xinc = 0.5

    yfirst = -89.75

    yinc = 0.5

    cdo griddes: Processed 1 variable ( 0.01s )
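The griddes output describes a regular lon/lat grid completely through a first value, an increment and a size per axis. As a small sketch (the function names parse_griddes and axis are assumptions of this example), the coordinate axes can be rebuilt from these "key = value" lines in Python:

```python
# Subset of the griddes output shown above
GRIDDES = """
gridtype = lonlat
gridsize = 259200
xsize = 720
ysize = 360
xfirst = -179.75
xinc = 0.5
yfirst = -89.75
yinc = 0.5
"""


def parse_griddes(text):
    """Read 'key = value' lines into a dict, skipping comments."""
    grid = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        grid[key.strip()] = value.strip()
    return grid


def axis(first, inc, size):
    """Coordinates of a regular axis: first, first + inc, ..."""
    return [first + i * inc for i in range(size)]


grid = parse_griddes(GRIDDES)
lon = axis(float(grid["xfirst"]), float(grid["xinc"]), int(grid["xsize"]))
lat = axis(float(grid["yfirst"]), float(grid["yinc"]), int(grid["ysize"]))
print(lon[0], lon[-1], lat[0], lat[-1], len(lon) * len(lat))
```

The resulting ranges (-179.75 to 179.75 and -89.75 to 89.75) and the product 720 x 360 = 259200 match the grid section of the sinfo output.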

  • OPERATORS OF CDO - ARITHMETIC AND STATISTICS
    Arithmetic: add, sub, mul, div, min, max, ...
    Remove/add monthly or daily seasonal cycle: ymonsub, ymonadd, ydaysub, ydayadd, ...
    Multiply/divide with days per month: muldpm, divdpm
    Temporal, spatial and ensemble statistics:
      Temporal: tim, year, mon, seas, ...
      Spatial: fld, zon, mer
      Ensemble: ens
      With statistics: min, mean, max, sum, std, ...
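The statistics operators follow a naming scheme: a prefix for the domain (time, space or ensemble) combined with the name of the statistic. The Python sketch below encodes only the prefixes and statistics listed on this slide. The function stat_operator is a hypothetical helper, and the reference card remains the authoritative list of available operators.

```python
# Prefixes and statistics as listed on the slide (not the complete cdo list)
DOMAINS = {
    "tim": "temporal", "year": "temporal",
    "mon": "temporal", "seas": "temporal",
    "fld": "spatial", "zon": "spatial", "mer": "spatial",
    "ens": "ensemble",
}
STATISTICS = ("min", "mean", "max", "sum", "std")


def stat_operator(domain, statistic):
    """Build an operator name such as 'timmean' from prefix and statistic."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain prefix: {domain!r}")
    if statistic not in STATISTICS:
        raise ValueError(f"unknown statistic: {statistic!r}")
    return domain + statistic


print(stat_operator("tim", "mean"))   # timmean: mean over all time steps
print(stat_operator("year", "max"))   # yearmax: maximum per year
print(stat_operator("fld", "mean"))   # fldmean: mean over the grid
print(stat_operator("ens", "std"))    # ensstd: spread across an ensemble
```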

  • OPERATORS OF CDO - MORE FEATURES
    Change values of grid, time or data variables and some attributes: setunits, chname, chunit, shifttime, ...
    EOF analysis
    Linear regression in time
    Conversion into different formats (GRIB, ASCII, ...)
    Calculation of climate indices (e.g. consecutive dry days, frost days, summer days, ...)
    Basic visualization of data
    OpenMP support to run certain operations faster:

    cdo -P ...

  • VISUALIZATION OF NETCDF DATA: PANOPLY

    Visualization of netCDF, HDF or GRIB data
    Free tool from NASA that runs on Java (Windows, Linux, macOS)
    Various settings to change the appearance of the plot
    Save plots in various formats
    Tutorials and documentation:

    https://www.giss.nasa.gov/tools/panoply

    https://www.giss.nasa.gov/tools/panoply/help/

  • HANDS-ON: PRE-PROCESSING AND VISUALIZATION OF NETCDF DATA