A Data Quality Screening Service for Remote Sensing Data

29
A Data Quality Screening Service for Remote Sensing Data Christopher Lynnes, NASA/GSFC Edward Olsen, NASA/JPL Peter Fox, RPI Bruce Vollmer, NASA/GSFC Robert Wolfe, NASA/GSFC Shahin Samadi, NASA/GSFC Contributions from: N. Most, Y.-I. Won, T. Hearty, R. Strub, S. Ahmad, S. Zednik, M. Hegde and A. Rezaiyan-Nojani Funded by: NASA’s Advancing Collaborative Connections for Earth System Science (ACCESS) Program

description

A Data Quality Screening Service for Remote Sensing Data. Christopher Lynnes, NASA/GSFC Edward Olsen, NASA/JPL Peter Fox, RPI Bruce Vollmer, NASA/GSFC Robert Wolfe, NASA/GSFC Shahin Samadi , NASA/GSFC. - PowerPoint PPT Presentation

Transcript of A Data Quality Screening Service for Remote Sensing Data

Page 1: A Data Quality Screening Service for Remote Sensing Data

A Data Quality Screening Service for Remote Sensing Data

Christopher Lynnes, NASA/GSFCEdward Olsen, NASA/JPL

Peter Fox, RPIBruce Vollmer, NASA/GSFCRobert Wolfe, NASA/GSFC

Shahin Samadi, NASA/GSFC

Contributions from: N. Most, Y.-I. Won, T. Hearty, R. Strub, S. Ahmad, S. Zednik, M. Hegde and A. Rezaiyan-Nojani

Funded by: NASA’s Advancing Collaborative Connections for Earth System Science (ACCESS) Program

Page 2: A Data Quality Screening Service for Remote Sensing Data

Outline

Satellite Data Quality: Why So Difficult?

Making Quality Information Usable via the Data Quality Screening Service (DQSS)

How does DQSS Scale?

Future Directions for DQSS

Page 3: A Data Quality Screening Service for Remote Sensing Data

Satellite Data Quality: Why So Difficult?

Page 4: A Data Quality Screening Service for Remote Sensing Data

Earth Observing

System

Satellite sensors collect data for for climate science research

Page 5: A Data Quality Screening Service for Remote Sensing Data

EOS satellite data is archived and distributed by a network of data

centers

Page 6: A Data Quality Screening Service for Remote Sensing Data

Satellite-borne collection breeds a tendency to keep low-quality data

around

Satellites are very expensive to launch and operate.

It takes years to understand the data characteristics.

Data are used for many different purposes.AIRS – Atmospheric Infrared Sounder• atmospheric temperature• atmospheric moisture • trace gas composition

Page 7: A Data Quality Screening Service for Remote Sensing Data

Sometimes there are a lot of low-quality pixels.

AIRS Parameter Best (%)

Good (%)

Do Not Use (%)

Total Precipitable Water

38 38 24

Carbon Monoxide 64 7 29Surface Temperature

5 44 51

Page 8: A Data Quality Screening Service for Remote Sensing Data

Quality schemes can be relatively simple…

Total Column Precipitable Water

Quality Level

Best Good Do Not Usekg/m2

Page 9: A Data Quality Screening Service for Remote Sensing Data

…or they can be fairly complicated

Hurricane Ike, viewed by the Atmospheric Infrared Sounder (AIRS)

PBest : Maximum pressure of “Best” quality values in

temperature profiles

Air Temperature at 300 mbar

Page 10: A Data Quality Screening Service for Remote Sensing Data

Quality flags are also sometimes packed together

into bytesCloud Mask Status Flag0=Undetermined1=Determined

Cloud Mask Cloudiness Flag0=Confident cloudy1=Probably cloudy2=Probably clear3=Confident clear

Day / Night Flag0=Night1=Day

Sunglint Flag0=Yes1=No

Snow/ Ice Flag0=Yes1=No

Surface Type Flag0=Ocean, deep lake/river1=Coast, shallow lake/river2=Desert3=Land

Bitfield arrangement for the Cloud_Mask_SDS variable in atmospheric products from Moderate Resolution Imaging

Spectroradiometer (MODIS)

Page 11: A Data Quality Screening Service for Remote Sensing Data

Current user scenarios...

Nominal scenario Search for and download data Locate documentation on handling quality Read & understand documentation on quality Write custom routine to filter out bad pixels

Equally likely scenario (especially in certain user communities) Search for and download data Assume that quality has a negligible effect

Repeat for

each user

Page 12: A Data Quality Screening Service for Remote Sensing Data

Using bad quality data is in general not

negligibleTotal Column Precipitable

WaterQuality

Best Good Do Not Usekg/m2

Page 13: A Data Quality Screening Service for Remote Sensing Data

Heavy cloud impairs retrievals

AIRS True Color, 2008-09-10

Page 14: A Data Quality Screening Service for Remote Sensing Data

Neglecting quality may introduce bias (a more subtle

effect)AIRS Relative Humidity Comparison against Dropsonde with and without Applying PBest Quality Flag Filtering

Page 15: A Data Quality Screening Service for Remote Sensing Data

Making Quality Information Usable via the

Data Quality Screening Service (DQSS)

.

Page 16: A Data Quality Screening Service for Remote Sensing Data

The DQSS filters out bad pixels for the user

Default user scenario Search for data Select science team recommendation for quality

screening (filtering) Download screened data

More advanced scenario Search for data Select custom quality screening parameters Download screened data

Page 17: A Data Quality Screening Service for Remote Sensing Data

DQSS segregates bad-quality pixels into a

separate arrayOriginal data array(Total column precipitable water)

Mask based on user criteria(Quality level

< 2)

Good quality data pixels

retained

Rejected data points

Output file has the same format and structure as the input file (except for the extra rejected-data fields)

Page 18: A Data Quality Screening Service for Remote Sensing Data

Visualizations help users see the effect of different quality filters

Best quality only

Best + Good quality

Page 19: A Data Quality Screening Service for Remote Sensing Data

DQSS encodes the science team recommendations on quality

screening

Straightforward: “Use Best-only for data assimilation uses; Use Best+Good for climatic studies”

More complicated: “Use only VeryGood over land; Use Marginal+Good+VeryGood over ocean”

Page 20: A Data Quality Screening Service for Remote Sensing Data

How Does DQSS Scale?

Page 21: A Data Quality Screening Service for Remote Sensing Data

Ontology-driven design helps scale with diversity of quality

schemes

Variable Types & Properties

Quality View

Page 22: A Data Quality Screening Service for Remote Sensing Data

The Quality View links data variables with quality

variables.

Page 23: A Data Quality Screening Service for Remote Sensing Data

The code queries the ontology to determine the type of quality variable needed and

selects the corresponding module to generate the quality mask.

Page 24: A Data Quality Screening Service for Remote Sensing Data

Execution at distributed data centers helps to scale with

demand

The jury is out as to whether Java performance will scale to meet demand.

Screening takes place at the data centers where the data reside to avoid excessive

data transfers

Page 25: A Data Quality Screening Service for Remote Sensing Data

Future directions

Page 26: A Data Quality Screening Service for Remote Sensing Data

Operationalize DQSS

Promotion to operations for AIRS data is scheduled for Aug 2010 DQSS will be offered through the main data

search interface at the data center

Usage Metrics will be collected: Basic usage What criteria are used

Screening invocation is a simple URL GET

What the “ yield” was for the screening

Page 27: A Data Quality Screening Service for Remote Sensing Data

Extend DQSS to Other Datasets

Moderate Resolution Imaging Spectroradiometer (MODIS)

Microwave Limb Sounder (MLS)

Ozone Monitoring Instrument (OMI)

Cloudsat

Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO)

Page 28: A Data Quality Screening Service for Remote Sensing Data

Collaborative Screening Use Case?

Define custom criteria

Tag custom criteria

Browse tagged criteria

Modify tagged criteria

Dr. Alice

Dr. Bob

Page 29: A Data Quality Screening Service for Remote Sensing Data

DQSS Recap

Screening satellite data currently is troublesome and time consuming for users

The Data Quality Screening System will provide an easy-to-use service

The result should be: More attention to quality on users’ part More accurate handling of quality information… …With less user effort