NSF Medium ITR

NSF Medium ITR Real-Time Mining of Integrated Weather Information Setup meeting (Aug. 30, 2002) [email protected]


Page 1: NSF Medium ITR

NSF Medium ITR

Real-Time Mining of Integrated Weather Information

Setup meeting (Aug. 30, 2002)

[email protected]

Page 2: NSF Medium ITR

Goals

Develop dynamic data mining applications (wherein information is extracted and provided to forecasters in real-time).

Develop applications of radar data to identify severe weather signatures in a probabilistic manner.

Build a prototype system so that these applications can be developed and tested on real-time and on archived data sets.

Page 3: NSF Medium ITR

Tasks

Projects

Dual-polarization algorithms

Clustering and Prediction

Vortex Identification

Areas of IT research

SVMs (identification & prediction), multivariate feature identification techniques, probabilistic feature extraction, high-performance issues

I will talk about the tasks; we can decide the applicable areas as a group.

Page 4: NSF Medium ITR

Funding

Funded at $300K for the first year.

May get $650K over the next two years.

We need to show results at the end of the year, so it is good to know what the reviewers liked and did not like about our proposal.

Page 5: NSF Medium ITR

Negative Reviews (NSF)

Unfocused

No high-performance computing or numerical simulations

Real-time not explicitly defined

Budget way too high

No human-factors expertise

No details of how the approach could solve the problem.

Page 6: NSF Medium ITR

Reviewers liked these

Develop sensor compensation techniques for faulty sensors

Strong application focus on a complex domain

Experience with disseminating systems and WSR-88D algorithms

We seem to have been funded based on what we have done before, rather than on the merits of this particular proposal.

Page 7: NSF Medium ITR

From the 6th reviewer

Extend their previous working system (WDSS) with the following features:

integrating multiple sources of data

learning in real-time, thus improving the prediction capabilities

using statistics-based instead of heuristics-based decisions.

Use of these methodologies for teaching purposes, as well as the dissemination of this software to other research laboratories and the creation of a common research tool

Page 8: NSF Medium ITR

Also from the 6th reviewer

Could have been improved:

The proposal seems to be an enumeration of different techniques, without any justification of why these methods have been chosen instead of other ones.

Detailed explanations are sometimes missing.

My recommendation is to fund this proposal, but at a lower level than the one proposed by the investigators.

Page 9: NSF Medium ITR

Tasks

Three tasks:

Vortex Detection

Clustering and prediction

Polarimetric Radar

Page 10: NSF Medium ITR

Real-Time

Classical: data periodicity (keep up with data).

Hard to define for multi-sensor applications

If you have a 3-radar domain, with a new elevation scan from each radar every 30 seconds, you get an updated virtual volume on average every 10 seconds. Is the periodicity 10 seconds?

Lightning strikes are essentially asynchronous.

Proposed: based on required lead-time

Example: the average lead-time for a tornado warning is 11 minutes. We could set as a goal predicting tornadoes 20 minutes into the future. If we can do that with data collected 30 minutes before the event, then we have 10 minutes to process the data.

Keep in mind that the forecasts have to be continuous. We have to make runs once every 10 minutes.
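The arithmetic behind both definitions of real-time can be written out as a minimal sketch. The function names are hypothetical and only restate the numbers on this slide; they are not part of WDSS-II.

```python
def virtual_volume_period_s(n_radars, scan_interval_s):
    """Average time between virtual-volume updates when each of n_radars
    delivers a new elevation scan every scan_interval_s seconds."""
    return scan_interval_s / n_radars


def processing_budget_min(data_age_at_event_min, target_lead_min):
    """Minutes left for processing when the input data is
    data_age_at_event_min old at the time of the event and the
    forecast must be out target_lead_min before the event."""
    return data_age_at_event_min - target_lead_min


# Numbers from the slide: 3 radars, 30-second scans, 20-minute lead-time goal.
print(virtual_volume_period_s(3, 30))   # 10.0 seconds between updates
print(processing_budget_min(30, 20))    # 10 minutes to process the data
```

Under the lead-time definition, the processing budget also bounds how often a run has to finish, which is consistent with making a run once every 10 minutes.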

Page 11: NSF Medium ITR

Task 1: Vortex Detection

At the end of this year, aim to have a vortex identification and prediction technique that:

Uses data from multiple sensors

Uses some novel data (more on this follows)

Accommodates faulty information

Is capable of better skill than MDA/TDA

Is capable of providing more lead-time to a forecaster.

Decision Support System: provide forecaster with rationale for all suggested decisions.

Page 12: NSF Medium ITR

Current MDA/TDA

Mesocyclone detection technique

Find 2D detections by analyzing azimuthal shear

Associate them based on rank and time into 3D circulation features if they meet some strength thresholds

3D circulations that meet depth, base and strength criteria are classified as mesocyclones.

Page 13: NSF Medium ITR

Problems with current vortex algorithms

Defined on radial velocity field.

Single radar

Simple use of radar reflectivity (>0 dBZ)

Mesocyclone spatial extent based on radial velocity values, which are noisy

How can we improve it?

Page 14: NSF Medium ITR

Use of LLSD

One promising source of data is a linear least-squares fit of radial velocity in the neighborhood of a gate.

The size of the neighborhood depends on the range from the radar.

Fit to a linear combination of azimuth and range

Coefficient for azimuth is an estimate of azimuthal shear

Coefficient for range is the divergence.
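The fit described above can be sketched as follows. This is a minimal illustration, not the WDSS-II implementation: it assumes the neighborhood gates are supplied by the caller (so the range-dependent neighborhood size is chosen outside the function), converts azimuth offsets to arc length at the center gate's range, fits a plane v ≈ v0 + a·y + b·x by ordinary least squares, and ignores azimuth wrap-around at 0/360 degrees. The name `llsd_fit` and its arguments are hypothetical.

```python
def llsd_fit(gates, r0, az0):
    """Least-squares plane fit of radial velocity around a center gate.

    gates: iterable of (range_m, azimuth_rad, radial_velocity_ms)
    r0, az0: range (m) and azimuth (rad) of the center gate
    Returns (azimuthal_coeff, range_coeff) in 1/s.
    """
    xs, ys, vs = [], [], []
    for r, az, v in gates:
        xs.append(r - r0)            # offset along the beam (m)
        ys.append(r0 * (az - az0))   # offset across the beam, as arc length (m)
        vs.append(v)
    n = len(vs)
    if n < 3:
        raise ValueError("need at least 3 gates to fit a plane")

    # Center the data so the constant term drops out of the normal equations.
    mx, my, mv = sum(xs) / n, sum(ys) / n, sum(vs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxv = sum((x - mx) * (v - mv) for x, v in zip(xs, vs))
    syv = sum((y - my) * (v - mv) for y, v in zip(ys, vs))

    det = sxx * syy - sxy * sxy
    if det <= 1e-9 * max(sxx * syy, 1e-30):
        raise ValueError("gates do not span both range and azimuth")

    range_coeff = (sxv * syy - sxy * syv) / det      # divergence estimate
    azimuthal_coeff = (sxx * syv - sxy * sxv) / det  # azimuthal shear estimate
    return azimuthal_coeff, range_coeff
```

Because the output is a scalar field rather than a radar-relative velocity, fields computed this way for each radar can then be merged on a common grid.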

Page 15: NSF Medium ITR

LLSD usage

Azimuthal shear field

Page 16: NSF Medium ITR

Boundaries

Tornadoes frequently happen at the boundaries between air masses

A boundary is not a necessary condition, however

Image shows dry-line boundary

Image processing for boundaries to detect gust-fronts would be useful.

Page 17: NSF Medium ITR

Input Sources

The LLSD has never been used in vortex detection. Unlike the raw radial velocity, it can be combined from multiple radars.

Also have satellite data over the spatial domain

Have national/regional lightning data.

The Near Storm Environment (RUC model)

Still need to assimilate LLSD and reflectivity data from multiple radars in a fault-tolerant manner. (Can now do fault-tolerant time-based merges.)

Page 18: NSF Medium ITR

Learning

Add a learning component

Incorporate warnings issued by forecaster into the learning by the algorithm.

Warnings can be faulty. Different forecasters have different skills. Therefore, this has to be achieved by the algorithm learning on the fly.

Validate the algorithm against storm reports. The verification data is noisy. Have to come up with robust ways of doing this verification.

Page 19: NSF Medium ITR

Data: status

The WDSS-II system already ingests radar data from multiple radars and national/regional lightning data.

Work is underway to ingest satellite data in real-time (archived cases can be done already).

We have archived warnings and RUC data since April of this year.

Currently testing process to compute LLSD at different scales.

RUC model data needs to be ingested.

Page 20: NSF Medium ITR

Discussion

What kinds of techniques are appropriate for vortex detection?

Multiple-sensor reflectivity, LLSD

RUC model data (in Lambert projection)

Multivariate analysis

Gust-front detection

Page 21: NSF Medium ITR

Task 2: Clustering and Prediction

Currently there are two ways to identify storms:

Heuristic threshold-based technique that operates on radial reflectivity field.

Texture segmentation method.

Once identified, the storms are predicted by:

Matching centroids of storms identified and linear extrapolation

Find motion estimate by minimizing mean absolute error on actual field. Then, forecast.
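The second prediction method can be illustrated with a minimal sketch. It assumes a single whole-grid displacement in integer grid cells and an exhaustive search, whereas the technique discussed in these slides minimizes the error within each cluster; the function names and the `max_shift` parameter are hypothetical.

```python
def estimate_motion(prev, curr, max_shift):
    """Find the (dy, dx) shift of prev that best matches curr.

    prev, curr: 2D lists of equal shape (e.g. reflectivity in dBZ)
    Returns (dy, dx, mae) for the shift with the lowest mean absolute error,
    computed only over cells where the shifted frames overlap.
    """
    rows, cols = len(prev), len(prev[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            for i in range(rows):
                for j in range(cols):
                    pi, pj = i - dy, j - dx   # where this cell came from
                    if 0 <= pi < rows and 0 <= pj < cols:
                        total += abs(curr[i][j] - prev[pi][pj])
                        count += 1
            if count == 0:
                continue
            mae = total / count
            if best is None or mae < best[0]:
                best = (mae, dy, dx)
    return best[1], best[2], best[0]


def advect(field, dy, dx, fill=0.0):
    """Forecast by moving every cell of field by (dy, dx); gaps get fill."""
    rows, cols = len(field), len(field[0])
    out = [[fill] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            pi, pj = i - dy, j - dx
            if 0 <= pi < rows and 0 <= pj < cols:
                out[i][j] = field[pi][pj]
    return out
```

A forecast one time step ahead is then `advect(curr, dy, dx)`. Because the motion is applied to the whole field rather than to a centroid, the result is a field forecast.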

Page 22: NSF Medium ITR

SCIT / kmeans

The centroid and threshold-based technique called SCIT (storm cell identification and tracking) is used on the WSR-88D.

The texture segmentation and error-field minimization technique is being worked on.

I will show the results from the second technique because the first technique predicts only centroid location. (We want to do field forecasts).

Page 23: NSF Medium ITR

Kmeans

Page 24: NSF Medium ITR

Kmeans

These clusters are actually found at different scales.

The clusters are used as the domain within which the error minimization is done (the kernel that is moved around in the previous frame).

And using these, a motion estimate (“wind field”) is obtained at different scales.

Page 25: NSF Medium ITR

Motion, Prediction

Page 26: NSF Medium ITR

Performance

Compared to a persistence forecast.

Skill at predicting the location of values of 30 dBZ or higher.

Clutter at the end of sequence. (Random data are assigned motion estimates)
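One way to score such a comparison is sketched below. The slide does not say which skill measure was used, so the critical success index (CSI) here is an assumption, as is the function name; the 30 dBZ threshold is from the slide.

```python
def csi(forecast, observed, threshold=30.0):
    """Critical success index for cells at or above threshold.

    forecast, observed: 2D lists of equal shape (reflectivity in dBZ)
    CSI = hits / (hits + misses + false alarms); NaN if no events at all.
    """
    hits = misses = false_alarms = 0
    for frow, orow in zip(forecast, observed):
        for f, o in zip(frow, orow):
            f_yes, o_yes = f >= threshold, o >= threshold
            if f_yes and o_yes:
                hits += 1
            elif o_yes:
                misses += 1
            elif f_yes:
                false_alarms += 1
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")
```

A persistence forecast is simply the last observed field used unchanged as the forecast, so its score is `csi(last_observed, observed)`. The motion-based forecast shows skill when its CSI exceeds that value at the same lead time.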

Page 27: NSF Medium ITR

Ideas for future work

Drawbacks with current approach:

Operates on radar or on satellite, not on both.

Cannot handle faulty data (as with clutter)

Use multiple inputs in deriving motion estimates:

Storm core movement (as the technique does)

Dual-Doppler wind field retrieval (?)

Wind-field estimates from mesoscale model (RUC)

Page 28: NSF Medium ITR

Discussion

Why go through wind-field estimate (and not directly to a forecast)?

To allow forecast of fields other than the input.

Physically reasonable assimilation.

Better ways of identifying storms.

Better ways of predicting location and values (field forecast).

Page 29: NSF Medium ITR

Task 3: Polarimetric Radar Algorithms

Essentially open field for research.

Currently only one AI algorithm: a hydrometeor classification algorithm.

Low-hanging fruit: a hail-size estimation technique.

Page 30: NSF Medium ITR

Hail Size Estimation

Currently done on Doppler radar (algorithm to compute field of hail size estimates in WDSS-II already).

High reflectivity data aloft are assumed to produce hail.

Polarimetric radar provides a way of identifying hail near the surface (via aspect ratio).

Come up with a way to estimate hail size.

Page 31: NSF Medium ITR

Learning

Train the technique on actual hail reports (which are noisy).

Problems with polarimetric radar include calibration errors. Techniques have to account for this.

Use the polarimetric hail-size estimation technique to improve the predicted hail-size from the Doppler-based method.

Page 32: NSF Medium ITR

Contacts

People at CIMMS/NSSL who can advise on each of these tasks:

Vortex Detection: Greg Stumpf, Travis Smith ([email protected], [email protected])

Clustering/Prediction: V Lakshmanan, Bob Rabin ([email protected], [email protected])

Polarimetric Radar: Terry Schuur ([email protected])