
Page 1:

NWP Verification with Shape-matching Algorithms:

Hydrologic Applications and Extension to Ensembles

Barbara Brown1, Edward Tollerud2, Tara Jensen1, and Wallace Clark2

1NCAR, USA
2NOAA Earth System Research Laboratory, USA

[email protected]

ECAM/EMS 2011, 14 September 2011

Page 2:

DTC and Testbed Collaborations

Developmental Testbed Center (DTC)

Mission: Provide a bridge between the research and operational communities to improve mesoscale NWP

Activities: Community support (e.g., access to operational models); model testing and evaluation

Goals of interactions with other "testbeds":
- Examine the latest capabilities of high-resolution models
- Evaluate impacts of physics options
- New approaches for presenting and evaluating forecasts

Page 3:

Testbed collaborations

Hydrometeorological Testbed (HMT):
- Evaluation of regional ensemble forecasts (including operational models) and global forecasts in the western U.S. (California)
- Winter precipitation
- Atmospheric rivers

Hazardous Weather Testbed (HWT):
- Evaluation of storm-scale ensemble forecasts
- Late spring precipitation, reflectivity, cloud-top height
- Comparison of model capabilities for high-impact weather forecasts

Page 4:

Testbed Forecast Verification

Observations:
- HMT: gauges and Stage 4 gauge analysis
- HWT: NMQ 1-km radar and gauge analysis; radar

Traditional metrics:
- RMSE, Bias, ME, POD, FAR, etc.
- Brier score, reliability, ROC, etc.

Spatial approaches are needed for evaluation of ensemble forecasts for the same reasons as for non-probabilistic forecasts ("double penalty", impact of small errors in timing and location, etc.):
- Neighborhood methods
- Method for Object-based Diagnostic Evaluation (MODE)
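As an illustration of the traditional metrics listed above, a minimal NumPy sketch (the function names and toy inputs are mine; operational scoring uses dedicated tools such as MET, not this code):

```python
import numpy as np

def categorical_scores(fcst, obs, thresh):
    """POD, FAR, and frequency bias from a 2x2 contingency table.

    fcst, obs: 2-D fields (e.g. accumulated precipitation);
    thresh: event threshold defining yes/no events.
    """
    f = fcst >= thresh
    o = obs >= thresh
    hits = np.sum(f & o)
    misses = np.sum(~f & o)
    false_alarms = np.sum(f & ~o)
    pod = hits / (hits + misses)                    # probability of detection
    far = false_alarms / (hits + false_alarms)      # false alarm ratio
    bias = (hits + false_alarms) / (hits + misses)  # frequency bias
    return pod, far, bias

def brier_score(prob, event):
    """Brier score: mean squared error of probabilities vs. 0/1 outcomes."""
    return float(np.mean((np.asarray(prob) - np.asarray(event)) ** 2))
```

The same contingency-table counts also yield the other categorical scores (CSI, ETS, etc.) mentioned in verification practice.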

Page 5:

New Spatial Verification Approaches

- Neighborhood: successive smoothing of forecasts/obs
- Object- and feature-based: evaluate attributes of identifiable features
- Scale separation: measure scale-dependent error
- Field deformation: measure distortion and displacement (phase error) for the whole field

Web site: http://www.ral.ucar.edu/projects/icp/
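The "successive smoothing" step of the neighborhood approach can be sketched as a moving event-fraction filter; a simplified illustration (zero padding at the edges and the function name are my choices, not a prescribed method):

```python
import numpy as np

def neighborhood_fractions(event, n):
    """Event fraction in an n x n neighborhood around each grid point.

    event: 0/1 (or boolean) grid of event occurrence; n: odd
    neighborhood width in grid points. Recomputing with increasing n
    is the "successive smoothing" used to compare forecast and
    observed fields at progressively coarser scales.
    """
    pad = n // 2
    p = np.pad(np.asarray(event, dtype=float), pad)  # zero-pad edges
    rows, cols = np.asarray(event).shape
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = p[i:i + n, j:j + n].mean()
    return out
```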

Page 6:

HMT: Standard Scores for Ensemble Inter-model QPF Comparisons

Example: RMSE results for December 2010

Dashed: HMT (WRF) ensemble members; solid: deterministic members; black: ensemble mean

Page 7:

HMT Application: MODE

19 December 2010, 72-h forecast, Threshold for Precip > 0.25”

[Figure panels: OBS and Ens Mean]

Page 8:

MODE Application to atmospheric rivers

• QPF vs. IWV and Vapor Transport

• Capture coastal strike timing and location

• Large impacts on precipitation on the California coast and in the coastal mountains => major flooding impacts

Page 9:

Atmospheric rivers

[Figure: GFS Precipitable Water vs. SSMI Integrated Water Vapor at 72-hr, 48-hr, and 24-hr lead times; object areas: 312, 369, 306, 127]

Page 10:

HWT Example: Attribute Diagnostics for NWP Neighborhood & Object-based Methods - REFC > 30 dBZ

20-h: FSS = 0.14; Matched Interest: 0; Area Ratio: n/a; Centroid Distance: n/a; P90 Intensity Ratio: n/a

22-h: FSS = 0.30; Matched Interest: 0.89; Area Ratio: 0.18; Centroid Distance: 112 km; P90 Intensity Ratio: 1.08

24-h: FSS = 0.64; Matched Interest: 0.96; Area Ratio: 0.53; Centroid Distance: 92 km; P90 Intensity Ratio: 1.04

Neighborhood methods provide a sense of how the model performs at different scales through the Fractions Skill Score (FSS).

Object-based methods provide a sense of how forecast attributes compare with observed, including a measure of overall matching skill based on user-selected attributes.
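The Fractions Skill Score quoted in the panels compares neighborhood event fractions of the forecast and observed fields; a minimal sketch of the standard FSS definition (my own NumPy implementation, not the MET code):

```python
import numpy as np

def fss(fcst, obs, thresh, n):
    """Fractions Skill Score at neighborhood width n (grid points).

    1 = perfect agreement of neighborhood event fractions at this
    scale; 0 = no skill. Zero padding at the edges is a simplification.
    """
    def fractions(field):
        pad = n // 2
        p = np.pad((np.asarray(field) >= thresh).astype(float), pad)
        r, c = np.asarray(field).shape
        return np.array([[p[i:i + n, j:j + n].mean() for j in range(c)]
                         for i in range(r)])

    pf, po = fractions(fcst), fractions(obs)
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)  # worst-case reference
    return 1.0 - mse / mse_ref if mse_ref > 0 else float("nan")
```

A forecast displaced by less than the neighborhood width still earns partial credit, which is exactly why FSS grows with scale.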

Page 11:

MODE application to HWT ensembles

[Figure panels: Observed vs. CAPS PM Mean Radar Echo Tops (RETOP)]

Page 12:

Applying spatial methods to ensembles

As probabilities: areas do not have the "shape" of precipitation areas; may "spread" the area

As mean: the area is not equivalent to any of the underlying ensemble members

Page 13:

Treatment of Spatial Ensemble Forecasts

Alternative:
- Consider ensembles of "attributes"
- Evaluate distributions of "attribute" errors
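Summarizing such a distribution of attribute errors might look like the sketch below (the function name and toy inputs are hypothetical; MODE supplies the per-member attribute values in practice):

```python
import numpy as np

def attribute_error_summary(member_values, observed):
    """Mean, median, and IQR of member attribute errors vs. observed.

    member_values: one attribute (e.g. object area or 90th-percentile
    intensity) per ensemble member. With heavy-tailed, non-symmetric
    error distributions the mean and median can diverge, so both are
    reported.
    """
    err = np.asarray(member_values, dtype=float) - observed
    q25, q75 = np.percentile(err, [25, 75])
    return {"mean": float(err.mean()),
            "median": float(np.median(err)),
            "iqr": float(q75 - q25)}
```

A single outlying member pulls the mean error but barely moves the median, which is why both statistics are worth reporting side by side.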

Page 14:

Example: MODE application to HMT ensemble members

Systematic microphysics impacts

The 3 Thompson scheme members (circled) are:
- Less intense
- Larger areas

Note: heavy tails and non-symmetric distributions for both size and intensity (medians vs. averages)

[Figure: distributions of 90th percentile intensity and object area by ensemble member; thresholds > 6.35 and > 25.4]

Page 15:

Probabilistic Fields (PQPF) and QPF Products

[Figure: QPE, QPF, and probability (Prob APCP) products: Ens-4km, SREF-32km, 4km Nbrhd, NAM-12km, EnsMean-4km]

Page 16:

50% Prob(APCP_06>25.4 mm) vs. QPE_06 >25.4 mm

Good forecast with displacement error?

Traditional metrics:
- Brier Score: 0.07
- Area Under ROC: 0.62

Spatial metrics:
- Centroid Distance: Obj 1) 200 km; Obj 2) 88 km
- Area Ratio: Obj 1) 0.69; Obj 2) 0.65
- Median of Max Interest: 0.77
- Object PODY: 0.72; Object FAR: 0.32

Page 17:

Summary

Evaluation of high-impact weather forecasts is moving toward the use of spatial verification methods

Initial efforts are in place to bring these methods forward for ensemble verification

Page 18:
Page 19:

MODE-based evaluations of AR objects

Page 20: Barbara Brown 1 , Edward Tollerud 2 ,  Tara Jensen 1 , and Wallace Clark 2 1 NCAR, USA

Spatial method motivation

Traditional approaches ignore spatial structure in many (most?) forecasts:
- Spatial correlations
- Small errors lead to poor scores (squared errors; smooth forecasts are rewarded)
- Methods for evaluation are not diagnostic

The same issues exist for ensemble forecasts.

[Figure panels: Forecast, Observed]
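The "double penalty" noted above can be demonstrated numerically; a toy example (the grid and values are made up for illustration):

```python
import numpy as np

# "Double penalty": a feature forecast in slightly the wrong place is
# penalized twice (a miss plus a false alarm), so under a squared-error
# score it does worse than forecasting nothing at all.
obs = np.zeros((10, 10))
obs[5, 5] = 25.0            # observed single intense rain cell

displaced = np.zeros((10, 10))
displaced[5, 6] = 25.0      # same feature, shifted one grid point

empty = np.zeros((10, 10))  # "no rain anywhere"

def rmse(f, o):
    return float(np.sqrt(np.mean((f - o) ** 2)))

# The displaced (arguably better) forecast gets the worse score:
# rmse(displaced, obs) ≈ 3.54, while rmse(empty, obs) = 2.5
```

This is exactly why smooth or empty forecasts are rewarded by traditional gridpoint scores, and why spatial methods are needed.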

Page 21:

MODE example: 9 May 2011

Ensemble Workshop, 11 May 2011

Page 22:

MODE Example: combined objects


Consider and compare various attributes, such as:
• Area
• Location
• Intensity distribution
• Shape / Orientation
• Overlap with obs
• Measure of overall "fit" to obs

Summarize distributions of attributes and differences

In some cases, conversion to probabilities may be informative

Spatial methods can be used for evaluation
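Several of the attributes listed above can be computed directly from an object mask; a minimal sketch (the function names are mine, and MODE itself computes many more attributes, including shape, orientation, and overlap):

```python
import numpy as np

def object_attributes(mask, field):
    """Area, centroid, and 90th-percentile intensity of one object.

    mask: boolean grid marking the object; field: underlying values.
    """
    ii, jj = np.nonzero(mask)
    return {"area": int(np.count_nonzero(mask)),
            "centroid": (float(ii.mean()), float(jj.mean())),
            "p90": float(np.percentile(field[mask], 90))}

def compare_objects(fa, oa):
    """Pairwise comparison of a forecast object (fa) vs. an observed
    object (oa), as returned by object_attributes()."""
    dy = fa["centroid"][0] - oa["centroid"][0]
    dx = fa["centroid"][1] - oa["centroid"][1]
    return {"area_ratio": fa["area"] / oa["area"],
            "centroid_distance": float(np.hypot(dy, dx)),
            "p90_ratio": fa["p90"] / oa["p90"]}
```

In MODE these attribute comparisons are combined, with user-selected weights, into the overall "interest" value used for matching.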

Page 23:

Spatial attributes

Object intersection areas vs. lead time

Overall field comparison by MODE ("interest" summary) vs. lead time