
Forecast Verification and Evaluation

National Center for Atmospheric Research

Verification

The process of forecast verification/evaluation is often an unseen partner in engineering and scientific advances. Verification provides systematic and objective evaluation of the quality (or performance) of a forecasting system. In turn, this process allows fair comparisons to be made between forecasting systems. Diagnostics based on verification results provide information about a model’s strengths and weaknesses, and often have an impact on the direction of model improvements, depending on the focus of the evaluation (e.g., extreme precipitation vs. 500-mb height). In addition, forecast verification can help forecasters learn to improve their forecasts, and it provides information to users to aid in decision making.

Verification results provide information about a model’s strengths and weaknesses, and often have an impact on the direction of model improvements.
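To make the idea of systematic, objective evaluation concrete, the following is a minimal sketch (in Python, with hypothetical data) of the paired forecast-observation comparison that verification begins with, using standard continuous scores: mean error (bias), MAE, and RMSE. It illustrates the concept only; it is not MET code.

```python
import numpy as np

def continuous_scores(forecast, observation):
    """Basic continuous verification scores for paired forecast/observation values."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observation, dtype=float)
    error = f - o
    return {
        "bias (mean error)": error.mean(),     # systematic over- or under-forecasting
        "MAE": np.abs(error).mean(),           # typical magnitude of the error
        "RMSE": np.sqrt((error ** 2).mean()),  # penalizes large errors more heavily
    }

# Hypothetical 2-m temperature values (deg C) at matched stations and times.
fcst = [12.1, 14.3, 9.8, 11.0, 15.2]
obs  = [11.5, 13.9, 10.4, 10.2, 16.0]
print(continuous_scores(fcst, obs))
```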

Purpose

Forecast evaluation serves a number of important purposes. Each purpose is associated with a set of questions for which answers are desired. The major purposes, together with some representative questions for each, include the following:


[Figure: MODE object-based comparison of 3-hour accumulated precipitation (APCP_03) forecast and observation fields, showing the simple and cluster objects identified in each field, pairwise forecast-observation interest values relative to a total-interest threshold of 0.70, the attribute weights used, and the median of maximum interest (MMI) for each field.]

MODE spatial verification tool in MET. Improvements can be made by evaluating forecast products throughout the development process as deficiencies in the algorithms are discovered.
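MODE matches forecast and observed objects by combining attribute comparisons, such as centroid distance and area ratio, into a total interest value that is compared against a threshold (0.70 in the example above). The sketch below is a simplified, hypothetical illustration of that fuzzy-logic matching idea; the attribute functions, scales, and weights are invented for clarity and are not MET's.

```python
import math

def interest_pair(fcst_obj, obs_obj, dist_scale=100.0, weights=(2.0, 1.0)):
    """Toy total-interest score for one forecast/observation object pair.

    Each object is a dict with a centroid (grid x, y) and an area (grid squares).
    Attribute interests are mapped to [0, 1] and combined as a weighted mean;
    the scales and weights here are illustrative, not MET defaults.
    """
    dx = fcst_obj["centroid"][0] - obs_obj["centroid"][0]
    dy = fcst_obj["centroid"][1] - obs_obj["centroid"][1]
    dist = math.hypot(dx, dy)
    centroid_interest = max(0.0, 1.0 - dist / dist_scale)  # closer centroids -> higher interest
    area_ratio = min(fcst_obj["area"], obs_obj["area"]) / max(fcst_obj["area"], obs_obj["area"])
    w_centroid, w_area = weights
    return (w_centroid * centroid_interest + w_area * area_ratio) / (w_centroid + w_area)

# Hypothetical precipitation objects on a model grid.
fcst_obj = {"centroid": (40.0, 55.0), "area": 1200}
obs_obj = {"centroid": (46.0, 50.0), "area": 950}
total = interest_pair(fcst_obj, obs_obj)
print(f"total interest = {total:.3f}, matched = {total >= 0.70}")  # 0.70 threshold as in the figure
```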

• FORECAST MONITORING AND IMPROVEMENT: Are forecasts improving with time? Is a new version of a model better than the previous one?

• FORECAST DEVELOPMENT AND DIAGNOSTICS: Does a new forecasting module improve the forecasts from a numerical weather prediction (NWP) model? What are the biases in the model and how can they be corrected? Does the new module help forecasters make better predictions? (A simple bias-correction sketch follows this list.)

• FORECAST BENEFITS AND VALUE: Does the new version of the model (or a new forecasting capability) help people and organizations make better decisions?
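One of the diagnostic questions above asks how model biases can be corrected. The sketch below, using hypothetical temperature data, shows the simplest possible approach: estimate the mean error over a training sample and subtract it from new forecasts. Operational bias correction is usually more sophisticated, for example conditioned on lead time, season, or location.

```python
import numpy as np

def mean_bias_correction(train_fcst, train_obs, new_fcst):
    """Estimate the mean forecast bias over a training sample and remove it
    from new forecasts. A deliberately simple illustration; real systems
    usually stratify the correction by lead time, season, or site."""
    bias = np.mean(np.asarray(train_fcst, dtype=float) - np.asarray(train_obs, dtype=float))
    return np.asarray(new_fcst, dtype=float) - bias

# Hypothetical training sample showing a roughly +1 degree warm bias.
train_fcst = [21.0, 19.5, 23.2, 18.8]
train_obs  = [20.1, 18.4, 22.0, 17.9]
print(mean_bias_correction(train_fcst, train_obs, [22.0, 20.5]))
```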

Tools

The forecast verification experts in RAL’s Joint Numerical Testbed Program (JNTP) and the U.S. Developmental Testbed Center have developed a state-of-the-art verification package, the Model Evaluation Tools (MET). The MET software is completely configurable to meet the requirements of many different kinds of evaluations, including traditional verification approaches that have been used for decades to monitor forecast performance, as well as modern diagnostic tools that can be used to better understand errors in NWP and other forecasting systems. The MET package is freely available and is fully supported through an email help facility, complete documentation, and on-line and in-person tutorials. Staff at the JNTP continually update MET with new capabilities.


The JNTP also provides support for a library of tools in the R language, which are useful for less extensive, non-operational studies. The R Verification Library is also freely available and documented.

User-Focused Approaches

Scientists and engineers in the JNTP continually explore new ways to provide information about forecast performance that is more meaningful for forecast users. For example, new spatial verification approaches provide information about location and intensity errors in precipitation forecasts, rather than simply identifying the forecasts as “wrong” or “poor”. This information can help guide improvements in the forecasts and inform forecasters and other users about specific biases or errors that may impact decision making. Other examples of user-focused approaches include evaluation of forecasts of extreme weather (e.g., very strong winds) and examination of the performance of probabilistic forecasts. The JNTP develops tools for application of these methods (e.g., in MET) and undertakes research to improve the approaches. JNTP verification method development also focuses on specific forecast user applications, including hydrometeorology, climate, and renewable energy.
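As a concrete instance of the probabilistic evaluation mentioned above, the sketch below computes the Brier score, the mean squared difference between forecast probabilities and binary outcomes. The score itself is standard; the data and function are illustrative and not taken from MET or the R Verification Library.

```python
import numpy as np

def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and observed
    outcomes (1 if the event occurred, 0 otherwise).
    Lower is better; 0 is a perfect probabilistic forecast."""
    p = np.asarray(probabilities, dtype=float)
    o = np.asarray(outcomes, dtype=float)
    return np.mean((p - o) ** 2)

# Hypothetical probability-of-precipitation forecasts and observed rain/no-rain.
probs = [0.9, 0.1, 0.7, 0.3, 0.8]
obs   = [1,   0,   1,   1,   0]
print(f"Brier score = {brier_score(probs, obs):.3f}")
```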

New spatial verification approaches help guide improvements in the forecasts and inform forecasters about specific biases or errors that may impact decision making.

Monitoring Forecast Performance

Ongoing testing and evaluation of forecast models and other kinds of forecasts is needed to ensure that forecast performance is consistent through time, and to assess improvements in performance that may be associated with new forecasting capabilities. Statistical confidence intervals (included in the MET software) make it possible to assess whether observed changes in performance are statistically meaningful. Graphical and tabular displays of verification results are critical for monitoring and communicating verification results; the JNTP develops many innovative capabilities for communicating verification information that are available to the community.
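The role of confidence intervals can be illustrated with a simple percentile bootstrap on the difference in RMSE between two model versions over matched cases: if the resulting interval excludes zero, the change in performance is unlikely to be explained by sampling variability alone. The sketch below uses hypothetical errors and is not MET's confidence-interval implementation.

```python
import numpy as np

def bootstrap_rmse_diff_ci(err_old, err_new, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for RMSE(new) - RMSE(old) over matched cases.
    Purely illustrative; not MET's confidence-interval methodology."""
    rng = np.random.default_rng(seed)
    err_old = np.asarray(err_old, dtype=float)
    err_new = np.asarray(err_new, dtype=float)
    n = len(err_old)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample matched cases with replacement
        rmse_old = np.sqrt(np.mean(err_old[idx] ** 2))
        rmse_new = np.sqrt(np.mean(err_new[idx] ** 2))
        diffs.append(rmse_new - rmse_old)
    lo, hi = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Hypothetical surface temperature errors (forecast minus observation) for two versions.
err_v1 = [1.2, -0.8, 1.5, 0.9, -1.1, 1.3, 0.7, -0.6]
err_v2 = [0.8, -0.5, 1.0, 0.6, -0.9, 0.9, 0.4, -0.3]
print(bootstrap_rmse_diff_ci(err_v1, err_v2))
```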

For More Information, Contact:
Tara Jensen
National Center for Atmospheric Research
Research Applications Laboratory
PO Box 3000, Boulder, CO 80307-3000
303-497-8479 / 303-497-2729
[email protected]
www.ral.ucar.edu

Diagnostic forecast information: spatial distribution of surface temperature bias scores.

Training

Training on forecast verification methods and the interpretation of forecast verification statistics is a special focus of the JNTP. This training encompasses the basic knowledge needed to design a verification study or system; guidance on interpretation of verification information; and the information needed to apply the MET package. Both on-line and in-person tutorials are available for MET. Tutorials on verification methods include consideration of the statistical basis for verification, as well as discussions of methods for categorical, continuous, and probabilistic forecasts. Special topics include spatial methods and confidence intervals.
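As an example of the categorical methods covered in the tutorials, the sketch below builds a 2x2 contingency table from hypothetical yes/no event forecasts and observations and derives three common scores: probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI).

```python
def categorical_scores(forecast_yes, observed_yes):
    """Build a 2x2 contingency table from paired yes/no forecasts and
    observations and derive common categorical verification scores."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast_yes, observed_yes):
        if f and o:
            hits += 1
        elif not f and o:
            misses += 1
        elif f and not o:
            false_alarms += 1
        else:
            correct_negatives += 1
    return {
        "POD": hits / (hits + misses),                 # probability of detection
        "FAR": false_alarms / (hits + false_alarms),   # false alarm ratio
        "CSI": hits / (hits + misses + false_alarms),  # critical success index
    }

# Hypothetical event forecasts (e.g., 3-h precipitation above a threshold) and observations.
fcst = [True, True, False, True, False, True, False, True]
obs  = [True, False, False, True, True, True, False, False]
print(categorical_scores(fcst, obs))
```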

Summary

To ensure credibility, every activity focused on improving forecasting systems or on the provision of forecasts to users should have an associated verification activity to monitor the performance of the forecasting system and to assess the improvements in capabilities that are achieved. Comprehensive evaluations can lead to improvements in forecasting capabilities and the services provided to forecast users.

Comparison of performance of three model versions (surface temperature bias). Confidence intervals allow assessment of differences in performance.