WMO | Public Weather Services (PWS)

Verification

The main purpose of public weather services is to provide the public with warnings, forecasts and other meteorological information in a timely and reliable manner, in support of the safety of life and property as well as day-to-day convenience. Consequently, any public weather service programme must include a system to evaluate whether this task is being fulfilled and to regularly assess the programme's performance. The aim of the evaluation is twofold: first, to ensure that products such as warnings and forecasts are accurate and skilful from a technical viewpoint; and second, to ensure that they meet user requirements and that users have a positive perception of, and are satisfied with, the products.

The two evaluation components are essentially complementary, since even a highly accurate and skilful forecast will not produce an effective public weather services programme if it does not respond to user needs. The following deals only with verification, the evaluation component that seeks to track the accuracy, skill and timeliness of a forecast product.

The main goal of verification

The main goal of a verification process is to constantly improve the quality (skill and accuracy) of the services. This includes:

  • Establishment of a skill and accuracy reference against which subsequent changes in forecast procedures or the introduction of new technology can be measured;
  • Identification of the specific strengths and weaknesses in a forecaster's skills and the need for forecaster training, and similarly of a model's particular skills and the need for model improvement; and
  • Information to management about a forecast programme's past and current level of skill, to plan future improvements; this information can be used in making decisions concerning the organisational structure, modernisation and restructuring of the National Meteorological and Hydrological Service (NMHS).

WMO survey results on NMHS verification programmes

According to a 1997 WMO global survey of Members' public weather services programmes, 57 percent of NMHSs had a formal verification programme. All NMHSs that responded indicated that they passed the results to staff, and about a quarter submitted them to government authorities and other users. It would be desirable for a higher percentage of NMHSs to have a verification programme, considering that a performance report based on verification results can increase the visibility of the NMHS and strengthen the confidence of users and authorities in its performance. At the same time, a verification process will contribute to the improvement of the quality of forecasts and warnings, and a higher public approval rating will increase the confidence level of the NMHS staff. Those NMHSs that do not have an ongoing verification programme are strongly encouraged to implement one as a matter of priority.

Benefits of Verification

  • It enables NMHSs to verify and track the accuracy, skill and timeliness of their forecasts and to make the appropriate improvements as required;
  • It identifies improvements in predictive skills resulting from investments in training, or new equipment such as radar, satellite ground stations or computer capacity;
  • It assists in making rational decisions concerning priority target areas for increased emphasis;
  • It provides a simple answer to the question from both staff and management about ''how good are we?'' and whether the NMHS is making the best use of science, technology and training in the end-to-end services process;
  • When its results are made available to staff immediately, it helps to sharpen service accountability and effectiveness;
  • It helps to identify consistent errors and biases present in the forecast process, possibly unknown to forecasters;
  • It provides answers to questions concerning forecast accuracy from the public, media, major clients and decision makers and has the effect of promoting the NMHS credibility among these groups;
  • It supplies ready answers to funding agencies to justify proposed investments in the NMHS infrastructure, or as proof that investments have indeed provided improvements in skill and accuracy; and
  • It provides the verification data that are essential to the development of improved numerical and statistical forecast techniques, whose accuracy must exceed that of earlier or more subjective approaches.

Some characteristics of verification schemes

  • An operational verification system must be designed to collect and save the necessary data, verify the forecasts and distribute the results in a timely manner;
  • Forecast accuracy is, by definition, determined by comparing the disseminated forecast with actual observations;
  • Forecast skill is, by definition, determined by comparing the disseminated forecast with a reference forecast such as persistence, climatology or objective guidance; it shows what ''value'' the forecast adds to simple schemes;
  • Forecasts in text form are more difficult to verify than forecasts in numerical form;
  • The scheme must be space (location), time and element specific;
  • Results must be relevant - do the verification statistics represent the specific information sought by managers?
  • The system must be ''fair'' and must be perceived to be ''fair'' by forecasters whose support is necessary for its success;
  • The system must be reasonable - is the added work on personnel at all levels justified, and do the verification statistics yield accurate and meaningful information?
Additionally:
  • Special verification systems can provide important information about the performance of a forecast system during specific events and will include a comparison of accuracy of the forecast versus lead time, false alarm ratio, frequency of the hazardous event and an assessment of the efficiency of the dissemination channels;
  • The verification system must meet the needs of all levels of the NMHS. The needs of a local verification system, where stress is placed on the improvement of the forecast, differ from those of the central system, where a main goal may be to monitor the long-term improvement of a forecast system; and finally,
  • An overall evaluation system must contribute to the social or economic benefits of its clients and include an assessment of the value added to the client by the programme.
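
The distinction drawn above between accuracy (forecast against observation) and skill (forecast against a reference forecast) can be illustrated with a short sketch. It assumes mean absolute error as the accuracy measure and persistence as the reference; the skill score form 1 - MAE(forecast) / MAE(reference) is one common convention rather than a method prescribed here, and the function names and temperature values are hypothetical.

```python
from statistics import mean


def mean_absolute_error(forecasts, observations):
    """Accuracy: average size of the forecast error, in the units of the element."""
    return mean(abs(f - o) for f, o in zip(forecasts, observations))


def skill_score(forecasts, reference, observations):
    """Skill: fractional improvement of the forecast over a reference forecast.

    1 means a perfect forecast, 0 means no better than the reference,
    and a negative value means worse than the reference.
    """
    mae_forecast = mean_absolute_error(forecasts, observations)
    mae_reference = mean_absolute_error(reference, observations)
    if mae_reference == 0:
        raise ValueError("reference forecast is perfect; skill is undefined")
    return 1 - mae_forecast / mae_reference


# Hypothetical maximum temperatures (deg C) for five consecutive days.
observed = [21.0, 24.0, 19.0, 17.0, 22.0]
issued = [20.0, 23.0, 21.0, 18.0, 22.0]
# Persistence reference: each day is forecast to repeat the previous day's
# observation (assumes 20.0 was observed on the day before the first day).
persistence = [20.0, 21.0, 24.0, 19.0, 17.0]

accuracy = mean_absolute_error(issued, observed)
skill = skill_score(issued, persistence, observed)
print(f"MAE = {accuracy:.2f} deg C, skill over persistence = {skill:.2f}")
```

A forecast can be fairly accurate and still show little skill if the reference is equally good, for example in a climate where the weather changes little from day to day, which is why the two measures are reported separately.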

A few ''guiding principles'' on Verification

The following ''guiding principles'' are offered for developing a verification programme:

  • Choose weather elements and scores that are relevant to users' needs, including the more sophisticated users employing probabilistic information;
  • Use consistent elements, locations, methods and scores so trends can be tracked over time;
  • Keep the original raw data, so that if methods or scores change, the results can be recomputed back in time;
  • Verify for point locations if possible;
  • Know about the difference between measures of accuracy and skill and use these as appropriate;
  • Avoid using just a single score to measure overall performance - no one score can do this;
  • When designing schemes, take into account the strengths and weaknesses of objective versus subjective techniques, and where possible use objective measures (though subjective measures can have their place);
  • Automate if possible, but avoid using lack of automation as an excuse for not getting a verification scheme started;
  • Give immediate feedback on verifications to forecasters;
  • Be aware that compositing lots of information from different places or forecast periods may result in loss of useful information for improvements;
  • Be careful about comparisons with other areas in the same country or elsewhere with different climates - comparisons may be totally meaningless.

Some factors which determine the selection of forecasts and warnings for inclusion in a verification system

Detailed forecasts of specific weather elements are routinely disseminated to the public, aviation, agriculture, marine and other users. Several of these elements could be verified, such as: weather, maximum and minimum air and ground temperatures, cloud cover and ceiling, wind speed and direction, amount and probability of precipitation or duration of sunshine.

A single score cannot provide a complete picture about the skill and accuracy of a particular set of forecasts. On the other hand, it is not possible to verify everything since the verification system should not be too cumbersome. Balance has to be found and a practical compromise is to compute a few meaningful scores that address the specific purposes of the verification system.

Other factors that determine the selection of forecasts and elements that are appropriate for inclusion in a verification programme are:

  • Availability of both the forecast and observations which verify the particular element. Often this is a problem in data sparse regions. In this case subjective verification, though undesirable, is better than none, especially if done consistently;
  • When both forecasts and observations are available, they must have a space-time correspondence; each forecast must be verified against an observation made at a point, not against ''area weather'';
  • Observations are instantaneous measurements of current weather conditions and may not represent the prevailing conditions during the forecast period; a satisfactory verifying solution could then be elusive;
  • The changes of weather conditions specified during the validity period of an aviation terminal forecast may be difficult to fully verify because the fixed time observation may not fall within the valid period of the forecast segments.

In practice, however, some forecasts for some elements are valid for periods of time and are easy to verify; examples are total precipitation for the day and hours of sunshine.

Data collection and quality control

As a preliminary to the verification process, data must be collected from various sources, forecasts must be matched with their verifying observations, and both must be subjected to thorough checks for errors.
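
As an illustration of the matching step, the sketch below pairs each forecast with the observation for the same station, valid time and element, and lists forecasts that have no verifying observation so that they can be followed up. The dictionary layout, the function name `match_pairs` and the station names are assumptions made for this example only.

```python
def match_pairs(forecasts, observations):
    """Pair forecasts with observations sharing (station, valid_time, element).

    Both arguments are dicts keyed by a (station, valid_time, element) tuple.
    Returns (pairs, unmatched): pairs is a list of (key, forecast, observation)
    and unmatched is a list of forecast keys with no verifying observation.
    """
    pairs = []
    unmatched = []
    for key, forecast_value in forecasts.items():
        if key in observations:
            pairs.append((key, forecast_value, observations[key]))
        else:
            unmatched.append(key)
    return pairs, unmatched


# Hypothetical data: maximum temperature (deg C) and 24-hour rainfall (mm).
forecasts = {
    ("StationA", "2024-03-01", "tmax"): 23.0,
    ("StationA", "2024-03-01", "rain24h"): 5.0,
    ("StationB", "2024-03-01", "tmax"): 18.0,
}
observations = {
    ("StationA", "2024-03-01", "tmax"): 24.5,
    ("StationA", "2024-03-01", "rain24h"): 2.0,
    # StationB did not report, so its forecast cannot be verified.
}

pairs, unmatched = match_pairs(forecasts, observations)
print(len(pairs), "matched pairs;", "unmatched:", unmatched)
```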

Data collection

Data collection could be done simply by manually entering the forecast and observation values in verification forms that are then sent to the data processing site. Alternatively, the data could be transmitted electronically after being entered by the forecaster or the observer. In a highly automated data collection system the data are retrieved and transmitted to appropriate destinations rapidly, but there is an inherent risk that good quality control is neglected.

Forecasts and observation data should be collected on a regular basis with reasonable imposed deadlines. For highly automated systems this could be moments after observation times, but for a system that uses verification forms, collection should take place a few times a month. In the case of scoring systems (see below), collection is easier and should be done much more frequently.

Quality control

All data should be checked for errors prior to calculation of verification statistics. A simple quality control method is for the forecaster to check and manually correct the data. Automated error-checking is more complex, though not necessarily better, since small errors can easily go undetected. Sometimes too, if quality control measures are too strict, automated procedures may eliminate highly unseasonable or rare and extreme values that are correct. These data come from some of the most critical forecast situations and special care should be taken to preserve them in the verification sample.
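
One way to keep rare but genuine extremes in the sample is to have the automated check flag unusual values for review by the forecaster instead of deleting them, and to reject only values that are missing or physically impossible. The sketch below illustrates that idea; the two-tier limits, the function name and all numbers are assumptions chosen for the example, not recommended thresholds.

```python
def quality_control(values, hard_limits, usual_range):
    """Sort values into accepted, flagged-for-review and rejected lists.

    hard_limits: (low, high) bounds outside which a value is treated as an
                 error (assumed physical limits for the element).
    usual_range: (low, high) bounds outside which a value is unusual for the
                 site and season; such values are kept but flagged for review.
    """
    hard_low, hard_high = hard_limits
    usual_low, usual_high = usual_range
    accepted, review, rejected = [], [], []
    for value in values:
        if value is None or not (hard_low <= value <= hard_high):
            rejected.append(value)  # missing or impossible: exclude
        elif not (usual_low <= value <= usual_high):
            review.append(value)  # rare extreme: keep, ask the forecaster
        else:
            accepted.append(value)
    return accepted, review, rejected


# Hypothetical maximum temperatures (deg C); 999.9 is a typical missing-data code.
temperatures = [18.5, 41.2, 999.9, 21.0, None]
accepted, review, rejected = quality_control(
    temperatures, hard_limits=(-90.0, 60.0), usual_range=(5.0, 35.0)
)
print("accepted:", accepted, "review:", review, "rejected:", rejected)
```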

Quality control is most effective when done at the data source or local level, by forecasters who are most familiar with daily weather at their specific sites and can better identify errors. In addition to demonstrating trust in forecasters by having them check their own work, local quality control can increase forecaster acceptance of verification by instilling confidence in the accuracy of the results and through participation in the scheme. Quality control is also done at the central level to identify errors that were missed at the local level; the results are shared with the local level to help improve quality control at the source.

Scoring Systems

Verification schemes need not be elaborate or complicated, and in fact could be disarmingly simple. Instead of calculating the deviation of the forecast from the actual observation and employing heavy statistics, a scoring system awards points to every feature that was forecast correctly. That way it is easy to see the quality of the forecast as well as the trend in the overall performance. The allocation of points will reflect an understanding of what the public cares about most: will it rain or not, will it be sunny or not, and what will the wind be like.

Points are awarded on the basis of how close the forecast value is to the observed value of an element, with a lower score for close (partly correct), and a higher score for closer (correct). No points are given if there is no agreement. The scoring is subjective. However, in a simple scoring scheme used for verification in New Zealand, it was found that different people tend to give similar scores to the same combination of forecast and observed weather.

The New Zealand verification scheme allows a maximum of eight points for each forecast for a 12-hour period, (say 6am to 6pm today), as follows:

  • four points for precipitation
  • two points for cloud cover
  • one point each for wind direction and wind speed.

The precipitation scoring takes into account the intensity and duration of the rain and allows four points for close agreement between forecast and observation. No points are given if there was no agreement, say if rain was forecast and none fell. Scores in between are given for partial agreement. Cloud cover score is taken as a maximum of two if the forecast was for cloudy skies or implicitly cloudy as in a rain forecast. A point is given for mostly cloudy or mostly clear skies when the forecast was for cloudy part of the time. The wind direction score is taken as one point if the observation was within one compass point of the forecast and as half a point if it was within two compass points. The wind speed score is taken as one point if the forecast and observed winds were within one Beaufort Force during the 12-hour period and as half a point if they were within two categories of Beaufort Force.
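
The wind components of the scheme described above are specific enough to sketch in code. The example assumes an 8-point compass (the text does not say how many compass points are used) and Beaufort forces given as integers. The precipitation and cloud points are passed in as already-assessed values, because their rules involve a judgement of partial agreement that is not fully specified here. All function names are hypothetical.

```python
# Assumption: directions are given on an 8-point compass.
COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def closeness_points(steps):
    """One point within one step, half a point within two steps, else none."""
    if steps <= 1:
        return 1.0
    if steps == 2:
        return 0.5
    return 0.0


def wind_direction_points(forecast, observed):
    """Score the direction by the number of compass points between the two."""
    steps = abs(COMPASS.index(forecast) - COMPASS.index(observed))
    steps = min(steps, len(COMPASS) - steps)  # shortest way round the compass
    return closeness_points(steps)


def wind_speed_points(forecast_force, observed_force):
    """Score the speed by the difference in Beaufort force."""
    return closeness_points(abs(forecast_force - observed_force))


def total_score(precip_points, cloud_points, fc_dir, ob_dir, fc_force, ob_force):
    """Total out of eight for one 12-hour forecast.

    precip_points (0 to 4) and cloud_points (0 to 2) are assigned by the
    person doing the assessment.
    """
    if not 0 <= precip_points <= 4:
        raise ValueError("precipitation points must be between 0 and 4")
    if not 0 <= cloud_points <= 2:
        raise ValueError("cloud points must be between 0 and 2")
    return (
        precip_points
        + cloud_points
        + wind_direction_points(fc_dir, ob_dir)
        + wind_speed_points(fc_force, ob_force)
    )


# Hypothetical forecast: partial agreement on rain and cloud, direction two
# compass points out, speed three Beaufort forces out.
print(total_score(3, 1, "N", "E", 4, 7))
```

Tracking the daily total, or its monthly mean, gives the trend in overall performance without any further statistics.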

Another simple approach to forecast verification is a method that can be applied to the forecast of rain days for a month. It counts forecasts for rain or no rain and compares them with observations, allowing various accuracy and skill measures to be calculated, the most important of which is the percentage correct of all forecasts.
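
A minimal sketch of this rain-day count is shown below. It tallies the four possible outcomes of a rain or no-rain forecast and derives the percentage correct, together with two other standard contingency-table measures (probability of detection and false alarm ratio) as examples of what can be calculated from the same counts. The ten days of data are hypothetical.

```python
def contingency_counts(forecast_rain, observed_rain):
    """Count hits, false alarms, misses and correct negatives.

    Both arguments are sequences of booleans (True = rain day).
    """
    hits = false_alarms = misses = correct_negatives = 0
    for forecast, observed in zip(forecast_rain, observed_rain):
        if forecast and observed:
            hits += 1
        elif forecast and not observed:
            false_alarms += 1
        elif observed:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def percent_correct(hits, false_alarms, misses, correct_negatives):
    """Percentage of all forecasts (rain and no rain) that were correct."""
    total = hits + false_alarms + misses + correct_negatives
    return 100.0 * (hits + correct_negatives) / total


def probability_of_detection(hits, misses):
    """Fraction of observed rain days that were forecast."""
    return hits / (hits + misses)


def false_alarm_ratio(hits, false_alarms):
    """Fraction of rain forecasts on which no rain fell."""
    return false_alarms / (hits + false_alarms)


# Hypothetical ten-day sample (True = rain day).
forecast = [True, True, False, False, True, False, False, True, False, False]
observed = [True, False, False, False, True, True, False, True, False, False]

hits, false_alarms, misses, correct_negatives = contingency_counts(forecast, observed)
print("percent correct:", percent_correct(hits, false_alarms, misses, correct_negatives))
print("POD:", probability_of_detection(hits, misses))
print("FAR:", false_alarm_ratio(hits, false_alarms))
```

In a dry climate the percentage correct can be high even for a forecast that never predicts rain, so it is best read alongside the other measures.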

Statistics

The basic approach for technical verification of weather products is to compare statistics. Statistics might, for example, be calculated by applying statistical tools such as bias, mean absolute error or variances to a number of places in the forecast area, and a number of forecast features such as maximum temperature and probability of precipitation. Verification results can be compared to persistence, objective guidance, climatology and to former results. Once verification has started, it is a good idea to stick to the same methodology so that performance can be tracked over a long period of time, without complications of changed methodologies.
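
As a sketch of how two of these statistics might be computed for several places in a forecast area, the example below calculates the bias (mean error) and mean absolute error per station. The record layout, the function name and the temperature values are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def error_statistics(records):
    """Compute bias and mean absolute error for each station.

    records: iterable of (station, forecast, observed) tuples.
    Returns a dict mapping station to (bias, mae). A positive bias means the
    forecasts were, on average, too high.
    """
    errors = defaultdict(list)
    for station, forecast, observed in records:
        errors[station].append(forecast - observed)
    return {
        station: (mean(errs), mean(abs(e) for e in errs))
        for station, errs in errors.items()
    }


# Hypothetical maximum temperature forecasts and observations (deg C).
records = [
    ("StationA", 25.0, 23.0),
    ("StationA", 22.0, 21.0),
    ("StationA", 19.0, 19.0),
    ("StationB", 17.0, 18.0),
    ("StationB", 20.0, 19.0),
]

for station, (bias, mae) in error_statistics(records).items():
    print(f"{station}: bias = {bias:+.1f}, MAE = {mae:.1f}")
```

In this sample both stations have the same mean absolute error, but only StationA shows a consistent warm bias, which is the kind of systematic error a forecaster can correct once it is known.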

The quality of products will change little unless there is a change in forecasting techniques or technology. The introduction of new technology, forecaster recruits or training upgrades during the verification period complicates matters, because the quality of the products might change and it will be difficult to isolate the cause and effect of each change. Statistics from before and after each introduction will be needed to distinguish its impact on the forecasting system.

The use of verification results

Verification results can reveal important information about the overall skill, as well as the specific weaknesses of the forecast system. These results should be first shared internally with staff who should understand that their output is important enough to be checked and monitored. Other interested parties who should be the recipients of the results are the research division, the NMHS management, institutions, stakeholders, government agencies and the public. The public should receive only condensed, relevant, and interpreted information arising from the results.

A verification programme is not just an exercise in data collection and processing. Most important for everyone concerned, especially management, is to be prepared to take prompt action on the results.

For more detailed discussions please refer to WMO/TD No. 1023.

 

© World Meteorological Organization, 7bis, avenue de la Paix, Case postale No. 2300, CH-1211 Geneva 2, Switzerland - Tel.: +41(0)22 730 81 11 - Fax: +41(0)22 730 81 81