
 

WORLD METEOROLOGICAL ORGANIZATION

Commission for Basic Systems (CBS)

 

 

STUDY OF THE IMPACT OF THE LOSS OF RUSSIAN FEDERATION RAOBS ON NWP VERIFICATION STATISTICS IN THE
NORTHERN HEMISPHERE

(30 September 2000)


- FINAL REPORT -

submitted to:

CBS Open Program Area Group
On Data Processing And Forecasting Systems

submitted by:

Mr Terry Hart
Chair
Expert Team to Evaluate the Impact of Changes in the GOS on the GDPS


FOREWORD

The Global Observing System (GOS) and the Global Data Processing System (GDPS) are inextricably linked. Without observations the GDPS cannot function. Without an effective forecast system, observations have only minimal value. Any change in the GOS, therefore, is of immediate concern to those relying on the observations. In the forefront of those concerned are the people who operate and manage the GDPS.

Both systems are dynamic, with large changes occurring regularly. Monitoring those changes and determining the impact they have on the two systems is a major and continuing activity. The WMO has established procedures for monitoring the performance of the GOS by examining the reception of reports, their coverage in time and space, and their quality. In a parallel fashion, procedures have been established for the GDPS to collect, disseminate and review verification statistics from the Numerical Weather Prediction (NWP) models. These are two ways by which the performance of the GOS and GDPS are monitored independently. There is no comparable method for routinely monitoring the impacts that changes in the GOS have on the GDPS.

At times, the impact on the GDPS of a change in the GOS is thought to be so large that in-depth and systematic studies are needed. One example of such studies is an Observing System Experiment (OSE). An OSE is intended to examine the impact of the loss of existing data on NWP forecasts and analyses. If the data do not yet exist, such as those expected from a future satellite instrument, then it is necessary to simulate the impact of the expected data on the NWP. Such studies using simulated data are termed Observing System Simulation Experiments (OSSEs).

Both OSEs and OSSEs require substantial resources for their planning and conduct. The results from both have been demonstrated to provide valuable information on the impact that changes in the GOS have on the GDPS. Their cost and time to complete, however, have often made it difficult or impossible to examine significant changes that should be studied. Hence, both OSEs and OSSEs are conducted on an aperiodic basis. It was thought that a simpler and timelier technique was needed to complement the use of OSEs and OSSEs.

Comparison of verification statistics before and after a major change in the GOS was proposed as one possibility. Two changes in the GOS provided the opportunity to test this proposal. The first was the loss of the NOAA-11 TOVS instrument in February 1999, which resulted in a 50% loss of satellite sounding data disseminated by the US. The second was the substantial loss (65-70%) of Russian Federation RAOB sites during the period 1995-2000.

The CBS Open Program Area Group on Data Processing and Forecasting Systems chartered the Expert Team to Evaluate the Impact of Changes of the GOS on the GDPS. Results of the NOAA-11 TOVS case study were reported at the first meeting of the Expert Team in March 2000. Those results were favorable enough to encourage the Expert Team to undertake a study of the Russian Federation RAOB loss using verification statistics. This document is the report of that effort.

The Expert Team wishes to express its sincere appreciation to the people at the WMO Centres in Japan, Canada, United States, United Kingdom, France, and at the European Center for Medium Range Weather Forecasts (ECMWF) who supported this work, and who provided the basic information analyzed by the Team. We also wish to acknowledge the excellent support provided by the WMO, especially Mr. Dieter Schiessl and Mr. Morrison Malki of the World Weather Watch Department. Finally, let me express my personal appreciation for the substantial effort and insightful comments of the Team Members and Associates who actually did the work to make this report possible.

Submitted to the Chair, CBS Open Program Area Group on Data Processing and Forecasting Systems, 30 September 2000, Mr. Terry Hart, Chair, Expert Team to Evaluate the Impact of Changes of the GOS on the GDPS


EXECUTIVE SUMMARY

Status of Russian Federation RAOBS and Impact of Loss

The number of sites in the Russian Federation between 60E and 175E taking upper air observations using radiosondes (RAOBs) with data available at the U.S. National Centers for Environmental Prediction (US-NCEP) decreased from 76 stations in January 1994 to 23 stations in December 1999.

For the region 40E to 130E, The Met. Office (UK) found a decrease from 196 to 65 stations for the same time period. Large gaps in the network appeared in the northern and central eastern regions of the Russian Federation. In general, the southern portion of the RAOB network has been best preserved. The decrease is due to several reasons, including budget restrictions.

Such a loss of some 65% to 70% of the regularly reporting stations in a region where the RAOB network was already sparse would be expected to affect the skill of Numerical Weather Prediction (NWP) forecasts. One would expect that forecast verifications at WMO Centers in Japan, Canada, the United States, the United Kingdom, and France, and the ECMWF would be impacted negatively.

Organization of the Study

The 9th Session of the CBS Working Group on the Global Data Processing System formed a sub-group (November 1997) to assess the impacts of changes to the GOS, convened by Mr. G. Verner (Canadian Meteorological Center). With the restructuring of the CBS in 1998, the work was subsumed into the new Open Program Area Group structure with the task:

"To plan and co-ordinate analyses of the NWP verification exchanged by the GDPS and use them as appropriate to develop methods and procedures for assessing planned and unplanned changes to the GOS on the operation of the GDPS."

The Open Program Area Group on Data Processing and Forecasting Systems formed the Expert Team To Evaluate The Impact Of Changes Of the GOS On GDPS (hereafter termed Expert Team). The first meeting of the Expert Team was held at Meteo-France, Toulouse, France, from 9 to 11 March 2000. The Expert Team has as a major focus the development of methods to assess the impact of changes in the GOS on the GDPS. In particular, the Expert Team was asked to assess the possibility of using verification statistics as an alternative to conducting the more costly and time-consuming Observing System Experiments (OSEs).

The terms of reference for the Expert Team were to provide two deliverables:

1.    A report on the possibility of using NWP verification statistics to assess the impact (positive or negative) of changes to the GOS on the operation of the GDPS, particularly NWP, and

2.    A set of guidelines to be used where action is required to minimize the impact of a loss of observations on the operation of the GDPS.

The guidelines requested were prepared during the March 2000 meeting and became section 4.0 of the final report of that meeting. This document is the report on verification statistics.

Verification Statistics

Forecast verification scores are exchanged routinely between NWP Centres. These verification statistics, such as mean errors and RMS errors, can be calculated as a comparison of the forecasts against observations, or of the forecasts against analyses, all valid at the same time. In either case, verification statistics are available for temperature, height and vector wind, at standard levels, for forecast periods out to at least T+72 hours. Some verification statistics are available for longer forecast periods, for example, out to T+168 hours.

In addition to the mean and RMS errors, verification statistics in the form of Anomaly Correlations can be calculated. The mean is subtracted independently from both the forecast and the verifying analysis, and the correlations are calculated between the two residual fields.
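
In symbols, the scores above can be written compactly as follows (a sketch in our own notation, not drawn from the report: f_i is the forecast, o_i the verifying observation or analysis, a_i the verifying analysis, c_i the mean removed to form anomalies, and N the number of verification points):

    \mathrm{ME} = \frac{1}{N}\sum_{i=1}^{N}(f_i - o_i), \qquad
    \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(f_i - o_i)^2}, \qquad
    \mathrm{AC} = \frac{\sum_{i}(f_i - c_i)(a_i - c_i)}{\sqrt{\sum_{i}(f_i - c_i)^2}\,\sqrt{\sum_{i}(a_i - c_i)^2}}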

Working Hypothesis and Expert Team Approach

The Expert Team adopted the following working hypothesis:

"It is possible to establish a meaningful impact of reductions in the radiosonde network in Russia through evaluation of readily available verification scores of global and/or regional scale operational forecast models"

"Meaningful" implies finding a statistically and meteorologically significant signal above the noise expected from variability in skill as a function of, for example, seasonal trends, circulation regime, predictability considerations, changes in data assimilation systems, model changes, and general data access, availability, and quality control issues.

The Expert Team agreed to examine a limited number of fields as the first test of the hypothesis. These were organized in two sets of data analyses:

1. Verification Against Sondes

The forecast values at various verifying times (e.g., T+48, T+72, and T+168) were compared to the RAOB observations at the same verifying times.

2. Verification Against NWP Analyses

The forecast values at various verifying times (e.g., T+48, T+72, and T+168) were compared to the NWP analyses at the same verifying times.

For both sets of data analyses above, the following guidelines were used:

- The period examined was from January 1994 to December 1999,
- RMS and mean errors of 500 hPa heights and 250 hPa vector winds,
- The areas of concern were Asia, North America, and Europe/North Africa,
- For both 00 and 12 UTC, and
- For forecast periods of T+24, T+72, T+120 and T+168.

Each center participating provided the information extracted from its operational verification databases. These are the data routinely exchanged over the GTS. The participating centers were JMA (Japan), CMC (Canada), NCEP (U.S.), The Met. Office (UK), the European Centre for Medium-Range Weather Forecasts (ECMWF), and Meteo France.

Each center’s data were circulated amongst the participants who provided comments. The final report is based on a consensus of the comments from the Expert Team.

It is important to point out that, given the time constraints, the concept of "meaningful impact", was not explored fully. The examination of the statistics was constrained to the subjective appraisal of the available data. No systematic stratification of data (e.g., by circulation regime) was done, nor were quantitative measures of statistical significance or signal/noise ratios calculated. This may very well have reduced significantly the chances for finding evidence of "meaningful impact" in support of the hypothesis. Nevertheless, it was believed that the effort would be valuable regardless of the result, if for no other reason than to determine what cannot be done to examine impact.

Data Analysis and Results

The following table shows a summary of the kinds of verification statistics available for use in the study. The table is not exhaustive, as there were too many different combinations of fields (wind and temperature), forecast periods out to T+168, and regions (Asia, North America, and Europe) to list them all. The table does indicate, however, the extent of the verification statistics that were used, and the range of models and verification techniques included in the study.

Table: Summary of Verification Statistics Available

UK: Plots twice daily from 1996. Verification vs. sondes: mean and RMS errors for height, temperature and wind; for any standard pressure level; for any CBS area. Correlation statistics using sondes are not available. Verification vs. analyses: same as above.

US: Daily plots from 1997. Verifications for the NCEP MRF model at 500 hPa, for October-March, for North America and Europe. Monthly mean scores for January, February and March, 1996-1998.

Canada: Daily scores from 1997 for North America at 00 and 12 UTC, for 24-, 48- and 72-hour forecasts at 500 and 250 hPa. Proposed that a full set of statistics be exchanged rather than plots. An Excel spreadsheet of 24-, 48- and 72-hour RMS verification scores for the CMC global model over the North American area since 1 January 1994 was sent to all participants, along with 24- and 48-hour scores for the CMC regional model. Also provided tendency correlation scores.

France: Daily scores for the global ARPEGE model at 2.5- and 1.5-degree resolution vs. RAOBS over any WMO domain; ARPEGE initialized analyses; and ECMWF 4DVAR fields over Europe north of 20N.

Japan: RMSE of 500 hPa heights for the JMA global model in January 1999 verified against sonde data. RMSE and mean errors at 500 hPa for Asia, North America and Europe for T+24, T+48 and T+72, March 1996 to June 2000.

ECMWF: For each year, 1995-1999, height and wind verifications (observations vs. first guess) by RAOB station, indicating the number of observations in the data sample. Also an ASCII file of the reporting statistics for each station.

Australia: Provided an analysis of the JMA tabular data; computed mean monthly values and prepared time series.

Verifications Against Sondes

The following figure shows the JMA verification scores in the JMA regional model (48N, 92E; 55N, 170E; 14N, 112E; 18N, 150E) as a time series plotted against the availability at ECMWF of Russian RAOBS at 00 UTC for the period January 1995 to May 2000. The scores are verifications vs. sondes, and are shown as RMS errors (meters) of the 500 hPa height.

The deterioration in the number of the RAOBS is clearly shown, but there is no corresponding deterioration in the scores. The graphs in the figure echo the composite results of verifications vs. sondes for all the centers.

Considering the sum of all of the results, the Expert Team concludes that it was not possible to correlate the reduction in the Russian Federation RAOBS with deterioration in the verification scores when the forecast is compared with observations at the verification time. Therefore, the hypothesis is not proven for the case of verifications vs. sondes.

This result is true for each of the centers participating and for most forecast periods, especially those out to T+120 and T+168. Some possible reasons why such a correlation was not found when there was a significant decrease in the number of RAOB sites are discussed in a subsequent section.

[Figure: JMA regional model verification scores and availability of Russian RAOBs at ECMWF, January 1995 to May 2000]

Verifications Against Analyses

Verifications against analyses were examined from several Centres for 500 hPa geopotential height and 250 hPa vector winds. The following two figures from The Met. Office show that the peak RMS errors for 500 hPa height during the winter decreased by about 70-80 meters over the period 1996 to 1999 for forecast periods of T+120 and T+144. Considering both Anomaly Correlations provided by US-NCEP (subtracting mean values individually from both the forecast and analysis fields, and computing correlations between the two residual fields), and daily variations in verifications provided by CMC, no clear signal was found. Inferences could be made that some deterioration had occurred in some instances, but it was not possible to quantify the impact or to make attribution of the cause.

[Figures: The Met. Office RMS 500 hPa height errors vs. analyses at T+120 and T+144, 1996-1999]

As with the verifications vs. sondes, the Expert Team concludes that it was not possible to correlate the reduction in the Russian Federation RAOBS with any deterioration in the verification scores when the forecast is compared with analyses at the verification time. Therefore, the hypothesis is not proven for the case of verifications vs. analyses.

Summary

1. Given system changes and perceived natural variations in predictability, it is difficult to detect easily any clear signal in the CBS verification statistics, and even more difficult to assign unequivocally any change to a particular cause. That does not mean, however, that there is no signal. Over regions such as Asia and North America there are some indications of degradation in the CBS scores. Further, there is no clear sign of improvement as might have been expected from other observation, analysis scheme and model improvements. An improvement should have been expected, although natural variability is a major factor to consider as well.

2. The lack of impact could mean several things:

a. The Russian Federation has managed the change in the network very well for the purposes of the large-scale NWP. There was no signal to be found because the minimum number of RAOBS needed on the synoptic scale was preserved. (Regional models were not investigated).

b. The verification statistics were not sensitive to the change because other signals were stronger.

c. The methodology used in analyzing the information was incorrect or incomplete. As noted earlier, given the time constraints, the concept of "meaningful impact" was not explored fully: the examination was limited to a subjective appraisal of the available data, with no systematic stratification (e.g., by circulation regime) and no quantitative measures of statistical significance or signal-to-noise ratios. This may well have reduced significantly the chances of finding evidence of "meaningful impact" in support of the hypothesis.

d. Changes in the model physics, data assimilation, and/or mix of observing systems masked the impact of the loss of the RAOB stations. As noted above, the sum of these changes should have resulted in a decrease in the errors. Such a decrease, however, was not generally noted.

e. The verification regimes may not have been well suited or best located to detect any impact. The region of most data loss was mainly north of 50N, whereas the regions considered in the study included a broad latitude range (e.g., Asia, North America) or were centered at lower latitudes (e.g., the JMA RSM region).

3. The large differences in sample size (i.e., number and coverage of RAOB stations) between 1996 and 1999 may have affected the scores. Eliminating the data from stations in regions of high variability may affect the scores, regardless of the reason why the data were not used in the assimilation (e.g., station closure or failure to pass quality control checks).

4. A number of the stations closed had the lowest quality measurements and may have had a negative impact on the verifications. Eliminating those stations could even have improved the verification scores.

5. The CBS-defined statistics for verification against analyses are computed only for the near-hemispheric scale or the broad tropical band. The use of regional models and verifications against analyses over smaller areas might have shown a different picture, but the information needed to examine this aspect was not available.

6. Verification against sondes in the CBS statistics for the Asian region may have been affected by changes in the list of stations included in the verification. The Expert Team did not investigate this aspect.

7. Despite not finding a clear signal in the statistics, some degradation was noted. For example, some of the T+72 and T+120 RMS vector wind results from The Met. Office show a slight deterioration in forecast performance for 1998 and 1999. Meteo France also reported a slight deterioration over the same period. These statements are heavily qualified, however, as changes in such things as predictability due to the assimilation of data could be responsible. Meteo France, for example, had difficulties during this period in the assimilation of satellite winds.

8. There are further indications of degradation in verification statistics additional to those exchanged within the CBS system, such as the peak errors ("poor forecasts") registered in the daily values for the Asia and North America regions, and in a verification area closer to the study area. Again, the results are indicative but not conclusive.

9. The impact of the reduced network is clearly evident, however, in observation increment statistics for the remaining stations. These show that the quality of the first guess fields has deteriorated over the period examined. In the absence of substitute observations, the analyses between the remaining stations probably have deteriorated as well.

Conclusions

  1. The hypothesis was not proved in this study for either the case of verifications vs. sondes, or verifications vs. analyses. Neither was the hypothesis disproved. It is not useful, in the opinion of the Expert Team, to attempt to use a subjective and cursory analysis of CBS verification statistics to examine the impact that the loss of observational data might have on NWP, when these changes are confined geographically or occur over a limited period. The interactions of the elements that contribute to an impact, as well as the factors that influence one’s ability to discern that impact, are much too complex to be examined by this simplistic approach.
  2. The importance of the GOS in NWP performance is not in question. The main question is how one determines the impact of a particular set of observations or system. The GOS has already developed to a point where there are several observing systems providing similar data. Such apparent redundancy, and the many factors that affect NWP performance, make it difficult to find a signal associated with any one system. This is especially true when the studies are subjective and cover a relatively short period of time.
  3. If routine verification scores are, in fact, to be useful in assessing data impacts, the Expert Team concludes that a much more rigorous approach must be used. Such an approach should include, as a minimum, such things as the systematic stratification of data (e.g., by circulation regime), and the calculation of quantitative measures, including statistical significance and signal-to-noise ratios. Further, a more specific focus on particular synoptic situations or severe weather events probably will provide a more valuable measure of impact than verification statistics averaged over large areas and extended periods. The development along these lines of the capability to conduct impact studies based on verification statistics would require putting in place a suitable infrastructure at a designated center or centers. A substantial effort would be required to define the scientific and analytical processes required to build that infrastructure. The Team believes that such an effort is likely to be productive, and it recommends that the matter be pursued further.
  4. The Expert Team recognized that the use of verification statistics, even if found to be useful and feasible, will not answer many of the questions concerning impact of observing systems on the GDPS. OSEs have been used successfully in the past and the Expert Team concludes that they have a definite role to play in future impact studies. The Expert Team, therefore, endorses the NCEP plan to conduct an OSE using its Reanalysis system and data set to test further the hypothesis that the loss of the RAOB data from the Russian Federation does have an impact on the quality of the NWP analyses and forecasts.
  5. The Expert Team further concludes that mechanisms need to be established for more timely assessment of changes in the GOS that might affect the quality of the NWP analyses and forecasts. This requires a proactive approach wherein procedures, infrastructure, and resources are identified and in place to address issues as they arise.
  6. Finally, the Open Program Area Group on Data Processing and Forecasting Systems is invited to review the results of the pending NCEP-OSE, and, as appropriate in light of the results:
    1. Reconsider the potential for using the verification scores in assessing the impact of changes in the GOS, and
    2. Consider recommending any changes in the verification procedures that may assist in the use of the verification scores for assessing impact.

STUDY OF THE IMPACT OF THE LOSS OF

RUSSIAN FEDERATION RAOBS ON NWP VERIFICATION STATISTICS IN THE NORTHERN HEMISPHERE

  1. BACKGROUND FOR THE STUDY

    1.1 Status of Russian Federation RAOBS

    The number of sites in the Russian Federation between 60E and 175E taking upper air observations using radiosondes (RAOBs) with data available at the U.S. National Centers for Environmental Prediction (US-NCEP) decreased from 76 stations in January 1994 to 23 stations in December 1999. (See Figs. 1 and 2.)

      [Figures 1 and 2: Russian Federation RAOB stations with data available at US-NCEP, January 1994 and December 1999]

      For the region 40E to 130E, The Met. Office found a decrease from 196 to 65 stations for the same time period. Large gaps in the network appeared in the northern and central eastern regions of the Russian Federation. It is noteworthy that the spatial density of stations east of 60E was lower than over Europe even before the reductions in the network. In general, the southern portion of the RAOB network has been best preserved. The decrease is due to several reasons, including budget restrictions. During the early part of CY 2000, several of the remaining stations began taking four observations per day. This runs counter to the overall reduction in the number of active RAOB stations, but is unlikely to offset the total loss of data from so many stations.

    1.2 Possible Impact of Reduced RAOBS

    Such a loss of some 65% to 70% of the regularly reporting stations in a region where the RAOB network was already sparse would be expected to affect the skill of Numerical Weather Prediction (NWP) forecasts. One would expect that forecast verifications at WMO Centers in Japan, Canada, the United States, the United Kingdom, and France, and the ECMWF would be impacted negatively. Japan, for example, might experience decreased accuracy in forecasts out to 72 or 120 hours, while the other centers could experience deterioration in longer-term forecasts out to 168 hours, and possibly beyond.

      Of course, the RAOB data are used extensively for purposes other than NWP. The information is used directly at forecast offices for a wide variety of user products including, amongst others, those for aviation, marine, severe weather, agriculture, and general public forecasts. Climate applications and analyses are particularly sensitive to such major changes in the RAOB network, especially in northern regions where the network density was below requirements even before the reductions.

      For all these reasons, then, the change in the Russian Federation RAOB network is considered to be of vital concern to the proper functioning of the Global Observing System (GOS).

    1.3 WMO Response

    One aspect of the 1998 reorganization of the Commission for Basic Systems (ref. 1) was the creation of the Open Program Area Group on Integrated Observing Systems (CBS/OPAG/IOS). At its Second Session (29 November to 3 December 1999) (ref. 2), the CBS/OPAG/IOS Expert Team on Observational Data Requirements of the Global Observing System noted the general deterioration in observing systems. It also noted the impact that this deterioration and other changes in the Global Observing System (GOS) might have on the Global Data Processing System (GDPS). Several mechanisms were suggested to study the impact including Observing System Experiments (OSEs) and Observing System Simulation Experiments (OSSEs). (See Annex 2 for additional information.)

    The 9th Session of the CBS Working Group on the Global Data Processing System formed a sub-group (November 1997) (ref. 3) to assess the impacts of changes to the GOS, convened by Mr. G. Verner (Canadian Meteorological Center). With the restructuring of the CBS in 1998, the work was subsumed into the new Open Program Area Group structure with the task:

    "To plan and co-ordinate analyses of the NWP verification exchanged by the GDPS and use them as appropriate to develop methods and procedures for assessing planned and unplanned changes to the GOS on the operation of the GDPS."

  2. ORGANIZATION OF THE STUDY

    2.1 Terms of Reference for the Expert Team

    The Open Program Area Group on Data Processing and Forecasting Systems formed the Expert Team To Evaluate The Impact Of Changes Of the GOS On GDPS (hereafter termed Expert Team). The first meeting of the Expert Team was held at Meteo-France, Toulouse, France, from 9 to 11 March 2000 (ref. 4). The Expert Team has as a major focus the development of methods to assess the impact of changes in the GOS on the GDPS. In particular, the Expert Team was asked to assess the possibility of using verification statistics as an alternative to conducting the more costly and time-consuming Observing System Experiments (OSEs).

      A significant change in the GOS that provided a case for the Expert Team to examine was the failure of the NOAA-11 TOVS instrument on 26 February 1999. As a consequence of that failure, the number of satellite soundings distributed by the US National Environmental Satellite and Data Information Service (NESDIS) was cut in half. The loss of the TOVS data provided the opportunity to conduct a case study to test the use of verification statistics in lieu of OSEs because the impact of TOVS was well known. The results of the case study were finalized at the March 2000 meeting of the Expert Team. While not conclusive, the results gave support to the possible use of verifications to examine the impact of the loss of the Russian Federation RAOBs. In fact, the reduction in the Russian RAOBS was more typical of the kinds of changes in the GOS that prompted the formation of the Expert Team than was the loss of the NOAA-11 TOVS instrument.

      Initially, the terms of reference for the Expert Team were to provide two deliverables:

      1. A report on the possibility of using NWP verification statistics to assess the impact (positive or negative) of changes to the GOS on the operation of the GDPS, particularly NWP, and

      2. A set of guidelines to be used where action is required to minimize the impact of a loss of observations on the operation of the GDPS.

      The guidelines requested were prepared during the March 2000 meeting and became section 4.0 of the final report. (See Annex 3 for the Expert Team comments on those guidelines.)

      The Expert Team agreed that it would undertake a case study to examine the possible impact of the loss of Russian Federation RAOBs on NWP forecasts, and to provide a report with final recommendations for the next session of CBS in November 2000. It was felt that the degree of loss of RAOB data over Russia would maximize any chance of detecting impacts using routine verification statistics. If no signal were to be detected, then consideration would have to be given to using alternative techniques such as an OSE.

      The membership of the Expert Team and affiliates participating in the study is shown in Annex 1.

    2.2 Possible Studies to Quantify Impact

    The Expert Team discussed several possibilities for examining and quantifying the possible impact. The two deemed most viable were Observing System Experiments (OSEs) and a case study using the verification statistics routinely exchanged between GDPS centers. While the use of verification statistics was adopted as the methodology, and incorporated into the working hypothesis, it is important to consider the relative merits of both possibilities.

      2.2.1 Observing System Experiments (OSEs)

        OSEs are a relatively routine way to examine the impact on (and hence value to) NWP analyses and forecasts of various observing systems and observational data sets. OSEs are also known as "data denial" or "data sensitivity" tests because they address situations where data normally assimilated into the analyses are withdrawn. They are routine in the sense that they have been used by most, if not all, of the major NWP centers for several decades. (See Annex 2 for additional comments on OSEs.)

        The main disadvantage of OSEs, as opposed to verification statistics, is that they are intensive in both human and computing resources. In some instances an infrastructure has been established as a framework for conducting OSEs, which allows results to be obtained with relatively less effort and time; even then, the effort and time required are significant. The relative effort required to conduct an OSE, as opposed to using verification statistics, was a primary motivating factor in using verifications to examine the issue of data impact.

      2.2.2 Verification Statistics

        Forecast verification scores are exchanged routinely between NWP Centres. These verification statistics, such as mean errors and RMS errors, can be calculated as a comparison of the forecasts against observations, or of the forecasts against analyses, all valid at the same time. In either case, verification statistics are available for temperature, height and vector wind, at standard levels, for forecast periods out to at least T+72 hours. Some verification statistics are available for longer forecast periods, for example, out to T+168 hours.

        In addition to the mean and RMS errors, verification statistics in the form of Anomaly Correlations can be calculated. The mean is subtracted independently from both the forecast and the verifying analysis, and the correlations are calculated between the two residual fields.
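
        As an illustration only, the following minimal sketch (in Python, with synthetic data; the array names are ours and this is not any centre's operational code) computes the mean error, the RMS error, and the Anomaly Correlation from paired forecast and verifying fields:

            import numpy as np

            def mean_error(forecast, verifying):
                # Mean (systematic) error of the forecast vs. observations or analyses.
                return np.mean(forecast - verifying)

            def rms_error(forecast, verifying):
                # Root-mean-square error of the forecast vs. observations or analyses.
                return np.sqrt(np.mean((forecast - verifying) ** 2))

            def anomaly_correlation(forecast, analysis, climatology):
                # The mean (climatology) is subtracted independently from the forecast
                # and the analysis, and the correlation is computed between the two
                # residual fields.
                f_anom = forecast - climatology
                a_anom = analysis - climatology
                return np.sum(f_anom * a_anom) / np.sqrt(
                    np.sum(f_anom ** 2) * np.sum(a_anom ** 2))

            # Synthetic example: 500 hPa height values (meters) at the verifying time.
            rng = np.random.default_rng(0)
            climatology = 5500.0 + rng.normal(0.0, 50.0, 1000)
            analysis = climatology + rng.normal(0.0, 30.0, 1000)
            forecast = analysis + rng.normal(5.0, 40.0, 1000)  # biased, noisy forecast

            print(f"ME   = {mean_error(forecast, analysis):6.2f} m")
            print(f"RMSE = {rms_error(forecast, analysis):6.2f} m")
            print(f"AC   = {anomaly_correlation(forecast, analysis, climatology):6.3f}")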

      2.2.3 Factors to Consider

A number of factors were identified by the Expert Team in their March 2000 report that could impact the results of either OSEs or the use of routine verification statistics.

  1. The changes in observations must be significant in geographical or vertical coverage, and/or in volume for a signal to be detected. Conclusive results usually cannot be obtained when the change in observations is small, or the period of verification is short.
  2. Natural changes in variability (i.e., predictability) can substantially affect the verification scores. The El Nino-Southern Oscillation is one such example.
  3. The performance of any NWP system is affected by factors such as changes at any specific centre in the data assimilation, analysis and prediction components. Of course, these components differ from center to center, so the results can be highly system dependent.
  4. Models do not remain static, and did not during the period covered by the study.
  5. The NWP centers differ in the types and quantity of observations available and/or used.
  6. There are differences in computing the verification scores.
  7. Case studies of impact should consider both verification against observations, and verification against analyses. For the latter case, the analyses should be those of the particular center to ensure that the centre’s forecast remains coupled with that center’s depiction of truth. Further, verification against analyses may not be meaningful for regions where there are few observations, because the model forecast is essentially verifying against itself.
    2.3 Expectations from Earlier Studies

      Table 1 summarizes the impact from different observation types over the northern and southern hemisphere extra-tropics and the tropics. The value (hours) given for each observation type represents results from a number of recent studies as reported at the 2nd COSNA Coordinating Group (CGC) and WMO Workshop on the Impact of Various Observing Systems on NWP, Toulouse, 6-8 March 2000 (ref. 5). The results are given in terms of maximum gain in large-scale forecast skill at short and medium range. The gain represents the change in skill when observations from a particular system are added to all the other observations routinely used in the data assimilation. The number of observing systems used routinely varies from centre to centre. The more advanced assimilation systems use more of the advanced observing types, which may reduce the relative impact of a new observation type (e.g., scatterometer data, which appear with a smaller impact in the table below than in earlier versions of the table).

      The table is meant only to be a rough guide. The magnitudes of impact depend, for example, on the model and assimilation scheme used and the variable being examined (e.g., temperature). Generalization of the values, therefore, must be treated with caution. It is clear, however, that some observation types (e.g., sondes and TOVS) stand out as being especially valuable.

      Table 1: CURRENT CONTRIBUTIONS OF SOME PARTS OF THE EXISTING OBSERVING SYSTEMS TO THE LARGE-SCALE FORECAST SKILL AT SHORT AND MEDIUM RANGE

      (Values are the maximum gain in large-scale forecast skill, in hours.)

      Platform   Northern Hemisphere      Tropics                  Southern Hemisphere
      (A)TOVS    12 (a)                   12                       24
      CMW        6                        6 (c)                    6
      SCAT       Neutral to a few hours   Neutral to a few hours   6
      SSM/I      Neutral to a few hours   Neutral to a few hours   6
      SONDES     24                       12                       6
      AIRCRAFT   6 (b)                    Neutral to a few hours   Neutral to a few hours
      BUOYS      Neutral to a few hours   Neutral to a few hours   6

      (a) The (A) TOVS impact has been found to be much larger when the radiances are assimilated directly rather than using retrieved profiles that simulate a RAOB.

      (b) Locally, the impact can be much larger than the value given in the table as that value represents a mean over the entire region. The main reason is that the data are not distributed evenly throughout any one region due to factors such as the distribution of clouds and land vs. ocean.

      (c) To judge the impact of cloud motion winds (CMW) in the tropics, verifications of forecasts against observations are needed, as the analyses are strongly affected by the use or omission of CMW. The CMW impact is highly variable with the geographical area and in the vertical, especially in the tropics.

      Given the results in table 1, there is every expectation that the loss of large numbers of RAOBS in one area would have an impact on the NWP analyses and forecasts. A priori, one might expect that impact would be reflected in the verification statistics. A posteriori, as will be shown, that was not the case for this study.

    2.4 Working Hypothesis

      With the discussion in the previous sections and the March 2000 report as background, the Expert Team adopted the following working hypothesis:

      "It is possible to establish a meaningful impact of reductions in the radiosonde network in Russia through evaluation of readily available verification scores of global and/or regional scale operational forecast models"

      "Meaningful" implies finding a statistically and meteorologically significant signal above the noise expected from variability in skill as a function of, for example, seasonal trends, circulation regime, predictability considerations, changes in data assimilation systems, model changes, and general data access, availability, and quality control issues.

    2.5 Expert Team Approach

Because of the time and resources needed for an OSE, the Expert Team at its March 2000 meeting had agreed that the first step would be to determine whether the use of verification statistics was viable as a technique to determine the impact on NWP of lost Russian Federation RAOB data. If it were found to be viable, then the technique could be applied to other situations in the GOS where data had been lost, or were about to be lost. If not, then other techniques would have to be considered.

The Expert Team agreed to examine a limited number of fields as the first test of the hypothesis. These were organized in two sets of data analyses:

1. Verification Against Sondes: The forecast values at various verifying times (e.g., T+48, T+72, and T+168) were compared to the RAOB observations at the same verifying times.

2. Verification Against NWP Analyses: The forecast values at various verifying times (e.g., T+48, T+72, and T+168) were compared to the NWP analyses at the same verifying times.

For both sets of data analyses above, the following guidelines were used:

- The period examined was from January 1994 to December 1999,

- RMS and mean errors of 500 hPa heights and 250 hPa vector winds,

- The areas of concern were Asia, North America, and Europe/North Africa,

- For both 00 and 12 UTC, and

- For forecast periods of T+24, T+72, T+120 and T+168.

Each center participating provided the information extracted from its operational verification databases. These are the data routinely exchanged over the GTS. The participating centers were JMA (Japan), CMC (Canada), NCEP (U.S.), The Met. Office (UK), the European Centre for Medium-Range Weather Forecasts (ECMWF), and Meteo France.

Each center’s data were circulated amongst the participants who provided comments. The final report is based on a consensus of the comments from the Expert Team.

It is important to point out that, given the time constraints, the concept of "meaningful impact", was not explored fully. The examination of the statistics was constrained to the subjective appraisal of the available data. No systematic stratification of data (e.g., by circulation regime) was done, nor were quantitative measures of statistical significance or signal/noise ratios calculated. This may very well have reduced significantly the chances for finding evidence of "meaningful impact" in support of the hypothesis. Nevertheless, it was believed that the effort would be valuable regardless of the result, if for no other reason than to determine what cannot be done to examine impact.
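
By way of illustration only, the following minimal sketch (in Python, with entirely synthetic monthly series standing in for the real scores and station counts; scipy is assumed to be available) indicates the kind of quantitative check that was not performed: deseasonalize a score series and correlate it with the station-count series, obtaining a significance level:

    import numpy as np
    from scipy.stats import pearsonr

    # Synthetic monthly series for January 1994 - December 1999 (72 months):
    # a declining RAOB station count, and a T+72 RMSE series (meters) with
    # an annual cycle but no trend.
    months = 72
    rng = np.random.default_rng(1)
    stations = np.linspace(76, 23, months) + rng.normal(0, 3, months)
    rmse = 45 + 8 * np.cos(2 * np.pi * np.arange(months) / 12) + rng.normal(0, 3, months)

    # Remove the seasonal cycle before testing for a signal, e.g. by subtracting
    # the mean of each calendar month (the series starts in January).
    monthly_mean = np.array([rmse[m::12].mean() for m in range(12)])
    rmse_deseasonalized = rmse - np.tile(monthly_mean, months // 12)

    r, p = pearsonr(stations, rmse_deseasonalized)
    print(f"correlation r = {r:+.2f}, p-value = {p:.3f}")

A defensible study along these lines would also need to account for assimilation and model changes over the period, and for serial correlation in the residuals.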

  3. DATA ANALYSIS AND RESULTS

    3.1 Available Data Sets

      3.1.1 Availability of Russian Federation RAOBS

        Both ECMWF and NCEP (see Figs. 1, 2 and 3) provided graphical distributions of the RAOB stations whose data were being received. As noted at the beginning of the report, the UK and NCEP statistics both showed that the number of stations decreased by between 65% and 70%, although each covered a slightly different region.

        Meteo France provided tabular information for 1999 on the availability for 3DVAR data assimilation of RAOB data from the Russian Federation (combined for European and Asian Russia). At 00 UTC, the number of stations providing data ranged between 51 (September) and 74 (April). At 12 UTC, the number ranged from 23 (August) to 35 (April). During 1999, Meteo France found a general decrease, with an increase after October 1999. Data from about 35 more stations were being assimilated into the 3DVAR in April 2000 than had been in October 1999.

        Figure 3 shows the geographical distribution between 60E and 130E of stations received at ECMWF for January 2000. Fig. 4 shows the frequency of receipt at ECMWF for the period 1995-2000. In January 1995 at 00 UTC, ECMWF found 69 stations active, with 60 reporting more than 23 times during the month. By December 1999, those numbers had dropped to 26 active stations, with 23 reporting more than 23 times during the month. Thus, ECMWF also reported a drop in the range of 65% to 70%.

        [Figure 3: geographical distribution (60E-130E) of RAOB stations received at ECMWF, January 2000. Figure 4: frequency of receipt at ECMWF, 1995-2000]

        ECMWF and NCEP had nearly identical tallies, both lower overall than the others; The Met. Office and Meteo France were in general agreement with each other. The Expert Team did not think the differences in availability were significant, and did not examine them closely. Meteo France and The Met. Office, however, included more of the European area of the Federation than did NCEP. The largest gaps appeared in the Asian portion, thus giving NCEP the lower absolute tally. All, however, showed a dramatic loss of RAOB stations over the five-year period that exceeded 65%.

      3.1.2 Verification Statistics

      CBS verification statistics were provided by JMA, CMC, NCEP, The Met. Office, and ECMWF. Figs. 5 and 6 are examples of a combined set of verification statistics provided by The Met. Office, in this case for 500 hPa heights. The y-axis represents the RMS errors in meters. Figs. 7 and 8 are similar statistics for 850 hPa temperatures (Kelvin).

      [Figures 5 and 6: Verification statistics WMO/CBS Asia list: RMS errors of predictions vs. observations for 500 hPa heights at T+24 and T+120 for main Northern Hemisphere Centres 1994-2000]

      Fig. 7 Verification statistics WMO/CBS Asia list: RMS errors of predictions vs. observations for 850 hPa temperatures at T+24 for main Northern Hemisphere Centres 1994-2000

       

      Fig. 8 Verification statistics WMO/CBS Asia list: RMS errors of predictions vs. observations for 850 hPa temperature at T+120 for main Northern Hemisphere Centres 1994-2000


      Table 2 shows a summary of the kinds of verification statistics available for use in the study. The table is not exhaustive, as there were too many different combinations of fields (wind and temperature), forecast periods out to T+168, and regions (Asia, North America, and Europe) to list them all. The table does indicate, however, the extent of the verification statistics that were used, and the range of models and verification techniques included in the study.

       

      Table 2: Summary of Verification Statistics Available

      UK: Plots twice daily from 1996. Verification vs. sondes: mean and RMS errors for height, temperature and wind; for any standard pressure level; for any CBS area. Correlation statistics using sondes are not available. Verification vs. analyses: same as above.

      US: Daily plots from 1997. Verifications for the NCEP MRF model at 500 hPa, for October-March, for North America and Europe. Monthly mean scores for January, February and March, 1996-1998.

      Canada: Daily scores from 1997 for North America at 00 and 12 UTC, for 24-, 48- and 72-hour forecasts at 500 and 250 hPa. Proposed that a full set of statistics be exchanged rather than plots. An Excel spreadsheet of 24-, 48- and 72-hour RMS verification scores for the CMC global model over the North American area since 1 January 1994 was sent to all participants, along with 24- and 48-hour scores for the CMC regional model. Also provided tendency correlation scores.

      France: Daily scores for the global ARPEGE model at 2.5- and 1.5-degree resolution vs. RAOBS over any WMO domain; ARPEGE initialized analyses; and ECMWF 4DVAR fields over Europe north of 20N.

      Japan: RMSE of 500 hPa heights for the JMA global model in January 1999 verified against sonde data. RMSE and mean errors at 500 hPa for Asia, North America and Europe for T+24, T+48 and T+72, March 1996 to June 2000.

      ECMWF: For each year, 1995-1999, height and wind verifications (observations vs. first guess) by RAOB station, indicating the number of observations in the data sample. Also an ASCII file of the reporting statistics for each station.

      Australia: Provided an analysis of the JMA tabular data; computed mean monthly values and prepared time series.

    3.2 Analysis of Verification Scores

      3.2.1 Concept

      1. Verification Against Sondes: The forecast values at various verifying times (e.g., T+48, T+72, and T+168) were compared to the RAOB observations at the same verifying times.

      2. Verification Against NWP Analyses: The forecast values at various verifying times (e.g., T+48, T+72, and T+168) were compared to the NWP analyses at the same verifying times.

        For both sets of data analyses above, the following guidelines were used:

        - The period examined was from January 1994 to December 1999,

        - RMS and mean errors of 500 hPa heights and 250 hPa vector winds,

        - The areas of concern were Asia, North America, and Europe/North Africa,

        - For both 00 and 12 UTC, and

        - For forecast periods of T+24, T+72, T+120 and T+168.

      3.2.2 Verifications Against Sondes

        Fig. 9 shows the JMA verification scores in the JMA regional model (48N, 92E; 55N, 170E; 14N, 112E; 18N, 150E) as a time series plotted against the availability at ECMWF of Russian RAOBS at 00 UTC for the period January 1995 to May 2000. The scores are verifications vs. sondes, and are shown as RMS errors (meters) of the 500 hPa height.

        The Regional Spectral Model (RSM) region from JMA provides a verification area close to and mainly downstream of the study area. This is not an ideal "downstream" area but prior OSEs suggest that there should be a larger impact in this region than further downstream.

        [Figure 9: JMA regional model RMS 500 hPa height errors vs. sondes, and availability of Russian RAOBs at ECMWF, January 1995 to May 2000]

        The deterioration in the number of the RAOBS is clearly shown, but there is no corresponding deterioration in the scores. Fig. 9 echoes the composite results of verifications vs. sondes for all the centers as shown in Figs. 5-8. By contrast, Fig. 5 shows some improvement in the scores (decrease in error) at T+24 for all the centers. Because the study dealt with global models, this decrease at the shorter time period is most likely due to many factors other than the loss of the Russian RAOBS. With the possible exception of the area near Japan, any impact would be expected at the longer forecast periods. At T+120, however, there appears to be no detectable signal, and the slight changes are considered to be within the noise level (see Figs. 6 and 8).

        The time series (Fig. 9) shows a general downward trend in the T+24 hour errors since 1996. However, the longer forecast intervals do not show such a clear improvement. In fact, the "winter" errors for T+72, T+120 and T+168 hours are higher in 1998/99 and 1999/2000, particularly for 1998/99 in the T+72 and T+120 hour forecasts. For instance, the 1998/99 T+72 hour forecasts show seven consecutive months with errors exceeding 50 m, a significant deterioration from the previous two winter seasons.

        There are significant variations in scores between centers at any given time. Yet the shape of the time series in Figs. 5-8 (and Fig. 9) is essentially the same for each center, although the absolute values do vary. The graphical results shown in Figs. 5-8, while only a sample, are typical of all the verification scores examined.

        Fig. 10 shows the monthly average RMS verification scores for the 500 hPa geopotential height predictions from the ECMWF system, verified against sondes over the North American region for T+48, T+72 and T+120 hours. As is typical with most verification scores, interpretation of any changes in verifications is complicated by major system changes. In the case of ECMWF, a major change to the analysis scheme for 4DVAR was made in November 1997. There have been other changes since then, particularly in the analysis of satellite radiances.

        In the period since this change, however, the errors in the T+72 and T+120 hour forecasts for the winters of 1998/1999 and 1999/2000 are higher than those for 1997/1998, with peak values larger by almost 10 meters. Also notable is that the T+120 hour error for the Northern summer of 1999 is higher than in any year going back to 1995. The significance of these statistics is not clear; they are suggestive, therefore, but not conclusive. Their interpretation also typifies the problem of trying to assess the significance of any changes among the many contributing factors, including analysis and model changes, as well as natural variability.

        Team members did note a few instances of deterioration in the verification scores. For example, some of the T+72 and T+120 RMS vector wind results from The Met. Office show a slight deterioration in forecast performance for 1998 and 1999. Meteo France also reported a slight deterioration over the same period. These statements are heavily qualified, however, as changes in such things as predictability due to the assimilation of data could be responsible. Meteo France, for example, had difficulties during this period in the assimilation of satellite winds.

        Considering the sum of all of the results, the Expert Team concludes that it was not possible to correlate the reduction in the Russian Federation RAOBS with a deterioration in the verification scores when the forecast is compared with observations at the verification time. Therefore, the hypothesis is not proven for the case of verifications vs. sondes.

        This result is true for each of the centers participating and for most forecast periods, especially those out to T+120 and T+168. Some possible reasons why such a correlation was not found when there was a significant decrease in the number of RAOB sites are discussed in a subsequent section.

      3.2.3 Verifications Against NWP Analyses

        Figs. 11 and 12 show the scores for verifications vs. analyses from The Met. Office for the second half of 1996 and the second half of 1999. (Note that the Y-axis scales differ between the two figures: for 1996 they run from 0 to 200 meters in increments of 10; for 1999, from 20 to 140 meters in increments of 5.) The inter-weekly variations in the T+144 scores at 500 hPa for the Northern Hemisphere are about 80-90 meters for November and December 1996, and about 50-60 meters for the same period in 1999.

        [Figures 11 and 12: The Met. Office verifications vs. analyses, 500 hPa height RMS errors, second half of 1996 and second half of 1999]

        Figures 13 and 14 show similar statistics for the RMS vector wind error (verifications vs. analyses) at 250 hPa for the same periods in 1996 and 1999. (Note that the Y-axis scale for 1996 ranges from 15-30 m/s in 1 m/s increments, while that for 1999 ranges from 10-35 m/s in 1 m/s increments.) There are larger peak values in the 1999 scores, while the RMS errors at the beginning of 1999 are less than those at the beginning of 1996. These two figures suggest that there has been no clear deterioration between 1996 and 1999 despite the large loss of RAOB stations.

        Another quantity widely used to verify forecasts is the Anomaly Correlation (AC). NCEP provided AC scores for the 500 hPa level for the winter months of 1996, 1997 and 1998 for both the T+120 and T+168 forecast periods. (See Figs. 15 and 16.) For T+120, the three years track closely, except for the large negative spike in February 1996, before the RAOB stations were closed. For T+168, the period between mid-January and mid-March 1998 is anomalously high, but this may be due to the better predictability of the 1997-1998 El Nino event.

        As with the verifications vs. sondes, the Expert Team concludes that it was not possible to correlate the reduction in the Russian Federation RAOBS with any deterioration in the verification scores when the forecast is compared with analyses at the verification time. Therefore, the hypothesis is not proven for the case of verifications vs. analyses.

        [Figures 15 and 16: NCEP 500 hPa Anomaly Correlation scores for the winter months of 1996, 1997 and 1998, at T+120 and T+168]

      3.2.4 Impact on Analyses

    Using ECMWF observation increment statistics (i.e., the difference between the observation and the first guess field), the Expert Team examined geopotential height and vector wind first guess errors with respect to the several inland stations which remained after the decrease, and which had become isolated (see Table 3). Station 28445 (Verhnoe Dubrovo) in western Russia is included as a control as it had several stations nearby, and is downwind of the dense network over Europe.

    While Table 3 contains only a small selection of stations, the statistics generally show that the observation increments (observation minus first guess at specific stations) are higher in the latter years than in 1995. There was a relatively low number of RAOBS in 1996, and that year shows relatively high increments in general. These statistics indicate that the accuracy of the first guess at the remaining stations has decreased. One consequence is that the remaining observations have a larger impact on the analyses; hence, those remaining stations have become more critical to protect from further reductions. Further, in the absence of substitute observations (the Expert Team assumed that there were none), the quality of the first guess between the remaining stations has been adversely affected as well. This suggests, in turn, that the short-term forecasts should have been affected, but the Expert Team did not have any relevant model statistics to verify this view.

    Table 3: Average Annual 500-100 hPa Increments for Geopotential Height (Z, m) and Vector Wind (m/s) Differences (Selected Stations)

    Year  Time   24959 Jakutsk         24125 Olenek          20674 Ostrov Dickson
          (UTC)  62 01N, 129 43E       68 30N, 112 26E       73 30N, 80 24E
                 Z (m)   Wind (m/s)    Z (m)   Wind (m/s)    Z (m)   Wind (m/s)
    1995  00     17.2    3.6           20.2    3.7           24.1    3.5
          12     15.7    3.5           19.5    3.7           26.0    3.6
    1996  00     18.6    3.8           17.6    4.0           37.7    4.1
          12     19.1    3.7           19.4    4.1           39.3    4.0
    1997  00     19.0    3.9           21.1    4.1           34.0    4.7
          12     16.1    3.8           19.3    4.3           -       -
    1998  00     18.8    3.7           22.8    4.2           30.0    4.2
          12     19.1    4.1           20.2    4.0           29.9    4.2
    1999  00     -       -             -       -             -       -
          12     20.0    4.3           30.2    4.0           41.1    4.6

    (Table 3 continued)

    Year  Time   23330 Salehard        28445 Verhnee Dubrovo
          (UTC)  66 32N, 66 40E        56 44N, 61 04E
                 Z (m)   Wind (m/s)    Z (m)   Wind (m/s)
    1995  00     18.9    3.9           16.9    3.9
          12     16.8    4.0           19.1    4.2
    1996  00     19.4    3.8           17.8    4.1
          12     19.6    3.7           -       -
    1997  00     21.6    4.2           17.0    4.4
          12     21.4    3.9           -       -
    1998  00     20.8    4.1           16.2    4.0
          12     30.2    3.8           -       -
    1999  00     21.0    3.9           16.8    4.2
          12     -       -             -       -
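
    As an illustration of how entries like those in Table 3 might be computed, the following Python sketch averages the first guess minus observation differences by station, year and observation hour. The record layout, and the use of mean absolute height differences and mean vector wind differences, are assumptions made for illustration; the actual statistics come from the ECMWF operational assimilation suite.

        from collections import defaultdict

        def mean_increments(records):
            # records: iterable of tuples (station, year, hour, z_fg, z_obs,
            # u_fg, u_obs, v_fg, v_obs), one per reporting level in the
            # 500-100 hPa layer (hypothetical layout).
            acc = defaultdict(lambda: [0.0, 0.0, 0])
            for station, year, hour, z_fg, z_obs, u_fg, u_obs, v_fg, v_obs in records:
                key = (station, year, hour)
                acc[key][0] += abs(z_fg - z_obs)                # height (m)
                acc[key][1] += ((u_fg - u_obs) ** 2
                                + (v_fg - v_obs) ** 2) ** 0.5   # wind (m/s)
                acc[key][2] += 1
            return {key: (z_sum / n, w_sum / n)
                    for key, (z_sum, w_sum, n) in acc.items()}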

     

    3.2.5 Daily Variability

    Graphs of daily verification statistics were provided by CMC, JMA, and The Met. Office. These provide a different view of any potential impact than do the CBS verification statistics.

    CMC

    Fig. 17 shows the daily RMS errors for the CMC global model over North America at T+24 and T+48.

    CMC further provided the peak values of the RMS error for 500-hPa geopotential over North America. In the regional model at T+48 hours, these are higher for the winter of 1999/2000 than in the previous three years, except for a single spike in 1997/98 (see Fig. 18). In the global model at T+72 hours, the peak errors in winter 1999/2000 are generally the highest since 1995, again apart from a single spike in 1996/97. The 1998/99 winter figures are also generally higher than those for 1997/98. Similar comments apply to the tendency correlation at T+48 and T+72 hours (but not at T+24 hours).
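
    The peak-value comparison described above amounts to taking, for each winter season, the maximum of a daily score series. A minimal Python sketch, assuming a simple date-indexed series and a December-through-February season definition (both assumptions for illustration):

        def winter_peaks(daily_scores):
            # daily_scores: dict mapping datetime.date -> daily RMS error (m).
            # A winter season label such as "1999/2000" covers December of
            # the first year through February of the second.
            peaks = {}
            for day, score in daily_scores.items():
                if day.month == 12:
                    season = "%d/%d" % (day.year, day.year + 1)
                elif day.month in (1, 2):
                    season = "%d/%d" % (day.year - 1, day.year)
                else:
                    continue
                peaks[season] = max(score, peaks.get(season, float("-inf")))
            return peaks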


    JMA

    Similar comments can be made on the basis of the daily statistics for the CBS Asian region and the RSM region supplied by JMA. Although both the mean and peak errors in the T+24 hour forecasts decrease with time, the peak errors at T+72, T+120 and T+168 hours show higher values in the later years. Comparable graphs for the Northern Hemisphere supplied by The Met. Office, however, do not show this pattern.

  3.3 Summary

    1. Given system changes and perceived natural variations in predictability, it is difficult to detect any clear signal in the CBS verification statistics, and even more difficult to assign any change unequivocally to a particular cause. That does not mean, however, that there is no signal. Over regions such as Asia and North America there are some indications of degradation in the CBS scores. Further, there is no clear sign of the improvement that might have been expected from other observation, analysis scheme and model changes, although natural variability is a major factor to consider as well.

    One should not be totally surprised by these results given the earlier difficulty of detecting clearly the impact caused by the demise of NOAA-11 through the use of verification scores.

    2. The lack of impact could mean several things:

    a. The Russian Federation has managed the change in the network very well for the purposes of the large-scale NWP. There was no signal to be found because the minimum number of RAOBS needed on the synoptic scale was preserved. (Regional models were not investigated).

    b. The verification statistics were not sensitive to the change because other signals were stronger.

    c. The methodology used to analyze the information was incorrect or incomplete. It is important to point out that, given the time constraints, the concept of "meaningful impact" was not explored fully. The examination of the statistics was limited to a subjective appraisal of the available data. No systematic stratification of the data (e.g., by circulation regime) was done, nor were quantitative measures of statistical significance or signal-to-noise ratios calculated. This may well have reduced significantly the chances of finding evidence of "meaningful impact" in support of the hypothesis. Nevertheless, the effort was believed to be valuable regardless of the result, if for no other reason than to determine what cannot be done to examine impact.

    d. Changes in the model physics, data assimilation, and/or mix of observing systems masked the impact of the loss of the RAOB stations. As noted above, the sum of these changes should have resulted in a decrease in the errors. Such a decrease, however, was not generally noted.

    e. The verification regimes may not have been well suited or best located to detect any impact. The region of most data loss was mainly north of 50N, whereas the regions considered in the study included a broad latitude range (e.g., Asia, North America) or were centered at lower latitudes (e.g., the JMA RSM region).

    3. The large differences in sample size (i.e., number and coverage of RAOB stations) between 1996 and 1999 may have affected the scores. Eliminating the data from stations in regions of high variability may affect the scores, regardless of the reason why the data were not used in the assimilation (e.g., station closure or failure to pass quality control checks).

    4. A number of the closed stations had the lowest-quality measurements and may have been having a negative impact on the verifications. Eliminating those stations could therefore have improved the verification scores.

    5. The CBS-defined statistics for verification against analyses are computed only for the near-hemispheric scale or the broad tropical band. The use of regional models and verification against analyses over smaller areas might have shown a different picture, but information was not available to examine this aspect.

    6. Verification against sondes in the CBS statistics for the Asian region may have been affected by changes in the list of stations included in the verification. The Expert Team did not investigate this aspect.

    7. Despite the absence of a clear signal in the statistics, some degradation was noted. For example, some of the T+72 and T+120 RMS and RMSE vector wind results from The Met. Office show a slight deterioration in forecast performance for 1998 and 1999. France also reported a slight decrease over the same period. These statements are heavily qualified, however, as changes in such factors as predictability due to the assimilation of data could be responsible. Meteo-France, for example, had difficulties during this period with the assimilation of satellite winds.

    8. There are further indications of degradation in verification statistics beyond those exchanged within the CBS system, such as the peak errors ("poor forecasts") seen in the daily values for the Asia and North America regions and in a verification area closer to the study area. Again, the results are indicative but not conclusive.

    9. The impact of the reduced network is clearly evident, however, in observation increment statistics for the remaining stations. These show that the quality of the first guess fields has deteriorated over the period examined. In the absence of substitute observations, the analyses between the remaining stations also have deteriorated.

  4. Conclusions
  1. The hypothesis (see section 2.4) was not proved in this study for either the case of verifications vs. sondes or that of verifications vs. analyses. Neither was the hypothesis disproved. It is not useful, in the opinion of the Expert Team, to attempt to use a subjective and cursory analysis of CBS verification statistics to examine the impact that the loss of observational data might have on NWP when the changes are confined geographically or occur over a limited period. The interactions of the elements that contribute to an impact, as well as the factors that influence one's ability to discern that impact, are much too complex to be examined by this simplistic approach.
  2. The importance of the GOS in NWP performance is not in question. The main question is how one determines the impact of a particular set of observations or system. The GOS has already developed to a point where there are several observing systems providing similar data. Such apparent redundancy and the many factors that affect NWP performance make it difficult to find a signal associated with any one system. This is especially true when the studies are subjective and cover a relatively short period of time.
  3. If routine verification scores are, in fact, to be useful in assessing data impacts, the Expert Team concludes that a much more rigorous approach must be used. Such an approach should include, as a minimum, the systematic stratification of data (e.g., by circulation regime) and the calculation of quantitative measures, including statistical significance and signal-to-noise ratios (a schematic example is given at the end of this section). Further, a more specific focus on particular synoptic situations or severe weather events will probably provide a more valuable measure of impact than verification statistics averaged over large areas and extended periods. The development along these lines of the capability to conduct impact studies based on verification statistics would require putting in place a suitable infrastructure at a designated center or centers. A substantial effort would be required to define the scientific and analytical processes needed to build that infrastructure. The Team believes that there is considerable expectation that such an effort would be productive, and it recommends that the matter be pursued further.
  4. The Expert Team recognized that the use of verification statistics, even if found to be useful and feasible, will not answer many of the questions concerning impact of observing systems on the GDPS. OSEs have been used successfully in the past and the Expert Team concludes that they have a definite role to play in future impact studies. The Expert Team, therefore, endorses the NCEP plan to conduct an OSE using its Reanalysis system and data set to test further the hypothesis that the loss of the RAOB data from the Russian Federation does have an impact on the quality of the NWP analyses and forecasts.
  5. The Expert Team further concludes that mechanisms need to be established for more timely assessment of changes in the GOS that might affect the quality of the NWP analyses and forecasts. This requires a proactive approach wherein procedures, infrastructure, and resources are identified and in place to address issues as they arise.
  6. Finally, the Open Program Area Group on Data Processing and Forecasting Systems is invited to review the results of the pending NCEP OSE and, as appropriate in light of the results:

    1. Reconsider the potential for using the verification scores in assessing the impact of changes in the GOS, and

    2. Consider recommending any changes in the verification procedures that may assist in the use of the verification scores for assessing impact.
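
    As a schematic of the kind of quantitative measure called for in conclusion 3, the following Python sketch compares two periods of daily verification scores, returning the mean change, a Welch-type signal-to-noise ratio, and an approximate two-sided p-value. It deliberately ignores the serial correlation of daily scores, which a real study would have to account for, and all names are illustrative.

        import math

        def score_change(scores_before, scores_after):
            # scores_before/scores_after: lists of daily verification
            # scores from the periods before and after a network change.
            def mean(x):
                return sum(x) / len(x)
            def var(x):
                m = mean(x)
                return sum((v - m) ** 2 for v in x) / (len(x) - 1)
            signal = mean(scores_after) - mean(scores_before)
            noise = math.sqrt(var(scores_after) / len(scores_after)
                              + var(scores_before) / len(scores_before))
            z = signal / noise
            # Two-sided p-value from the normal approximation:
            # p = 2 * (1 - Phi(|z|)) = 1 - erf(|z| / sqrt(2)).
            p = 1.0 - math.erf(abs(z) / math.sqrt(2.0))
            return signal, z, p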

Annex 1

EXPERT TEAM MEMBERS AND AFFILIATES PARTICIPATING IN THE STUDY

 

Expert Team Members

AUSTRALIA Mr. Terry Hart, (Chair)

Bureau of Meteorology

GPO Box 1289K

MELBOURNE VIC 3001

Australia

Tel: (613) 9669 4030

Fax: (613) 9662 1222

Email: t.hart@bom.gov.au

CANADA Mr. Gilles Verner

Canadian Meteorological Centre

2121 Trans-Canada Highway

DORVAL QUEBEC

Canada H9P 1J3

Tel: (1 514) 421 4624

Fax: (1 514) 421 4657

Email: gilles.verner@ec.gc.ca

FRANCE Mr. Bruno Lacroix

Meteo-France

42 Avenue Coriolis

31057 TOULOUSE CEDEX

France

Tel: (33 5) 6107 8270

Fax: (33 5) 6107 8209

Email: bruno.lacroix@meteo.fr

JAPAN Mr. Nobutaka Mannoji

Japan Meteorological Agency

1-3-4 Otemachi Chiyoda-ku

TOKYO 100

Japan

Tel: (813) 3212 8341 ext. 3320

Fax: (813) 3211 8407

Email: nmannoji@npd.kishou.go.jp

UNITED KINGDOM Mr. Mike Bader

The Met. Office

London Road, Bracknell

BERKSHIRE RG12 2SZ

United Kingdom

Tel: (44 1344) 856 424

Fax: (44 1344) 854 026

Email: mjbader@meto.gov.uk

Mr. Richard Dumelow

The Met. Office

London Road, Bracknell

BERKSHIRE RG12 2SZ

United Kingdom

Tel: (44 1344) 854 489

Email: rdumelow@meto.gov.uk

USA Dr Steve Tracton

National Weather Service NOAA

5200 Auth Road

CAMP SPRINGS MD 20446

USA

Tel: (1 301) 763 8000 ext 7222

Fax: (1 301) 763 8545

Email: steve.tracton@noaa.gov

ECMWF Mr. Antonio Garcia-Mendez

ECMWF

Shinfield Park

READING BERKSHIRE RG2 9AX

United Kingdom

Tel: (44 118) 949 9424

Fax: (44 118) 986 9450

Email: a.garcia@ecmwf.int

Affiliates Participating In The Study

Mr. Simon Fuller

The Met. Office

London Road, Bracknell

BERKSHIRE RG12 2SZ

United Kingdom

Tel: (44) (0) 1344 856446

Email: cfver@meteo.gov.uk

Dr. David Forrester

The Met. Office

London Road, Bracknell

BERKSHIRE RG12 2SZ

United Kingdom

Email: daforrester@meteo.gov.uk

Mr. Morrison Malki

Chief, Data-processing System Division

World Weather Watch Department

World Meteorological Organization

Tel: (4122) 73 08 231

Fax: (4122) 73 08 021

Email: mmalki@www.wmo.ch

Mr. James Giraytys

WMO Consultant

301 Longview Lane

Winchester, VA 22602

USA

Tel: + 540-678-8633

Email: giraytys@shentel.net

 

Annex 2

FURTHER COMMENTS ON OBSERVING SYSTEM EXPERIMENTS (OSES)

OSEs are a relatively routine way to examine the impact on (and hence value to) NWP analyses and forecasts of various observing systems and observational data sets. OSEs are also known as "data denial" or "data sensitivity" tests because they address situations where data normally assimilated into the analyses are withdrawn.

Using operational data assimilation techniques and operational models, data from specific systems and/or observational sites are systematically removed from the data assimilation in a series of post-operational "runs". Each OSE run includes at least the data assimilation, first guess, and analysis phases; it may include the forecast production phase as well. The results of each run are compared with each other and with the operational analyses and forecasts, which serve as the "control" run.
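
Schematically, the evaluation of an OSE reduces to differencing the scores of a denial run and the control run. A minimal Python sketch follows; the callables assimilate, forecast and verify stand in for the operational system components, and everything here is illustrative rather than a description of any centre's actual suite.

    def run_ose(observations, denied_types, assimilate, forecast, verify):
        # Control run: all observations assimilated. Denial run: the
        # listed observation types (e.g. Russian Federation RAOBs)
        # withheld from the assimilation.
        control_analysis = assimilate(observations)
        denial_analysis = assimilate(
            [ob for ob in observations if ob.obs_type not in denied_types])
        # Verify both forecast sets against the control analyses.
        control_scores = verify(forecast(control_analysis), control_analysis)
        denial_scores = verify(forecast(denial_analysis), control_analysis)
        # Positive differences indicate degradation caused by the denial.
        return {score: denial_scores[score] - control_scores[score]
                for score in control_scores}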

Although OSEs have been used for a number of years with success, there are some important constraints on their design. Some factors to consider include the following: the hypothesis to be tested by the OSE must be carefully drawn to isolate specific issues; observing system configurations, or scenarios, need to reflect the issues identified in the hypothesis; the data sets, models and assimilation techniques need to be matched; the criteria for selecting the data to be removed need to be stated carefully to reduce unintentional errors or anomalies; and, similarly, the criteria for evaluating the results need to be drawn carefully to provide a systematic basis for analyzing them. As a general statement, OSEs should always address scenarios of reduced observations in which the observation increments are large enough to produce a signal that can be extracted from the intrinsic noise of the data assimilation system.

Recognizing these constraints, agreed guidelines have been prepared on how to organize and conduct an OSE and on recommended techniques for analyzing the results. It is possible to examine both analyses and forecasts, or one and not the other. The conduct of a valid OSE can be labor and computer intensive. Even when the process and infrastructure for organizing an OSE have been standardized at a particular center, the computer time required to conduct one OSE run is the same as that required for an operational run. Several OSE runs with different data removed multiply the time required, most often on an operational computer that may have only small amounts of time available for such studies. The results, often comprising large volumes of statistics and graphics, normally require some amount of manual interpretation.

The conduct of a valid OSE, then, requires human and computer resources that are not inconsequential.

 

Annex 3

COMMENTS ON EXPERT TEAM MARCH 2000 REPORT

 

1. Notification of Changes

The Expert Team reviewed the recommendations (sections 3.21 to 3.23) of its March 2000 meeting and reiterated the need for the timely provision of details of major changes to NWP systems, including:

(1) Date of change,

(2) Description of change, and

(3) Impact of change on verifications from parallel testing.

2. Extra Data Monitoring

Each monitoring centre should endeavor to provide monitoring information along the lines of that provided by ECMWF. The Lead Centres, therefore, should extend their monitoring procedures to include information about regional variations in data volumes. Such information could include counts by:

(1) WMO block number,

(2) Latitude/longitude box, and

(3) Individual platforms.

These counts should be compared with long-term averages to indicate areas of declining reports; a schematic example is sketched below. For each type of observation, the Lead Centres should consult with other centres and agree on the exact procedures for monitoring, information exchange and alerting Members and WMO about problems.
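
The following Python sketch illustrates the kind of comparison envisaged, counting reports by WMO block number and flagging blocks that fall well below their long-term average. The record layout and the 75% threshold are assumptions made for illustration only.

    from collections import Counter

    def flag_declining_blocks(station_ids, longterm_avg, threshold=0.75):
        # station_ids: one entry per report received in the period, as a
        # five-digit WMO station identifier (e.g. "24959"); the WMO block
        # number is the first two digits.
        # longterm_avg: dict mapping block number -> long-term average
        # report count for a comparable period.
        counts = Counter(sid[:2] for sid in station_ids)
        return {block: (counts.get(block, 0), avg)
                for block, avg in longterm_avg.items()
                if counts.get(block, 0) < threshold * avg}

Counts by latitude/longitude box or by individual platform could be handled analogously.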

3. Results of Verification Study

The Expert Team noted that irrespective of the outcome of the verification case study, it is recommended that (on the basis of monitoring results):

(1) Each relevant centre, Member and the WMO Secretariat should implement the guidelines of actions required to minimize the impact of a loss of observations on the operations of the GDPS.

(2) Members should develop procedures for notifying users in the event that changes to the GOS result in significant degradation of the NWP products.

 

LIST OF ACRONYMS

AC Anomaly Correlations

CBS/OPAG/IOS Commission for Basic Systems Open Program Area Group on the Integrated Observing System

CGC Coordinating Group for COSNA

CMC Canadian Meteorological Centre

COSNA Composite Observing System for the North Atlantic

ECMWF European Centre for Medium-Range Weather Forecasts

GDPS Global Data Processing System

GOS Global Observing System

GTS Global Telecommunications System

JMA Japan Meteorological Agency

NCEP National Centers for Environmental Prediction (United States)

NESDIS National Environmental Satellite, Data, and Information Service (United States)

NOAA National Oceanic and Atmospheric Administration (United States)

NWP Numerical Weather Prediction

OSE Observing System Experiment

OSSE Observing System Simulation Experiment

RAOB Radiosonde Observation

RMS Root Mean Square

RMSE Root Mean Square Error

RSM Regional Spectral Model (Japan)

The Met. Office The Meteorological Office, United Kingdom

TIROS Television Infrared Observation Satellite

TOVS TIROS Operational Vertical Sounder

 

 

REFERENCES

(1) Final Report, CBS-EXT(98), Karlsruhe, Germany, 30 September to 9 October 1998.

(2) Final Report, Expert Team Meeting on Observational Data Requirements and Redesign of the Global Observing System, Geneva, Switzerland, 29 November to 3 December 1999.

(3) Final Report, 9th Session, CBS Working Group on Data Processing, Geneva, Switzerland, 10-14 November 1997.

(4) Final Report, Expert Team to Evaluate the Impact of Changes of the GOS on GDPS, Toulouse, France, 9-11 March 2000.

(5) Final Report, 2nd COSNA Coordinating Group (CGC) and WMO Workshop on the Impact of Various Observing Systems on NWP, Toulouse, France, 6-8 March 2000.
