EXPERT TEAM ON INTEGRATED DATA MANAGEMENT
FINAL REPORT
SHINFIELD PARK, READING, 13 - 16 MAY 2002
Recommendations of working groups shall have no status within the Organization until they have been approved by the responsible constituent body. In the case of joint working groups the recommendations must be concurred with by the presidents of the constituent bodies concerned before being submitted to the designated constituent body.
In the case of a recommendation made by a working group between sessions of the responsible constituent body, either in a session of a working group or by correspondence, the president of the body may, as an exceptional measure, approve the recommendation on behalf of the constituent body when the matter is, in his opinion, urgent, and does not appear to imply new obligations for Members. He may then submit this recommendation for adoption by the Executive Council or to the President of the Organization for action in accordance with Regulation 9(5).
List of participants
ORGANIZATION OF THE MEETING
1.1 Opening remarks
1.2 Adoption of the agenda
1.3 Working arrangements
REVIEW WMO CORE METADATA PROFILE DEVELOPED AT THE FIRST MEETING
DEFINE EXTENSIONS NEEDED FOR ISO CODE LISTS
FINALISE XML SCHEMA, REPRESENTATION AND EXAMPLES FOR THE WMO CORE METADATA PROFILE
FINALISE LIST OF KEYWORDS TO DESCRIBE WMO DATASETS
DEVELOP RECOMMENDATIONS ON COMPREHENSIVE WMO METADATA STANDARDS
DISCUSS REQUIRED MODIFICATIONS TO THE GUIDE ON WWW DATA MANAGEMENT
FUTURE WORK PROGRAMME
CLOSURE OF THE MEETING
The second meeting of the CBS Expert Team on Integrated Data Management was held 13 to 16 May 2002 at the Met Office Training College in Shinfield Park, Reading, UK.
The team finalized the proposal for a "WMO Core Metadata" profile within the context of the ISO Standard for Geographic Metadata (ISO 19115). This core provides a general definition for directory searches and exchange that should be applicable to a wide variety of WMO datasets. It does not specify how these metadata should be archived or presented to users and does not specify any particular implementation.
The core elements define a minimum set of information required to exchange data for WMO purposes and are not exhaustive. To fully meet the requirements of all WMO Programmes for metadata, application of far more comprehensive standards would be required. The team felt that development of a comprehensive WMO metadata standard would be a difficult, lengthy and expensive undertaking, and that the potential benefits of such a standard would be very limited and would not justify the large commitment of resources that would be required. It suggested that each WMO Programme use the WMO Core Metadata as a starting point to develop more detailed metadata standards in response to its own requirements. These more-detailed programme-specific standards should, to the extent possible, be based on the ISO standard with any necessary extensions. Reliance on the ISO standard as a common starting point would reduce the effort required by the Programmes and would greatly enhance the compatibility between the various Programme-specific standards and with the WMO Core Metadata standard.
At its first meeting the team noted that all of the WMO core items could be accommodated within the draft ISO standard but that some WMO extensions to the ISO code lists might be required. Upon further examination the team determined that a few minor extensions were needed in order for the ISO code lists to meet WMO requirements. The proposed extensions are described in this report.
There are many possible ways of representing WMO metadata and the team recommended that XML be adopted as the common language (or format) for exchange. To ensure interoperability, the experts developed a framework, as an XML Schema, for mapping the proposed metadata standard into XML.
The team also developed a number of examples to illustrate some implementations of the standard. The examples are provided in structured text format as well as XML.
The team carefully reviewed the existing WMO Guide on WWW Data Management and discussed the requirements for its revision as well as the effort that would be required to bring it up to date. It determined that some sections of the Guide are seriously out of date and in major need of revision. The team felt that the Guide was probably not worth the effort required to bring it up to date and keep it up to date as an entire package. Instead, the experts recommended that the Guide be considered primarily as an on-line document with updates applied chapter by chapter as requirements and advances in technology dictate. Furthermore, the chapters concerned with the most rapidly changing fields, such as computer graphics, should be removed or replaced with references to existing on-line authorities on these topics.
1. ORGANIZATION OF THE MEETING (agenda item 1)
1.1 Opening remarks
1.1.1 The second meeting of the CBS Expert Team on Integrated Data Management was held 13 to 16 May 2002 at the Met Office Training College in Shinfield Park, Reading, UK. Since Mr S. Foreman (UK), chair of the expert team, was not able to attend, Mr Gil Ross (UK) chaired the meeting. He also opened the meeting and welcomed the participants to the Training College and the Reading area.
1.2 Adoption of the agenda
1.2.1 The experts adopted the agenda as reproduced at the beginning of this report.
2. REVIEW WMO CORE METADATA PROFILE DEVELOPED AT THE FIRST MEETING
2.1 Recalling discussions from its first meeting, the expert team noted that metadata means different things to different people. In general, it is the descriptive data necessary to allow us to find, process and use data, information and products. While metadata generally can describe products, services and software as well as data at different stages of manipulation, it can also be a specification. Metadata can be extensive and all-inclusive, or it can be specific to a more limited function.
2.2 WMO Programmes and Members currently maintain a tremendous volume and variety of metadata. However, little of this is in a standardized form that could be used to find data (so-called discovery-level metadata).
2.3 The team has been tasked to provide a metadata framework which at the highest level is applicable to all WMO tasks, but which can be extended to specific areas and to new ventures in an acceptable and standardized form. This would not be a metadata repository of the all-inclusive form, but the basis of a description which can be extended by users to cover their own unique applications. Thus, the team concentrated on a "core" set of metadata that could fulfil the requirements for discovery-level metadata while allowing for expansion and extension to meet more specific requirements.
2.4 At its first meeting the team agreed that the ISO Geographic Metadata Standard (ISO 19115) provided the best framework for development of such a standard. It specifies a process where a community can adopt parts of the standard that it feels are relevant (including the "Core Elements") and also extend the elements, keywords and code table instances to suit that community. The team agreed that there should be a Community Core Profile which could be adopted by all of WMO, with the potential for further extensions under ISO 19115 Annex C where necessary. With this process in mind the team developed a "WMO Core Metadata" profile, which is described in the report of the first meeting.
2.5 The team noted that the WMO Core Metadata provides a general definition for directory searches and exchange that should be applicable to a wide variety of WMO datasets. It does not specify how these metadata should be archived or presented to users and does not specify any particular implementation.
2.6 The core elements listed define a minimum set of information required to exchange data for WMO purposes and are not exhaustive. To fully meet the requirements of all WMO Programmes for metadata, application of far more comprehensive standards would be required, as noted in section 6 below.
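As a rough illustration of what a minimum discovery-level record might look like in practice, the following Python sketch models a record with a few mandatory fields and some optional ones. The field names are assumptions, loosely modelled on the kind of core elements found in ISO 19115 (title, reference date, abstract, point of contact); the authoritative list is the WMO Core Metadata profile itself, not this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class CoreRecord:
    """Hypothetical discovery-level record; field names are illustrative only."""
    title: str
    reference_date: str            # ISO 8601 date, e.g. "2002-05-16"
    abstract: str
    contact_organization: str
    topic_category: str
    language: str = "en"
    keywords: List[str] = field(default_factory=list)
    # (west, east, south, north) in degrees; optional
    bounding_box: Optional[Tuple[float, float, float, float]] = None

    def missing_mandatory(self) -> List[str]:
        """Return the names of mandatory fields that are empty."""
        mandatory = ["title", "reference_date", "abstract",
                     "contact_organization", "topic_category"]
        return [name for name in mandatory if not getattr(self, name).strip()]


record = CoreRecord(
    title="Example gridded temperature analysis",
    reference_date="2002-05-16",
    abstract="",                                   # deliberately left empty
    contact_organization="Example Met Service",    # made-up organization
    topic_category="climatologyMeteorologyAtmosphere",
)
print(record.missing_mandatory())
```

The optional fields can be left at their defaults, which mirrors the idea that the core is a minimum set that individual Programmes may extend.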
2.7 The team reviewed the WMO Core Metadata profile and suggested a few minor modifications. Most importantly, the team improved the presentation of the WMO Core Metadata profile to make it easier to understand. The latest version of the WMO Core Metadata profile is provided in the annex to this paragraph.
3. EXTENSIONS NEEDED FOR ISO CODE LISTS
3.1 At its first meeting the team noted that all of the WMO core items could be accommodated within the draft ISO standard but that some WMO extensions to the ISO code lists might be required. Upon further examination the team determined that a few extensions were indeed needed in order for the ISO code lists to meet WMO requirements. The proposed extensions are described in the annex to this paragraph.
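The idea of extending an ISO code list for a community can be sketched in a few lines of Python. The base values below are those of the ISO 19115 keyword type code list (MD_KeywordTypeCode); the extension value is purely hypothetical and stands in for whatever the annex actually proposes.

```python
# Values of the ISO 19115 keyword type code list (MD_KeywordTypeCode).
ISO_KEYWORD_TYPES = frozenset({"discipline", "place", "stratum", "temporal", "theme"})

# Hypothetical community extension; the real WMO extensions are in the annex.
WMO_EXTENSIONS = frozenset({"exampleWmoType"})

# The community code list is the ISO list plus the community additions.
WMO_KEYWORD_TYPES = ISO_KEYWORD_TYPES | WMO_EXTENSIONS


def is_valid_code(code: str, code_list=WMO_KEYWORD_TYPES) -> bool:
    """True if the code belongs to the given code list."""
    return code in code_list


print(is_valid_code("theme"))                                 # ISO value
print(is_valid_code("exampleWmoType"))                        # extension value
print(is_valid_code("exampleWmoType", ISO_KEYWORD_TYPES))     # not in plain ISO
```

Because the extended list is a superset of the ISO list, any record that is valid against the ISO list remains valid against the community list.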
4. XML SCHEMA, REPRESENTATION AND EXAMPLES FOR THE WMO CORE METADATA PROFILE
4.1 XML is rapidly becoming a standard for exchanging information between applications, as well as for providing information on which the formatting of data for display in a browser may be defined. Industry standards are being defined to allow the exchange of information between applications using the XML standard, with the expectation that many business transactions will use XML as their standard means of data exchange.
4.2 There are many possible ways of representing WMO metadata in XML. To ensure interoperability the experts developed a framework, as an XML Schema, for mapping the proposed metadata standard into XML. The team developed a proposed XML schema and accompanying code list for the WMO Core Metadata as given in the annex to this paragraph.
4.3 It should be noted that, although the team recommends that XML be used as the language (or format) for exchange of the WMO Core Metadata, the standard itself is quite general and does NOT depend upon XML for its implementation.
4.4 The team also developed a number of examples to illustrate some implementations of the standard, which are given in the annex to this paragraph. Examples are provided in structured text format as well as XML.
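To illustrate the point that the same record can be expressed as structured text or as XML, the sketch below renders one record both ways using only the Python standard library. The element names and the root tag are assumptions made for the example and are not taken from the WMO schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; keys are illustrative, not the WMO element names.
record = {
    "title": "Example GRIB gridded dataset",
    "referenceDate": "2002-05-16",
    "abstract": "Illustrative record only.",
}


def to_structured_text(rec: dict) -> str:
    """Render the record as one 'name: value' line per element."""
    return "\n".join(f"{key}: {value}" for key, value in rec.items())


def to_xml(rec: dict, root_tag: str = "metadata") -> str:
    """Render the record as a flat XML document."""
    root = ET.Element(root_tag)
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


print(to_structured_text(record))
print(to_xml(record))
```

Both renderings carry the same information; only the exchange format differs, which is consistent with the standard not depending on XML for its implementation.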
4.5 The expert team, considering how the WMO Core Metadata standard could be implemented, was pleased to note a presentation of the EUMETNET Programme UNIDART (Uniform Data Request Interface). The main goal of the UNIDART project is the development of a web-based information system that allows uniform and integrated access to heterogeneous and distributed data sources, storing meteorological data and products. The UNIDART system could be seen as a broker that provides a request/reply facility to its users. The figure below shows the principal architecture of the system. Further information about the project can be found at http://www.dwd.de/UNIDART.
4.6 In order to connect users to data providers, the data providers must agree on a standard format for the exchange of answers to the user requests. The UNIDART project will consider the WMO Core Metadata profile that has been developed by the team as a candidate for such a standard exchange format.
4.7 The expert team agreed that the UNIDART project provides a good opportunity for a trial implementation of the WMO Core Metadata, which could provide valuable feedback to its further refinement. The experts hoped that the project team of UNIDART would keep the ET informed concerning the progress of the project and its experiences while implementing and using the standard.
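The broker idea described for UNIDART, a single request/reply facility in front of several heterogeneous data providers, might be sketched as follows. This is not the UNIDART design; the class, the provider names and the record layout are all hypothetical, and the point is only that a common reply format lets the broker merge answers from different sources.

```python
from typing import Callable, Dict, List

# A provider is any callable that takes a keyword and returns matching records.
Provider = Callable[[str], List[dict]]


class Broker:
    """Hypothetical broker that forwards one request to every provider."""

    def __init__(self) -> None:
        self._providers: Dict[str, Provider] = {}

    def register(self, name: str, provider: Provider) -> None:
        self._providers[name] = provider

    def request(self, keyword: str) -> List[dict]:
        replies = []
        for name, provider in self._providers.items():
            for rec in provider(keyword):
                # Tag each reply with its source so users can trace it.
                replies.append({**rec, "source": name})
        return replies


def make_provider(records: List[dict]) -> Provider:
    """Build an in-memory provider that matches on a record's keyword list."""
    def provider(keyword: str) -> List[dict]:
        return [r for r in records if keyword in r["keywords"]]
    return provider


broker = Broker()
broker.register("service_a", make_provider(
    [{"title": "Surface temperature analysis", "keywords": ["temperature"]}]))
broker.register("service_b", make_provider(
    [{"title": "Upper-air temperature", "keywords": ["temperature"]},
     {"title": "Rain gauge totals", "keywords": ["precipitation"]}]))

for reply in broker.request("temperature"):
    print(reply["source"], reply["title"])
```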
Report of the ETDR&C
4.8 The ET considered the recent report from Charles Sanders, Australia, to the meeting of the Expert Team on Data Representation and Codes (Prague 22-25 April 2002), and the recommendations and observations made by the ETDR&C in their report. The Team concurred with the ETDR&C recommendations, and wished to make some additional observations and suggestions.
4.9 The team observed that although the set of XML protocols and standards was still evolving, a considerable amount of utility and functionality had already been developed. There were many tools and toolkits on the market to support XML technologies, supplied by both Open Source and proprietary software vendors.
4.10 There were a large number of applied XML languages for specific purposes and the paper by Mr Sanders listed a number of meteorological variants. WMO throughout its history has developed protocols, codes and procedures for the interchange of meteorological data. These protocols, codes and procedures are metadata and their development and maintenance are primary functions of WMO. The team strongly agreed with the ETDR&C that WMO must not allow control of meteorological metadata standards to become fragmented or to become the subject of rival formats and conventions, or in the worst case, perhaps to become the subject of commercial patents.
4.11 The team emphatically supported the arguments of the ETDR&C that internationalisation of the codes should remain a commitment of WMO. There are mechanisms in XML which can be used to permit multiple language versions (even with multiple character sets) of XML tags. The ET observed that while for documents, the internationalization of the tags is a minor part of the task, for data representation, the tags give identification of the element content and the ability to transform the language is very important.
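The report does not say which XML mechanism the team had in mind for multilingual tags. As one simple possibility, the Python sketch below renames the tags of a document using a translation table while leaving the element content untouched. The tag names and the French translations are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical English-to-French tag translations.
TAG_FR = {"metadata": "métadonnées", "title": "titre", "abstract": "résumé"}


def translate_tags(elem: ET.Element, mapping: dict) -> ET.Element:
    """Return a copy of the tree with tags renamed; content is unchanged."""
    new = ET.Element(mapping.get(elem.tag, elem.tag), dict(elem.attrib))
    new.text, new.tail = elem.text, elem.tail
    for child in elem:
        new.append(translate_tags(child, mapping))
    return new


english = ET.fromstring(
    "<metadata><title>Analysis</title><abstract>Example</abstract></metadata>")
french = translate_tags(english, TAG_FR)
print(ET.tostring(french, encoding="unicode"))
```

Tags with no entry in the table are kept as they are, so a partial translation table still yields a well-formed document.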
4.12 Much is made of XML being "human-readable". In practice humans do not read XML; computers do. XML is intended to be processed by applications which render the content of the XML elements in human-readable form. After all, humans do not normally read HTML tags in documents: the tags are suppressed and only control how the content is displayed. For XML markup of data, as described above, the identification of the element content is crucial.
4.13 The utility of XML is to group, describe, identify and structure, or "markup", parts of documents and data. XML Schemas describe shared vocabularies and allow computers to carry out rules that people have defined in the schema. XML Schemas are one mechanism to describe markup tags and the allowable syntax and data types in an XML document - the rules. Where we talk about an XML vocabulary, we mean a specific set of tags and structures defined by an XML Schema.
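To make the notion of "rules that people have defined" concrete, the sketch below checks a document against a small rule table: which elements are required and what type their content must have. It is not XML Schema validation (the Python standard library has no XML Schema validator); it only imitates, under assumed element names, the kind of check a schema allows a computer to carry out.

```python
import xml.etree.ElementTree as ET
from datetime import date


def is_iso_date(text: str) -> bool:
    """True if text is a valid calendar date in YYYY-MM-DD form."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


# Hypothetical rules: tag -> (required?, content check).
RULES = {
    "title": (True, lambda t: bool(t.strip())),
    "referenceDate": (True, is_iso_date),
    "abstract": (False, lambda t: True),
}


def check(xml_text: str) -> list:
    """Return a list of rule violations found in the document."""
    root = ET.fromstring(xml_text)
    errors = []
    for tag, (required, ok) in RULES.items():
        child = root.find(tag)
        if child is None:
            if required:
                errors.append(f"missing <{tag}>")
        elif not ok(child.text or ""):
            errors.append(f"invalid <{tag}>")
    for child in root:
        if child.tag not in RULES:
            errors.append(f"unexpected <{child.tag}>")
    return errors


print(check("<metadata><title>T</title>"
            "<referenceDate>2002-13-40</referenceDate><foo/></metadata>"))
```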
4.14 The approach to XML vocabularies that the team has taken is that there should be a cascade, a hierarchy of vocabularies, of which the discovery vocabulary and code tables are at the top. Other discovery vocabularies, which are extensions of the discovery metadata, will incorporate or "inherit" the Core Schema terms. Product vocabularies which include the metadata will also refer to the Core schema as well as the schemas developed for the product. The XML schema for the WMO Core Community Profile discovery metadata already works this way, in that the vocabulary references the code-list vocabulary.
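The cascade of vocabularies can be pictured as a chain in which each vocabulary adds its own terms and inherits those above it. The following Python sketch models that chain; the vocabulary names and terms are hypothetical and are not taken from the WMO schema.

```python
class Vocabulary:
    """A set of terms that inherits every term of its parent vocabulary."""

    def __init__(self, name, terms, parent=None):
        self.name = name
        self.own_terms = set(terms)
        self.parent = parent

    def terms(self):
        """All terms visible in this vocabulary, including inherited ones."""
        inherited = self.parent.terms() if self.parent else set()
        return inherited | self.own_terms

    def defined_in(self, term):
        """Name of the nearest vocabulary in the chain that defines the term."""
        vocab = self
        while vocab is not None:
            if term in vocab.own_terms:
                return vocab.name
            vocab = vocab.parent
        return None


# Hypothetical cascade: code lists -> core discovery -> programme -> product.
code_lists = Vocabulary("code-lists", {"keywordType"})
core = Vocabulary("core", {"title", "abstract"}, parent=code_lists)
programme = Vocabulary("programme-extension", {"instrumentType"}, parent=core)
product = Vocabulary("product", {"gridSpacing"}, parent=programme)

print(sorted(product.terms()))
print(product.defined_in("title"))
```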
4.15 In principle, discovery metadata should be completed as far as is possible. Optional items should be included whenever they are relevant. For those items which are fairly static, (e.g. organization address), there is a mechanism in XML which allows an XML document to contain a pointer to another XML document, and this second document could include the basic organization particulars.
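One mechanism of this kind is XInclude, although the report does not say which mechanism the team had in mind. The sketch below uses the XInclude support in the Python standard library to pull static organization particulars into a metadata record. The element names, the href and the organization details are made up, and a small in-memory loader stands in for fetching the second document.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# The second document, holding the fairly static organization particulars.
ORG_XML = ("<organization><name>Example Met Service</name>"
           "<city>Exampleville</city></organization>")

# The metadata record points at the second document instead of repeating it.
RECORD_XML = """
<metadata xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Example dataset</title>
  <xi:include href="organization.xml"/>
</metadata>
"""


def loader(href, parse, encoding=None):
    """Stand-in loader: resolves the single hypothetical href from memory."""
    if href == "organization.xml" and parse == "xml":
        return ET.fromstring(ORG_XML)
    return None


record = ET.fromstring(RECORD_XML)
ElementInclude.include(record, loader=loader)   # expands xi:include in place
print(record.findtext("organization/name"))
```

Keeping the organization details in one place means that a change of address needs to be made only once, however many records refer to it.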
Uses of meteorological XML data
4.16 The Team agreed with the ETDR&C that fully XML coded observations are unlikely to replace WMO codes for international exchange. Instead, they will be used to generate a variety of products that make use of different vocabularies for markup of XML documents to control how these data are processed and displayed.
4.17 XML is not used alone. Currently most browsers will only display an XML file as raw text, or will try to interpret the tags as HTML and display nothing, because XML cannot be directly interpreted as HTML. There are, however, languages such as XSLT (Extensible Stylesheet Language Transformations) which, with a minimum of instructions, will transform the XML document into, for example, an HTML or XHTML file which can be displayed. In addition, many current browsers can be configured with a plug-in that applies a default style to an XML document in order to display it. In future these plug-ins are likely to be incorporated into the next release of the browsers, and the user will not need to do any specific set-up task.
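The kind of transformation described here, from an XML record to something a browser can display, can be sketched in plain Python. The Python standard library does not include an XSLT processor, so this example performs an equivalent transformation by hand rather than with a stylesheet; the element names are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape


def record_to_html(xml_text: str) -> str:
    """Turn a flat XML record into an HTML table, one row per element."""
    root = ET.fromstring(xml_text)
    rows = "".join(
        f"<tr><th>{escape(child.tag)}</th><td>{escape(child.text or '')}</td></tr>"
        for child in root
    )
    return f"<table>{rows}</table>"


print(record_to_html(
    "<metadata><title>Wind &amp; waves</title><abstract>Example</abstract></metadata>"))
```

The tags of the XML record become row labels and the element content becomes the cell text, so the reader sees the content without ever seeing the markup.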
4.18 The applications which interpret XML and carry out the defined processing are not particularly difficult to create. Once created (in XSLT, or Java, say) they are intended to be immediately portable and useable on most other machines. This means that WMO members will be able to reuse code for XML in a very straightforward way.
5. KEYWORDS TO DESCRIBE WMO DATASETS
5.1 To facilitate searches for datasets that will meet a given requirement, the proposed WMO Community Core Metadata Profile provides for keywords that describe the dataset. A standard list of keywords could help to achieve the maximum benefit from this provision and could also contribute to development of multi-lingual capabilities. A draft list of WMO keywords was developed via correspondence. The team reviewed this list and added a number of keywords. A copy of the list developed at the meeting is provided in the annex to this paragraph. Please note that since keywords are routinely added to this list, the most recent version is available via the Internet at http://www.wmo.ch/web/www/metadata/WMO-keywords.html.
5.2 Although a standard list of commonly used keywords could facilitate searches, particularly searches extending across datasets described in multiple languages, the group agreed that uncontrolled or open keywords should also be allowed. This would allow data providers to use keywords that may be unique to their own requirements or language without unnecessarily expanding the list of standard WMO keywords.
5.3 Thus, two types of keywords will be allowed:
Standard WMO keywords, which will be listed on the WMO server in English, French, Spanish and Russian and will be included in the WMO Core Metadata XML Schema;
Other keywords, which can be defined by any dataset originator for their own use.
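The distinction between the two types of keywords, and the benefit of a multilingual standard list for searching, can be sketched as follows. The keyword entries and translations below are illustrative only and are not taken from the WMO keyword list.

```python
# Hypothetical excerpt of a standard keyword list: English term -> translations.
STANDARD_KEYWORDS = {
    "temperature": {"fr": "température", "es": "temperatura", "ru": "температура"},
    "precipitation": {"fr": "précipitations", "es": "precipitación", "ru": "осадки"},
}


def classify_keywords(keywords):
    """Split keywords into (standard, other) by membership in the standard list."""
    standard = [k for k in keywords if k.lower() in STANDARD_KEYWORDS]
    other = [k for k in keywords if k.lower() not in STANDARD_KEYWORDS]
    return standard, other


def to_english(keyword, lang):
    """Map a standard keyword given in another language to its English form."""
    for english, translations in STANDARD_KEYWORDS.items():
        if translations.get(lang) == keyword:
            return english
    return None


print(classify_keywords(["Temperature", "my-local-index"]))
print(to_english("осадки", "ru"))
```

A search entered in one language can be mapped to the English form of a standard keyword and matched against records described in any of the listed languages, while open keywords are simply carried through as supplied by the originator.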
6. RECOMMENDATIONS ON COMPREHENSIVE WMO METADATA STANDARDS
6.1 The various Programmes of WMO have a wide range of requirements for documentation of their datasets. Thus, it would be extremely difficult to develop a comprehensive standard for metadata that would meet the needs of all Programmes.
6.2 The team agreed that development of a comprehensive WMO metadata standard would be a difficult, lengthy and expensive undertaking, and that the potential benefits of such a standard would be very limited and would not justify the large commitment of resources that would be required. The team recommended an alternative approach.
6.3 The ET has developed the WMO Community Core Metadata Profile, a subset of the larger and much more comprehensive ISO Metadata standard. It suggested that each WMO Programme use the WMO Core Metadata as a starting point for extension into more detailed metadata standards in response to its own requirements. These more-detailed standards should, to the extent possible, be based on the ISO standard with any necessary extensions. Reliance on the ISO standard as a common starting point would reduce the effort required by the Programmes and would greatly enhance the compatibility between the various Programme-specific standards and with the WMO Core Metadata standard. Furthermore, the team recommended that all WMO Programmes consider using XML as a format for exchanging their metadata.
6.4 The team suggested that consideration be given to translating the WMO BUFR, CREX and GRIB code tables into XML. The UK Met Office was largely successful in demonstrating how to represent the BUFR code tables in XML (see Meteorological Data and XML by Gorman, Kelly, Ryan and Sanders, 2002). It was also suggested that it would be useful to develop a method, perhaps through the use of XML, to link WMO station numbers with station metadata.
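As a rough indication of what a code table held in XML might look like, the sketch below stores one table entry as XML and uses it to convert an encoded integer into a physical value. The XML layout is invented for this example and is not the representation used in the paper cited above. The sample entry is patterned on the BUFR Table B entry for dry-bulb temperature (0 12 001); the values should be treated as illustrative and checked against the official tables.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout for a Table B style entry.
TABLE_B_XML = """
<tableB>
  <entry fxy="012001">
    <name>Temperature/dry-bulb temperature</name>
    <unit>K</unit>
    <scale>1</scale>
    <referenceValue>0</referenceValue>
    <dataWidth>12</dataWidth>
  </entry>
</tableB>
"""


def load_table(xml_text: str) -> dict:
    """Read the entries into a dictionary keyed by descriptor."""
    table = {}
    for entry in ET.fromstring(xml_text).iter("entry"):
        table[entry.get("fxy")] = {
            "name": entry.findtext("name"),
            "unit": entry.findtext("unit"),
            "scale": int(entry.findtext("scale")),
            "reference": int(entry.findtext("referenceValue")),
            "width": int(entry.findtext("dataWidth")),
        }
    return table


def decode(raw: int, descriptor: dict) -> float:
    """Physical value = (encoded integer + reference value) / 10**scale."""
    return (raw + descriptor["reference"]) / 10 ** descriptor["scale"]


table = load_table(TABLE_B_XML)
value = decode(2881, table["012001"])
print(value, table["012001"]["unit"])
```

The same lookup pattern could in principle be applied to a table keyed by WMO station number, with station metadata as the entry content.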
7. REQUIRED MODIFICATIONS TO THE GUIDE ON WWW DATA MANAGEMENT
7.1 CBS requested that the team advise on reorganization of the Guide on WWW Data Management and coordinate the development of the WMO Guide on Data Management, including preparation of the sections relating to the WMO metadata standards. The team carefully reviewed the existing Guide and discussed the requirements for its revision as well as the effort that would be required to bring it up to date.
7.2 The ET determined that some sections of the Guide are seriously out of date and in major need of revision. The document was written in the years preceding 1993. Over the past decade, the wider world of Information Technology has taken enormous strides, and off-the-shelf solutions now dominate all but the most specialized of WMO processes. While some sections are still relevant, many are obsolete. With so much out-of-date material, the Guide cannot be updated as it stands. Instead, it would require a wholesale redesign.
7.3 The experts discussed the target audience for the Guide, how and where it would be used and how it could be kept up to date while dealing with a field as dynamic as information technology. They noted that new documents have recently been written on some topics of particular interest and relevance to WMO, such as binary representation forms and the use of the Internet and that these have been published as separate documents, outside of the Guide. The team agreed that this is probably the preferred approach to keeping the material in the Guide relevant and up to date. If such material were to be included within a revised Guide, some sections would be in danger of becoming obsolete before the revised Guide were even completed.
7.4 Although the Guide contains some sections that deal with the rapidly changing field of information technology, it also contains material of use as general background information or as an introduction to basic data management concepts. Such basic information does not go out of date quickly.
7.5 The team noted the set of requirements that were defined in the Guide and was impressed that they were well written and quite forward looking. In fact, many of the stated requirements remain worthy goals, more than 10 years after they were written.
7.6 With these considerations in mind the team felt that the Guide was probably not worth the effort required to bring it up to date and keep it up to date as an entire printed package. Some sections could remain relevant for many years, while others become obsolete so quickly that a printed copy is of little use. Instead, the experts recommended that the Guide be considered primarily as an on-line document with updates applied chapter by chapter as requirements and advances in technology dictate. Furthermore, the chapters concerned with the most rapidly changing fields, such as computer graphics, should be removed or replaced with references to existing on-line authorities on these topics.
8. FUTURE WORK PROGRAMME
8.1 Having addressed all of the tasks assigned to it by CBS, the team noted that the only outstanding work is the preparation of a document describing the WMO Core Metadata Standard for the consideration of the Commission at its extraordinary session in December 2002. The team felt that the description defined at the meeting is at a level of detail appropriate for consideration by the Commission. Thus, the preparation of the formal document should be a simple and straightforward task.
9. CLOSURE OF THE MEETING
9.1 The meeting closed on Thursday 16 May 2002.
Annex to Paragraph 2.7 - WMO Core Metadata (pdf)
Annex to paragraph 3.1 - Extensions to ISO Code Lists
Annex to Paragraph 4.2
XML Schema Schematic
XML Code List
Annex to Paragraph 4.4 - Examples
2. GRIB gridded dataset
3. Oceanographic gridded dataset
Annex to Paragraph 5.1 - Keywords for Describing WMO Datasets
LIST OF PARTICIPANTS
Steve Foreman, Chair
Met Office
Bracknell, Berkshire RG12 2SZ
Tel: (+44 1344) 854680
Fax: (+44 1344) 856099

Wang Guofu
China Meteorological
46 Zhongguancun Nandajie
Tel: (+86 10) 6840 7485
Fax: (+86 10) 6217 5930

Agriculture and Agri-Food Canada
Eastern Cereal and Oilseed Research Centre
Ottawa, Ontario K1A 0C6
Tel: (+1 613) 7591524
Fax: (+1 613) 7591924

Gil Ross
Met Office
Tel: (+44)(0) 1344 856 973
Fax: (+44)(0) 1344 856 119

Jürgen Seib
Deutscher Wetterdienst
Tel: (+49 69) 80622243
Fax: (+49 69) 80622801

Dr. Alexander A. Zelenko
Hydrometeorological Research Center of the Russian Federation
9-13 Bolshoi Predtechensky pereulok
Tel: (+7 095) 255 2227
Fax: (+7 095) 255 1582

David McGuirk
World Meteorological Organization
7 bis Avenue de la Paix
Case postale No. 2300
CH-1211 Geneva 2
Tel: (+41 22) 730 8241
Fax: (+41 22) 730 8021
1. To be published.