Anatomy of a BUFR4 message¶
In order to understand the mapping between a CSV file and its BUFR encoding it is first helpful to understand the anatomy of a BUFR message. A BUFR message is encoded in binary and contains 6 sections as shown in the diagram below. The ecCodes keys for the different elements are also shown as these are used to set certain elements (highlighted in red) as part of the conversion to BUFR. The non-highlighted elements (including those without an ecCodes key) are either set by the eccodes module based on the data, have a default value or can be omitted / set to missing.
The information cotained in Sections 0, 1, and 3 are essentially metadata specifying:
the version of the BUFR tables used (editionNumber, masterTableNumber, masterTableVersionNumber);
where the data have come from (bufrHeaderCentre, bufrHeaderSubCentre);
the typical time of the observation (typicalYear … typicalSecond);
the type of data and what parameters are included (dataCategory, internationalDataSubCategory and unexpandedDescriptors).
These are all specified in the BUFR template mapping file.
The edition number and master table number should be 4 and 0 respectively.
BUFR edition 4 is the latest version and whilst it is possible to define new BUFR tables for different application
areas only Master Table 0 (MT0) has been defined (meteorology).
The tables in MT0 are updated approximately every 6 months as part of the WMO fast track process and the authoritative
source can be found in the WMO Manual on Codes, Volume I.2.
Updates to this can be delayed and the latest version, including machine readable tables, can be found at
https://github.com/wmo-im/BUFR4/releases and it is recommended to use the the most recent release. The master
branch
contains the current working copy of the tables and is subject to change.
The BUFR header centre and bufr header sub centre are specified in Common Code Tables C-11 and C-12 respectively. The typical time of observation (typicalYear … typicalSecond) should be determined based on the data to be encoded. Within the csv2bufr module and CLI only a single observation / weather report is encoded per file and so these should be set to those columns in the CSV specifying the year, month, day etc. More information is provided in the page on the BUFR template mapping (link to follow).
The data category should be set according to BUFR Table A, i.e. 0 for “Surface data - land” and 1 for “Surface data - sea”. The international data sub category should be set according to Common Code Table C-13. The unexpanded descriptors specifies the data to be encoded in the data section is comprised of a list of the BUFR descriptors. These descriptors are detailed further on the next page (BUFR descriptors).