Quick start¶
The csv2bufr Python module contains both a command line interface and an API to convert data stored in a CSV file to the WMO BUFR data format. For example, the command line interface reads in data from a CSV file, converts it to BUFR and writes out the data to the specified directory. e.g.:
csv2bufr data transform <my-csv-file.csv> \
--bufr-template <csv-to-bufr-mapping.json> \
--output <output-directory-path>
This command is explained in more detail below.
Command line interface¶
The following example transforms the data in file my-csv-file.csv
to BUFR using template csv-to-bufr-mapping.json
and writes the output to directory output-directory-path
:
csv2bufr data transform <my-csv-file.csv> \
--bufr-template <csv-to-bufr-mapping.json> \
--output <output-directory-path>
The command is built on the Python Click module and is formed of
three components (csv2bufr data transform
), 1 arguments and 2 mandatory options (specified by –).
The argument specifies the file to process of the data being processed.
The options specify various configuration files to use.
my-csv-file.csv
: argument specifying the CSV data file to process--bufr-template csv-to-bufr-mapping.json
: option followed by the bufr mapping template to use--output-dir output-directory-path
: option followed by output directory to write BUFR file to. The output filename is set using the md5 checksum of the BUFR data to ensure uniqueness, future versions will use the WIGOS ID and timestamp of the data to set the filename.
The output BUFR files can be validated using a tool such as the ECMWF BUFR validator.
Input CSV file¶
Currently, a single station per file is supported with each row treated as a separate record and one BUFR file per record created. The format of the input CSV file has a few requirements:
A comma (i.e.
,
) must be used as the delimiter.Strings must be quoted.
Missing values must be encoded as “None”.
The final row in the file must contain data and not be a new line.
The timestamp of the records must be separated into components, i.e. year, month, day etc must each be in a separate column.
The date/time elements should be in Universal Time Coordinated (UTC).
The file must contain the WIGOS station identifier
WIGOS Station Identifier¶
Each station must have a WIGOS Station Identifier. More information can be found in the Guide to the WMO Integrated Observing System, section 2 (WMO-No. 1165).
BUFR mapping template (--bufr-template
)¶
The mapping from CSV to BUFR is specified in a JSON file (see the BUFR template mapping page).
API¶
The command line interface uses the transform
function from the csv2bufr module. This can be used directly, e.g.:
# import modules
import json
from csv2bufr import transform
# load data from file
with open("my-csv-file.csv") as fh:
data = fh.read()
# load mapping
with open("csv-to-bufr-mapping.json") as fh:
mapping = json.load(fh)
# call transform function
result = transform(data, mapping)
# iterate over items
for item in result:
# get id and phenomenon time to use in output filename
wsid = item["_meta"]["wigos_station_identifier"] # WIGOS station ID
geometry = item["_meta"]["geometry"] # GeoJSON geometry object
timestamp = item["_meta"]["properties"]["datetime"] # phenomenonTime as datetime object
timestamp = timestamp.strftime("%Y%m%dT%H%MZ") # convert to string
# set filename
output_file = f"{wsid}_{timestamp}.bufr4"
# save to file
with open(output_file, "wb") as fh: # note binary write mode
fh.write(item["bufr4"])
The transform
function returns an iterator that can be used to iterate over each line in the data file.
Each item returned contains a dictionary with the following elements:
item["bufr4"]
binary BUFR dataitem["_meta"]
GeoJSON dictionary containing metadata elementsitem["_meta"]["id"]
identifier for result (set combination ofwigos_station_identifier
anddatetime
)item["_meta"]["geometry"]
GeoJSON geometry object of location of dataitem["_meta"]["properties"]
key/value pairs of properties/attributesitem["_meta"]["properties"]["md5"]
the md5 checksum of the encoded BUFR dataitem["_meta"]["properties"]["wigos_station_identifier"]
WIGOS station identifieritem["_meta"]["properties"]["datetime"]
characteristic date of data contained in result (from BUFR)item["_meta"]["properties"]["originating_centre"]
originating centre for data (from BUFR)item["_meta"]["properties"]["data_category"]
data category (from BUFR)