Differentiating BUFR template encoding available on the same topic #110

Open
golfvert opened this issue Apr 5, 2024 · 9 comments

Comments

@golfvert
Contributor

golfvert commented Apr 5, 2024

Chile recently joined WIS2. They have started publishing synop under origin/a/wis2/cl-meteochile/data/core/weather/surface-based-observations/synop

Under that same TH they have both AWS and human-based stations. These therefore use two different BUFR templates: TM307080 and TM307091.

Do we need / want to inform users in the Notification Message itself that on that topic they will receive synops using two different templates? Or do we consider that this will be handled post-download only?

If yes, how should we do this? Using one metadata_id per encoding? Adding "properties.bufr_encoding" as an additional property?

So, first point: is this a problem to tackle in the WNM, yes or no?
If yes, how?
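
To make the two options concrete, here is a rough sketch of what the `properties` part of a notification could look like in each case. This is only an illustration under assumptions: the `data_id` and `metadata_id` values are made up, `bufr_encoding` is the additional property proposed above and is not defined in the WNM, and `make_properties` is a hypothetical helper.

```python
import json


def make_properties(template, option):
    """Build an illustrative WNM-style `properties` fragment for one BUFR granule."""
    props = {
        # Hypothetical identifiers, not taken from any real catalogue record.
        "data_id": f"example/synop/{template}/granule-0001",
        "metadata_id": "urn:wmo:md:cl-meteochile:synop",
    }
    if option == "metadata_id_per_encoding":
        # Option 1: one dataset (one metadata_id) per BUFR template.
        props["metadata_id"] += f"-{template.lower()}"
    elif option == "additional_property":
        # Option 2: a single dataset plus an extra, non-standard property.
        props["bufr_encoding"] = template
    else:
        raise ValueError(f"unknown option: {option}")
    return props


for option in ("metadata_id_per_encoding", "additional_property"):
    for template in ("TM307080", "TM307091"):
        print(option, json.dumps(make_properties(template, option)))
```

With the first option a subscriber distinguishes the templates through the dataset identifier alone; with the second, the subscriber has to know about the extra property.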

@maaikelimper

maaikelimper commented Apr 8, 2024

Hi Rémy, I've discussed this issue with @david-i-berry and @tomkralidis as part of our wis2box-development discussion.

IMHO, users don't just need the ability to differentiate between BUFR encoding templates.
I think users in general need a more fine-grained distinction between the different data published on WIS2 than that which can be provided by the WIS2 topic hierarchy.

I think we should avoid a "wild-west" scenario where different users use different property-types to separate their data.

I would rather recommend that we use the metadata_id and ensure we use a well-defined vocabulary in the metadata-record to allow additional granularity to define a dataset beyond the basic definition of the topic-hierarchy.

Note that in the wis2box users currently ingest data into a directory that is based on the topic-hierarchy. We could alter this to instead allow users to ingest data based on their metadata-id and ensure the metadata-id is added to the data-notification sent by the wis2box.

@golfvert
Contributor Author

golfvert commented Apr 8, 2024

The Topic Hierarchy is not meant to be a fine-grained distinction mechanism.
I think we have a similar issue with client side filtering. The additional properties, as defined in WNM, allow additional details to be provided on the data announced with the WNM.
I don't think we will be able to specify every additional property that is allowed. So, in a sense, a "wild-west" is inevitable and, I think, acceptable. And in any case it is much better than locally created additions to the TH.
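
As a minimal sketch of what client-side filtering on additional properties could look like (assuming the publisher sets a `bufr_encoding` property, which is hypothetical and not defined in the WNM, and with made-up `data_id` values):

```python
def matches(message, wanted):
    """Return True if every wanted key/value pair appears in the message properties."""
    props = message.get("properties", {})
    return all(props.get(key) == value for key, value in wanted.items())


# Illustrative notifications, reduced to the properties relevant here.
messages = [
    {"properties": {"data_id": "a", "bufr_encoding": "TM307080"}},
    {"properties": {"data_id": "b", "bufr_encoding": "TM307091"}},
    {"properties": {"data_id": "c"}},  # publisher set no additional property
]

selected = [
    m["properties"]["data_id"]
    for m in messages
    if matches(m, {"bufr_encoding": "TM307080"})
]
print(selected)
```

Note that a message without the additional property is silently dropped by such a filter, which is the practical cost of properties that are not specified for everyone.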

@kaiwirt

kaiwirt commented Apr 8, 2024

Is this an issue for WIS2 at all? Do we care about the data content?

We have a topic that says "here be synoptic observations". And we have a content-type that says this data item is BUFR. In my opinion, apart from that, decoding of the data is out of scope for WIS2.

@david-i-berry
Member

The problem is that under the "synop" topic we have much more than just synoptic observations. We are mixing sub-hourly and hourly observations with observations at the intermediate and main synoptic hours. The list of available / observed parameters also differs from the traditional synoptic observation and depends on the station type and the BUFR sequence used. Simply stating that the data are in BUFR format isn't really enough to help the end user.

Making the metadata identifier mandatory would address many of the issues.

@tomkralidis
Collaborator

Good discussion. The GDC and its WCMP2 will be the driving component to allow for discovery, evaluation and subscription/access/visualization of data. Yet another reason for properties.metadata_id to be required.

@amilan17
Member

If these are separate datasets, each one can have a distinct "data_id" value that is in the WNM.

@tomkralidis
Collaborator

properties.data_id is at the level of the data granule. At the dataset level, properties.metadata_id will help delineate accordingly.
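
A small sketch of that distinction, with made-up identifiers: each notification carries its own `data_id`, while many notifications share one `metadata_id`. The `group_by_dataset` helper is hypothetical, and messages without a `metadata_id` end up under `None`.

```python
from collections import defaultdict


def group_by_dataset(messages):
    """Group granule identifiers (data_id) by dataset identifier (metadata_id)."""
    datasets = defaultdict(list)
    for message in messages:
        props = message["properties"]
        # metadata_id may be absent; such granules are grouped under None.
        datasets[props.get("metadata_id")].append(props["data_id"])
    return dict(datasets)


# Illustrative notifications with hypothetical identifiers.
messages = [
    {"properties": {"data_id": "g1", "metadata_id": "urn:wmo:md:example:aws"}},
    {"properties": {"data_id": "g2", "metadata_id": "urn:wmo:md:example:aws"}},
    {"properties": {"data_id": "g3", "metadata_id": "urn:wmo:md:example:manual"}},
    {"properties": {"data_id": "g4"}},
]

grouped = group_by_dataset(messages)
print(grouped)
```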

@david-i-berry
Member

At the moment the choice of template / BUFR sequence to use is left to the data provider without regulation, so it is currently a free-for-all. In the WIS2 tech regs do we state that all granules in a dataset shall use the same BUFR sequence? If not, this addition may be a way to start to bring some order.
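
If such a rule existed, checking it could be as simple as the sketch below. It assumes we already know, for each granule, its `metadata_id` and the BUFR sequence it uses (for example from a hypothetical additional property or from decoding the file); the identifiers and the `datasets_mixing_sequences` helper are made up for illustration.

```python
from collections import defaultdict


def datasets_mixing_sequences(granules):
    """Return the datasets whose granules use more than one BUFR sequence."""
    seen = defaultdict(set)
    for metadata_id, sequence in granules:
        seen[metadata_id].add(sequence)
    return {mid: sorted(seqs) for mid, seqs in seen.items() if len(seqs) > 1}


# (metadata_id, BUFR sequence) pairs with hypothetical dataset identifiers.
granules = [
    ("urn:wmo:md:example:synop", "TM307080"),
    ("urn:wmo:md:example:synop", "TM307091"),
    ("urn:wmo:md:example:aws-only", "TM307091"),
]

mixed = datasets_mixing_sequences(granules)
print(mixed)
```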

@amilan17
Member

> In the WIS2 tech regs do we state that all granules in a dataset shall use the same BUFR sequence?

I don't think so. There is some guidance in the WIS2 Guide on how to define the dataset: https://wmo-im.github.io/wis2-guide/guide/wis2-guide-DRAFT.html#_1_1_4_why_are_datasets_so_important


6 participants