Skip to content

WMO Instruments Part 2.2: File Handlers#3

Open
sfinkens wants to merge 22 commits into
wmo-instruments-part1from
wmo-instruments-part2-file-handlers
Open

WMO Instruments Part 2.2: File Handlers#3
sfinkens wants to merge 22 commits into
wmo-instruments-part1from
wmo-instruments-part2-file-handlers

Conversation

@sfinkens
Copy link
Copy Markdown
Owner

@sfinkens sfinkens commented May 19, 2026

Update file handlers to provide WMO instrument names. To be merged into pytroll#3390. This is the biggest PR of the series.

Summary

  • Replace sensor dataset attribute with WMO-compliant instruments in file handlers.
  • Change sensors: [mysensor] -> instruments: [mysensor] in reader YAML files
  • Update YAML reader to support both sensors and instruments, with a deprecation warning for sensors
  • Update satpy_cf_nc reader to support both instruments and sensor, with a deprecation warning for sensor.

Breaking Changes

Removing the sensor attribute is currently backwards incompatible. In one of the follow-up PRs I will add a config switch for getting the legacy sensor attribute back. If enabled, scene.__getitem__ would populate sensor with the internal instrument name.

Notes

  • I added more explicit instrument sets in some places
    • Ocean Color CCI now uses: {"SeaWiFS", "MERIS", "MODIS", "VIIRS"} instead of merged. Source: PML documentation
    • geocat now uses IMAGER (GOES 12-15) instead of goes_imager. The source attribute is just goes, so there's no way to distinguish between 8-11 and 12-15.
  • Some readers define a sensor property but never use it. I still changed those to WMO names.
  • Some readers define a sensor property that is not an instrument such as mimic. I didn't change those.
  • Closes #xxxx
  • Tests added
  • Fully documented
  • Add your name to AUTHORS.md if not there already

@sfinkens
Copy link
Copy Markdown
Owner Author

Please review if you have time @djhoese and @martin

@djhoese
Copy link
Copy Markdown

djhoese commented May 19, 2026

Wow, this is a lot of work. Nice job. I have not looked at the code and only your description so far.

  1. The enum sounds good, but does scare me because of it needing to be updated to add support for new sensors, but it sounds like for the most part you worked around that by just using strings for the instruments to keep it simple?
  2. Regarding sensor properties in some readers, I wonder if they were trying to have a sensor_names property that is then used by the base reader here.
  3. Would it be worth it to retain hyphens (-) in the serialized instrument name to improve readability and keep some backwards compatibility?
  4. I don't remember what our long term goal was for composites and the sensor visir/<sensor> idea. I mean, I think technically it can be a long tree of dependencies like visir/some_category/some_property/abi and it would load visir.yaml, some_category.yaml, and some_property.yaml to get "depedencies". We've also talked about and have issues about flattening the composite YAMLs and just always loading all of them...I think. Wait, why does the YAML need to use the WMO name anyway? The filenames are already using the serialized version so why can't/shouldn't the attributes in those files use it?

@sfinkens
Copy link
Copy Markdown
Owner Author

  1. I think it would be nice to update the Enum with new instruments, but it's not necessary. But you can also just use strings.
  2. I don't know. Sometimes it's just an instance attribute not a property. We could of course add more sensor_names properties in a follow-up PR.
  3. Good idea
  4. I like the idea that all YAMLs contain normalized names. I think that needs only two adjustments:
    • Normalize WMO names in scene.sensor_names where dataset and YAML attributes are mixed
    • Translate normalized name back to WMO name, where file handlers add YAML contents to dataset attributes.

@djhoese
Copy link
Copy Markdown

djhoese commented May 20, 2026

I like the idea that all YAMLs contain normalized names. I think that needs only two adjustments:

Normalize WMO names in scene.sensor_names where dataset and YAML attributes are mixed
Translate normalized name back to WMO name, where file handlers add YAML contents to dataset attributes.

Ok: normalize == WMO, serialize == satpy-name. Can we use this wording from now on?

I think it makes sense for scene.sensor_names to be WMO names since it is user facing. Everything internal that is doing "YAML stuff" would use the serialized names. I think the readers and the sensor names in the YAML of the readers can use WMO names since those are more like metadata rather than being "for computers/satpy" to use. Or is that too confusing?

@sfinkens
Copy link
Copy Markdown
Owner Author

Ok: normalize == WMO, serialize == satpy-name. Can we use this wording from now on?

Haha I was thinking

normalize = satpy-name
serialize = satpyname1-satpyname2

Or how would you call the latter?

@sfinkens
Copy link
Copy Markdown
Owner Author

sfinkens commented May 20, 2026

I think it makes sense for scene.sensor_names to be WMO names since it is user facing

Agreed

Or is that too confusing?

I think it would be simpler if they were serialized as well

@djhoese
Copy link
Copy Markdown

djhoese commented May 20, 2026

I think it would be simpler if they were serialized as well

But then we have to have a serialized->WMO conversion which is not obvious or consistent or even possible, right?

normalize = satpy-name
serialize = satpyname1-satpyname2

But then what do you call the -> WMO process?

Edit: What are those 2 cases? I thought there was satpy internal usage and WMO/human usage. So abi for satpy internal and ABI for WMO.

@sfinkens
Copy link
Copy Markdown
Owner Author

sfinkens commented May 20, 2026

But then we have to have a serialized->WMO conversion

Given the serialized name, the WMO name could be looked up in a dictionary

INTERNAL_TO_WMO: dict[str, str] = {
    wmo_to_internal(instrument): str(instrument)
    for instrument in OSCAR
}

where OSCAR is that Enum with all the instrument names.

Edit: What are those 2 cases?

Both are internal, the first one is single instrument, the second one is for multiple instruments

But then what do you call the -> WMO process?

Good question 😁 denormalize? Maybe the method names should be more self describing?

wmo_to_internal()
internal_to_wmo()
join_instrument_names()

@djhoese
Copy link
Copy Markdown

djhoese commented May 20, 2026

As mentioned on slack, maybe you're write and using internal versus wmo is better than normalize/denormalize/serialize. The internal to wmo dictionary makes sense although I hope there aren't any collisions (multiple WMO names that map to the same internal name). Or WMO names that aren't unique enough to distinguish themselves for our internal purposes. I think AGRI was one of those cases where AGRI on FY-4A is a completely different instrument from AGRI on FY-4B.

Edit: Yeah OSCAR considers them the same entity: https://space.oscar.wmo.int/instruments/view/agri

@sfinkens
Copy link
Copy Markdown
Owner Author

sfinkens commented May 20, 2026

From what I've seen OSCAR takes that into account: "AGRI", "AGRI (FY-4C)", "IMAGER (GOES 8-11)", "IMAGER (INSAT)" etc

@djhoese
Copy link
Copy Markdown

djhoese commented May 20, 2026

WMO or OSCAR? Where do you see that naming/usage?

@sfinkens
Copy link
Copy Markdown
Owner Author

When I type "AGRI" in the OSCAR search bar I get two suggestions. But you're right, they don't distinguish between FY-4A/B

@coveralls
Copy link
Copy Markdown

coveralls commented May 21, 2026

Coverage Report for CI Build 26290847897

Coverage increased (+0.01%) to 96.407%

Details

  • Coverage increased (+0.01%) from the base build.
  • Patch coverage: No coverable lines changed in this PR.
  • 431 coverage regressions across 47 files.

Uncovered Changes

No uncovered changes found.

Coverage Regressions

431 previously-covered lines in 47 files lost coverage.

Top 10 Files by Coverage Loss Lines Losing Coverage Coverage
readers/nwcsaf_nc.py 33 80.5%
readers/mirs.py 30 88.34%
readers/ghrsst_l3c_sst.py 29 0.0%
readers/viirs_compact.py 24 77.57%
readers/core/landsat.py 23 91.3%
readers/eps_l1b.py 21 85.47%
readers/maia.py 21 22.09%
readers/seviri_l1b_nc.py 19 75.12%
readers/aapp_l1b.py 18 90.07%
readers/core/yaml_reader.py 16 97.54%

Coverage Stats

Coverage Status
Relevant Lines: 59171
Covered Lines: 57045
Line Coverage: 96.41%
Coverage Strength: 2.89 hits per line

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants