From ef5df87ded58c0beb757427e503634f99d7c5e34 Mon Sep 17 00:00:00 2001
From: Scot Breitenfeld
Date: Fri, 15 May 2026 09:52:38 -0500
Subject: [PATCH 1/2] docs: add canonical name column to registered filter plugins

Introduces the canonical filter name as a normative field in the
registry, in preparation for HDF5 v3 (H5Z_class3_t) plugins where the
H5Z_class_t::name field becomes a load-bearing string identifier rather
than a free-form debug comment.

- Adds a Canonical Name column to the registry table, pre-filled with
  proposed names marked (proposed) pending maintainer confirmation.
- Adds a "Canonical Names (HDF5 v3 / H5Z_class3_t)" section documenting
  where the name appears, the syntactic rules ([A-Za-z0-9_.-], <=255
  bytes, case-sensitive), the first-registered-wins allocation policy,
  and the self-namespacing convention for unregistered plugins.
- Extends "How to Register HDF5 Filter Plugin" to request a proposed
  canonical name as part of the submission.
- Adds "Updating an existing registration for v3" with the three-step
  task list for current plugin maintainers.

Existing plugin builds are unaffected; canonical names only become
load-bearing when a plugin opts into H5Z_class3_t. Maintainer
confirmations are tracked in #255. See RFC-HDFG-2026-001 for the v3
design.
---
 docs/RegisteredFilterPlugins.md | 117 +++++++++++++++++++++-----------
 1 file changed, 79 insertions(+), 38 deletions(-)

diff --git a/docs/RegisteredFilterPlugins.md b/docs/RegisteredFilterPlugins.md
index ad5091f3..1c1da5fd 100644
--- a/docs/RegisteredFilterPlugins.md
+++ b/docs/RegisteredFilterPlugins.md
@@ -11,6 +11,7 @@ Any member of the HDF5 community can register a plugin for their or a third-part
 * Maintainer's contact information. Minimum an email address, preferably additional information like personal website, GitHub or social network handles. More ways to contact the responsible maintainer is better.
 * Filter plugin's respository.
 * Description of the new plugin including the specifics of the filter parameters (`cd_nelmts` and `cd_values[]`) supported by the plugin.
+* A proposed **canonical name** for the filter. This is the short, stable string identifier that an HDF5 v3 (`H5Z_class3_t`) plugin will advertise in its `name` field and that the library will write into the on-disk filter-pipeline message. The canonical name must be non-empty, no more than 255 bytes, drawn from the character class `[A-Za-z0-9_.-]`, and is matched case-sensitively. See [Canonical Names](#canonical-names-hdf5-v3--h5z_class3_t) below for details. (Existing v1/v2 plugins without a canonical name continue to work unchanged; this only becomes load-bearing when a plugin opts into v3.)
 * Links to any relevant documentation, including the licensing information.
 
 Upon receiving a request with the above information, HDF Group will register the new plugin by assigning it a filter plugin _identifier_. The current policy for assigning an identifier is explained below:
@@ -24,48 +25,88 @@ Upon receiving a request with the above information, HDF Group will register the
 
 ## List of Filter Plugins Registered with The HDF Group
 
-| Plugin Identifier | For Filter | Short Description|
-|--------|----------------|---------------------|
-|`257` |hzip |hzip compression used in Silo|
-|`258` |fpzip |Duplicate of plugin `32014`|
-|`305` |LZO |LZO lossless compression used by PyTables|
-|`307` |BZIP2 |BZIP2 lossless compression used by PyTables|
-|`32000` |LZF |LZF lossless compression used by the h5py package|
-|`32001` |BLOSC |Blosc lossless compression used by PyTables|
-|`32002` |MAFISC |Modified LZMA compression filter, MAFISC (Multidimensional Adaptive Filtering Improved Scientific data Compression)|
-|`32003` |Snappy |Snappy lossless compression|
-|`32004` |LZ4 |LZ4 fast lossless compression algorithm|
-|`32005` |APAX |Samplify’s APAX Numerical Encoding Technology|
-|`32006` |CBF |All imgCIF/CBF compressions and decompressions, including Canonical, Packed, Packed Version 2, Byte Offset and Nibble Offset|
-|`32007` |JPEG-XR |Enables images to be compressed/decompressed with JPEG-XR compression|
-|`32008` |bitshuffle |Extreme version of shuffle filter that shuffles data at bit level instead of byte level|
-|`32009` |SPDP |SPDP fast lossless compression algorithm for single- and double-precision floating-point data|
-|`32010` |LPC-Rice |LPC-Rice multi-threaded lossless compression|
-|`32011` |CCSDS-123 |ESA CCSDS-123 multi-threaded compression filter|
-|`32012` |JPEG-LS |CharLS JPEG-LS multi-threaded compression filter|
-|`32013` |zfp |Lossy & lossless compression of floating point and integer datasets to meet rate, accuracy, and/or precision targets.|
-|`32014` |fpzip |Fast and Efficient Lossy or Lossless Compressor for Floating-Point Data|
-|`32015` |Zstandard |Real-time compression algorithm with wide range of compression / speed trade-off and fast decoder|
-|`32016` |B³D |GPU based image compression method developed for light-microscopy applications|
-|`32017` |SZ |An error-bounded lossy compressor for scientific floating-point data|
-|`32018` |FCIDECOMP |EUMETSAT CharLS compression filter for use with netCDF|
-|`32019` |JPEG |Jpeg compression filter|
-|`32020` |VBZ |Compression filter for raw dna signal data used by Oxford Nanopore|
-|`32021` |FAPEC | Versatile and efficient data compressor supporting many kinds of data and using an outlier-resilient entropy coder|
-|`32022` |BitGroom |The BitGroom quantization algorithm|
-|`32023` |Granular BitRound (GBR) |The GBR quantization algorithm is a significant improvement to the BitGroom filter|
-|`32024` |SZ3 |A modular error-bounded lossy compression framework for scientific datasets|
-|`32025` |Delta-Rice |Lossless compression algorithm optimized for digitized analog signals based on delta encoding and rice coding|
-|`32026` |BLOSC2 |The recent new-generation version of the Blosc compression library|
-|`32027` |FLAC |FLAC audio compression filter in HDF5|
-|`32028` |SPERR |SPERR is a lossy scientific (floating-point) data compressor that produces one of the best rate-distortion curves|
-|`32029` |TERSE/PROLIX |A lossless and fast compression of the diffraction data|
-|`32030` |FFMPEG |A lossy compression filter based on ffmpeg video library|
-|`32031` |JPEG2000 | A compression filter for lossy and lossless coding|
+The **Canonical Name** column is the string identifier used by HDF5 v3 (`H5Z_class3_t`) plugins. Entries marked _(proposed)_ are pending confirmation by the plugin maintainer; see [Canonical Names](#canonical-names-hdf5-v3--h5z_class3_t) and the [tracking issue](https://github.com/HDFGroup/hdf5_plugins/issues/255).
+
+| Plugin Identifier | Canonical Name | For Filter | Short Description|
+|--------|--------|----------------|---------------------|
+|`257` |`hzip` _(proposed)_ |hzip |hzip compression used in Silo|
+|`258` |`fpzip-legacy` _(proposed)_ |fpzip |Duplicate of plugin `32014`|
+|`305` |`lzo` _(proposed)_ |LZO |LZO lossless compression used by PyTables|
+|`307` |`bzip2` _(proposed)_ |BZIP2 |BZIP2 lossless compression used by PyTables|
+|`32000` |`lzf` _(proposed)_ |LZF |LZF lossless compression used by the h5py package|
+|`32001` |`blosc` _(proposed)_ |BLOSC |Blosc lossless compression used by PyTables|
+|`32002` |`mafisc` _(proposed)_ |MAFISC |Modified LZMA compression filter, MAFISC (Multidimensional Adaptive Filtering Improved Scientific data Compression)|
+|`32003` |`snappy` _(proposed)_ |Snappy |Snappy lossless compression|
+|`32004` |`lz4` _(proposed)_ |LZ4 |LZ4 fast lossless compression algorithm|
+|`32005` |`apax` _(proposed)_ |APAX |Samplify’s APAX Numerical Encoding Technology|
+|`32006` |`cbf` _(proposed)_ |CBF |All imgCIF/CBF compressions and decompressions, including Canonical, Packed, Packed Version 2, Byte Offset and Nibble Offset|
+|`32007` |`jpeg-xr` _(proposed)_ |JPEG-XR |Enables images to be compressed/decompressed with JPEG-XR compression|
+|`32008` |`bitshuffle` _(proposed)_ |bitshuffle |Extreme version of shuffle filter that shuffles data at bit level instead of byte level|
+|`32009` |`spdp` _(proposed)_ |SPDP |SPDP fast lossless compression algorithm for single- and double-precision floating-point data|
+|`32010` |`lpc-rice` _(proposed)_ |LPC-Rice |LPC-Rice multi-threaded lossless compression|
+|`32011` |`ccsds-123` _(proposed)_ |CCSDS-123 |ESA CCSDS-123 multi-threaded compression filter|
+|`32012` |`jpeg-ls` _(proposed)_ |JPEG-LS |CharLS JPEG-LS multi-threaded compression filter|
+|`32013` |`zfp` _(proposed)_ |zfp |Lossy & lossless compression of floating point and integer datasets to meet rate, accuracy, and/or precision targets.|
+|`32014` |`fpzip` _(proposed)_ |fpzip |Fast and Efficient Lossy or Lossless Compressor for Floating-Point Data|
+|`32015` |`zstd` _(proposed)_ |Zstandard |Real-time compression algorithm with wide range of compression / speed trade-off and fast decoder|
+|`32016` |`b3d` _(proposed)_ |B³D |GPU based image compression method developed for light-microscopy applications|
+|`32017` |`sz` _(proposed)_ |SZ |An error-bounded lossy compressor for scientific floating-point data|
+|`32018` |`fcidecomp` _(proposed)_ |FCIDECOMP |EUMETSAT CharLS compression filter for use with netCDF|
+|`32019` |`jpeg` _(proposed)_ |JPEG |Jpeg compression filter|
+|`32020` |`vbz` _(proposed)_ |VBZ |Compression filter for raw dna signal data used by Oxford Nanopore|
+|`32021` |`fapec` _(proposed)_ |FAPEC | Versatile and efficient data compressor supporting many kinds of data and using an outlier-resilient entropy coder|
+|`32022` |`bitgroom` _(proposed)_ |BitGroom |The BitGroom quantization algorithm|
+|`32023` |`granular-bitround` _(proposed)_ |Granular BitRound (GBR) |The GBR quantization algorithm is a significant improvement to the BitGroom filter|
+|`32024` |`sz3` _(proposed)_ |SZ3 |A modular error-bounded lossy compression framework for scientific datasets|
+|`32025` |`delta-rice` _(proposed)_ |Delta-Rice |Lossless compression algorithm optimized for digitized analog signals based on delta encoding and rice coding|
+|`32026` |`blosc2` _(proposed)_ |BLOSC2 |The recent new-generation version of the Blosc compression library|
+|`32027` |`flac` _(proposed)_ |FLAC |FLAC audio compression filter in HDF5|
+|`32028` |`sperr` _(proposed)_ |SPERR |SPERR is a lossy scientific (floating-point) data compressor that produces one of the best rate-distortion curves|
+|`32029` |`terse-prolix` _(proposed)_ |TERSE/PROLIX |A lossless and fast compression of the diffraction data|
+|`32030` |`ffmpeg` _(proposed)_ |FFMPEG |A lossy compression filter based on ffmpeg video library|
+|`32031` |`jpeg2000` _(proposed)_ |JPEG2000 | A compression filter for lossy and lossless coding|
 
 > [!NOTE]
 > Please contact the maintainer of a filter plugin for help with the plugin or its filter in the HDF5 library.
 
+## Canonical Names (HDF5 v3 / `H5Z_class3_t`)
+
+Starting with the v3 plugin class (`H5Z_class3_t`) introduced in HDF5 2.x, every registered filter has a **canonical name**: a short, stable string identifier used by callers, tools, and the on-disk format.
+
+### Where the canonical name appears
+
+* **In the plugin source.** A v3 plugin sets `H5Z_class3_t::name` to its canonical name.
+* **In the file.** When a v3-aware library adds a filter to a dataset's pipeline, it writes the canonical name into the filter-pipeline message (`H5O_PLINE`), so tools like `h5dump` can identify the filter even when the plugin shared library is not installed.
+* **In the API.** The forthcoming `H5Pappend_filter("canonical_name", "params")` interface accepts the canonical name to identify the filter, and accepts a TOML inline table parameter string for typed configuration. (Numeric filter IDs continue to work via the existing `H5Pset_filter` interface.)
+* **In CLI tools.** `h5repack`, `h5dump`, and other tools display the canonical name when printing filter information.
+
+### Syntactic rules
+
+* Non-NULL, non-empty.
+* At most 255 bytes.
+* Drawn from the character class `[A-Za-z0-9_.-]`.
+* Matched case-sensitively.
+
+Registrations that violate these rules are rejected by `H5Zregister()` with `H5E_BADVALUE`.
+
+### Allocation
+
+The HDF Group assigns a canonical name alongside the numeric `id` at filter-registration time. First-registered wins, exactly as for numeric IDs: if two plugins claim the same canonical name in one process, the second `H5Zregister()` call is rejected with `H5E_CANTREGISTER`. Coordination through this registry is the only guarantee that, for example, `"zfp"` means the same plugin on every system.
+
+### Self-namespacing (non-normative)
+
+Third-party plugins that have not gone through HDF Group filter registration should self-namespace their canonical names (for example, `org.example.fastlz`) to make accidental collisions vanishingly unlikely. Coordinated registration through The HDF Group remains the recommended path for plugins that want a short, bare name.
+
+### Updating an existing registration for v3
+
+Existing filter plugins continue to work unchanged with HDF5 v3-aware libraries; the canonical name only becomes load-bearing when a plugin opts into the new `H5Z_class3_t` class. Maintainers of currently registered plugins should:
+
+1. Review the proposed canonical name in the table above for your filter ID.
+2. Confirm or propose a correction by replying on the [tracking issue](https://github.com/HDFGroup/hdf5_plugins/issues/255) or contacting the HDF Group [Helpdesk](mailto:help@hdfgroup.org). Confirmations replace _(proposed)_ in the table.
+3. When shipping a v3 build of the plugin, set `H5Z_class3_t::name` to the registered canonical name byte-identically.
+
+See the HDF5 RFC on string-configured filters (RFC-HDFG-2026-001) for the full v3 design.
+
 ## Information about Registered Filter Plugins
 
 ### hzip
From 467928416f2743fa7a765b79bae92355453c6352 Mon Sep 17 00:00:00 2001
From: Scot Breitenfeld
Date: Fri, 15 May 2026 13:39:09 -0500
Subject: [PATCH 2/2] docs: drop confusing "HDF5 v3" phrasing in favor of H5Z_class3_t

"HDF5 v3" reads as "HDF5 library 3.0", but the new plugin class
H5Z_class3_t ships in HDF5 2.x. Rename the section heading, the
subsection on updating existing registrations, and replace "v3-aware
library" / "v3 build" / "HDF5 v3 plugin" wording with direct references
to H5Z_class3_t or "HDF5 2.x" as appropriate. The plugin-class version
("v3", as in v1/v2/v3 of H5Z_class_t) is retained only where the
versioning of the class itself is the subject.
---
 docs/RegisteredFilterPlugins.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/RegisteredFilterPlugins.md b/docs/RegisteredFilterPlugins.md
index 1c1da5fd..8bedfb5f 100644
--- a/docs/RegisteredFilterPlugins.md
+++ b/docs/RegisteredFilterPlugins.md
@@ -11,7 +11,7 @@ Any member of the HDF5 community can register a plugin for their or a third-part
 * Maintainer's contact information. Minimum an email address, preferably additional information like personal website, GitHub or social network handles. More ways to contact the responsible maintainer is better.
 * Filter plugin's respository.
 * Description of the new plugin including the specifics of the filter parameters (`cd_nelmts` and `cd_values[]`) supported by the plugin.
-* A proposed **canonical name** for the filter. This is the short, stable string identifier that an HDF5 v3 (`H5Z_class3_t`) plugin will advertise in its `name` field and that the library will write into the on-disk filter-pipeline message. The canonical name must be non-empty, no more than 255 bytes, drawn from the character class `[A-Za-z0-9_.-]`, and is matched case-sensitively. See [Canonical Names](#canonical-names-hdf5-v3--h5z_class3_t) below for details. (Existing v1/v2 plugins without a canonical name continue to work unchanged; this only becomes load-bearing when a plugin opts into v3.)
+* A proposed **canonical name** for the filter. This is the short, stable string identifier that an `H5Z_class3_t` plugin will advertise in its `name` field and that the library will write into the on-disk filter-pipeline message. The canonical name must be non-empty, no more than 255 bytes, drawn from the character class `[A-Za-z0-9_.-]`, and is matched case-sensitively. See [Canonical Names](#canonical-names-h5z_class3_t-plugin-class) below for details. (Existing `H5Z_class1_t`/`H5Z_class2_t` plugins without a canonical name continue to work unchanged; this only becomes load-bearing when a plugin opts into the v3 plugin class.)
 * Links to any relevant documentation, including the licensing information.
 
 Upon receiving a request with the above information, HDF Group will register the new plugin by assigning it a filter plugin _identifier_. The current policy for assigning an identifier is explained below:
@@ -25,7 +25,7 @@ Upon receiving a request with the above information, HDF Group will register the
 
 ## List of Filter Plugins Registered with The HDF Group
 
-The **Canonical Name** column is the string identifier used by HDF5 v3 (`H5Z_class3_t`) plugins. Entries marked _(proposed)_ are pending confirmation by the plugin maintainer; see [Canonical Names](#canonical-names-hdf5-v3--h5z_class3_t) and the [tracking issue](https://github.com/HDFGroup/hdf5_plugins/issues/255).
+The **Canonical Name** column is the string identifier used by `H5Z_class3_t` plugins. Entries marked _(proposed)_ are pending confirmation by the plugin maintainer; see [Canonical Names](#canonical-names-h5z_class3_t-plugin-class) and the [tracking issue](https://github.com/HDFGroup/hdf5_plugins/issues/255).
 
 | Plugin Identifier | Canonical Name | For Filter | Short Description|
 |--------|--------|----------------|---------------------|
@@ -69,14 +69,14 @@ The **Canonical Name** column is the string identifier used by HDF5 v3 (`H5Z_cla
 > [!NOTE]
 > Please contact the maintainer of a filter plugin for help with the plugin or its filter in the HDF5 library.
 
-## Canonical Names (HDF5 v3 / `H5Z_class3_t`)
+## Canonical Names (`H5Z_class3_t` plugin class)
 
-Starting with the v3 plugin class (`H5Z_class3_t`) introduced in HDF5 2.x, every registered filter has a **canonical name**: a short, stable string identifier used by callers, tools, and the on-disk format.
+Starting with the v3 filter plugin class (`H5Z_class3_t`), introduced in HDF5 2.x, every registered filter has a **canonical name**: a short, stable string identifier used by callers, tools, and the on-disk format.
 
 ### Where the canonical name appears
 
-* **In the plugin source.** A v3 plugin sets `H5Z_class3_t::name` to its canonical name.
-* **In the file.** When a v3-aware library adds a filter to a dataset's pipeline, it writes the canonical name into the filter-pipeline message (`H5O_PLINE`), so tools like `h5dump` can identify the filter even when the plugin shared library is not installed.
+* **In the plugin source.** An `H5Z_class3_t` plugin sets its `name` to the canonical name.
+* **In the file.** When an HDF5 library that supports `H5Z_class3_t` adds a filter to a dataset's pipeline, it writes the canonical name into the filter-pipeline message (`H5O_PLINE`), so tools like `h5dump` can identify the filter even when the plugin shared library is not installed.
 * **In the API.** The forthcoming `H5Pappend_filter("canonical_name", "params")` interface accepts the canonical name to identify the filter, and accepts a TOML inline table parameter string for typed configuration. (Numeric filter IDs continue to work via the existing `H5Pset_filter` interface.)
 * **In CLI tools.** `h5repack`, `h5dump`, and other tools display the canonical name when printing filter information.
 
@@ -97,13 +97,13 @@ The HDF Group assigns a canonical name alongside the numeric `id` at filter-regi
 
 Third-party plugins that have not gone through HDF Group filter registration should self-namespace their canonical names (for example, `org.example.fastlz`) to make accidental collisions vanishingly unlikely. Coordinated registration through The HDF Group remains the recommended path for plugins that want a short, bare name.
 
-### Updating an existing registration for v3
+### Updating an existing registration for `H5Z_class3_t`
 
-Existing filter plugins continue to work unchanged with HDF5 v3-aware libraries; the canonical name only becomes load-bearing when a plugin opts into the new `H5Z_class3_t` class. Maintainers of currently registered plugins should:
+Existing filter plugins continue to work unchanged with HDF5 2.x; the canonical name only becomes load-bearing when a plugin opts into the new `H5Z_class3_t` class. Maintainers of currently registered plugins should:
 
 1. Review the proposed canonical name in the table above for your filter ID.
 2. Confirm or propose a correction by replying on the [tracking issue](https://github.com/HDFGroup/hdf5_plugins/issues/255) or contacting the HDF Group [Helpdesk](mailto:help@hdfgroup.org). Confirmations replace _(proposed)_ in the table.
-3. When shipping a v3 build of the plugin, set `H5Z_class3_t::name` to the registered canonical name byte-identically.
+3. When porting the plugin to `H5Z_class3_t`, set the `name` field to the registered canonical name byte-identically.
 
 See the HDF5 RFC on string-configured filters (RFC-HDFG-2026-001) for the full v3 design.
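
Review note: the syntactic rules and the first-registered-wins policy described in this series are mechanical enough that a maintainer could check a proposed name before replying on the tracking issue. The following is a minimal Python sketch of such a check, assuming only the rules quoted in the patch (non-empty, at most 255 bytes, characters from `[A-Za-z0-9_.-]`, case-sensitive comparison). The helpers `is_valid_canonical_name` and `claim_name` are hypothetical names invented for illustration; they are not part of HDF5, and the dictionary is only a toy model of the allocation policy, not the library's registration code.

```python
import re

# Character class and length limit as quoted in the patch text.
_CANONICAL_NAME_RE = re.compile(r"[A-Za-z0-9_.-]+")
MAX_CANONICAL_NAME_BYTES = 255


def is_valid_canonical_name(name):
    """Return True if `name` satisfies the documented syntactic rules."""
    # Rule: non-NULL, non-empty (None stands in for a NULL pointer here).
    if not isinstance(name, str) or not name:
        return False
    # Rule: every character comes from [A-Za-z0-9_.-].
    # fullmatch also rejects a trailing newline, which `$` would let through.
    if _CANONICAL_NAME_RE.fullmatch(name) is None:
        return False
    # Rule: at most 255 bytes. All allowed characters are ASCII, so the
    # encoded length equals the character count.
    return len(name.encode("ascii")) <= MAX_CANONICAL_NAME_BYTES


def claim_name(registry, name, filter_id):
    """Toy model of first-registered-wins allocation (hypothetical helper).

    `registry` maps canonical name -> filter id. Returns True if `filter_id`
    owns `name` after the call, False if another id claimed it first.
    Keys are compared as exact strings, so matching is case-sensitive.
    """
    if not is_valid_canonical_name(name):
        raise ValueError(f"invalid canonical name: {name!r}")
    owner = registry.setdefault(name, filter_id)
    return owner == filter_id


if __name__ == "__main__":
    # A few proposed names from the table, plus display names that would
    # not pass as canonical names unchanged.
    for candidate in ["zfp", "granular-bitround", "org.example.fastlz",
                      "TERSE/PROLIX", "B³D", ""]:
        print(repr(candidate), is_valid_canonical_name(candidate))
```

The sketch shows why several display names in the "For Filter" column need a separate canonical form: `TERSE/PROLIX` and `B³D` contain characters outside the permitted class, which is consistent with the proposed `terse-prolix` and `b3d` entries.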