Skip to content

Add BoundLocaleDataProvider and some impls for it#7621

Open
sffc wants to merge 15 commits intounicode-org:mainfrom
sffc:boundlocale
Open

Add BoundLocaleDataProvider and some impls for it#7621
sffc wants to merge 15 commits intounicode-org:mainfrom
sffc:boundlocale

Conversation

@sffc
Copy link
Copy Markdown
Member

@sffc sffc commented Feb 10, 2026

#3260

Please take a look at this design for a formatter that retains a piece of the data provider for runtime lookup of data marker attributes, intended for use cases such as currencies, units, and display names.

Reviewable commit-by-commit.

No AI was used when writing this, but I might use one to fix up the clippy and feature warnings in CI.

The docs are somewhat limited. Please focus on the docs tests for the end-to-end usage examples. I will add more docs before merging.

@sffc sffc requested review from a team and Manishearth as code owners February 10, 2026 05:13
@sffc sffc requested a review from robertbastian February 10, 2026 05:13
@sffc
Copy link
Copy Markdown
Member Author

sffc commented Feb 10, 2026

In case it helps, here's a summary of what you can do after this PR:

You can write code like

let formatter = HelloWorldAttributeFormatter::try_new_with_buffer_provider(
    &provider,
    locale!("en").into()
).unwrap();

assert_writeable_eq!(formatter.format("reverse").unwrap(), "Olleh Dlrow");
assert_writeable_eq!(formatter.format_for_prefix("rev").unwrap(), "Olleh Dlrow");

Please observe that the data marker attribute is given in the format function, not in the constructor.

To make this work, we save a small slice of the data provider structure inside of the HelloWorldAttributeFormatter, which has a generic parameter that varies based on the provider type, powered by some new traits.

This PR adds impls of the new traits in the BlobDataProvider and HelloWorldProvider universes. We will need to add them to baked data, and probably FsDataProvider.

Copy link
Copy Markdown
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I remember the display names design, it doesn't involve the formatter holding on to the provider. I don't think this approach gives nice APIs, as the formatters will need to be generic in the providers. This also requires a lot of new traits and types. Overall I'm very negative on this.

Comment thread provider/core/src/buf/serde.rs Outdated
/// - `deserialize_bincode_1`
///
/// ✨ *Enabled with the `serde` Cargo feature.*
pub fn new(inner: &'a P) -> Self {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the canonical way to wrap a provider in a DeserializingBufferProvider is to call .as_deserializing(). we use this in a lot of docs, and I don't think we should introduce an alternative API that does the same

Comment thread provider/blob/src/blob_data_provider.rs Outdated
use alloc::boxed::Box;
use icu_provider::buf::BufferFormat;
use icu_provider::prelude::*;
use icu_provider::unstable::BindLocale;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you start depending on unstable APIs, do we need to use a ~ dep?

/// assert_writeable_eq!(fmt.format("reverse").unwrap(), "Olleh Dlrow");
/// ```
#[derive(Debug)]
pub struct HelloWorldAttributeFormatter<P: BoundLocaleDataProvider<HelloWorldV1>> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a bad example, or does this mean every formatter using this pattern needs to be generic in a provider?

@sffc
Copy link
Copy Markdown
Member Author

sffc commented Feb 10, 2026

The use case is when you need to load a list of many different data marker attributes.

If you load each one in sequence from a provider, you need to re-read the locale (and marker in the case of blob provider) for every lookup.

What this design does is it creates an intermediate "bound locale provider" that is halfway through a data load operation. You can then more efficiently perform attribute lookups.

@sffc
Copy link
Copy Markdown
Member Author

sffc commented Feb 18, 2026

See #7663 for benchmark results. @robertbastian

@sffc sffc added discuss-priority Discuss at the next ICU4X meeting and removed discuss-priority Discuss at the next ICU4X meeting labels Feb 23, 2026
@sffc
Copy link
Copy Markdown
Member Author

sffc commented Feb 26, 2026

My preference is to move forward with this design as the "primary" Display Names API. However, @robertbastian does not agree and prefers to make single-display-name the "primary" API. @Manishearth attempted to find compromise, saying that we can start with single-display-name and ask for feedback if multi-display-name is motivated based on user feedback.

I do not find this to be a desirable compromise, because we have what I see as compelling data that multi-display-name is significantly (2x) faster than single-display-name when loading a list of names, which is both the most common use case and also the one that amplifies any performance gains. I would prefer to not implement as the "primary" API one that we know today is slow.

Another argument that was raised was that it is easy to refactor single-display-name into multi-display-name. I consider this half-true. On the one hand, the data markers are the same, and we only need to rewrite the traits and the implementation. On the other hand, the whole user-facing API is subject to change, with all its docs tests, FFI, etc., as well as the implementation. It believe it to be substantially less work to implement multi-display-name out of the barn than to implement single-display-name followed later by a refactor to multi-display-name.

In order to move forward, I intend to write a "minimal" single-display-name API in such a way that I spend most of my time on the things that are needed by both approaches, i.e., focusing on the data model. I will not add FFI or thorough test coverage. I will add only enough docs to make the types usable in a 2.2 experimental release.

This position is not intended to be an endorsement for the single-display-name API, and I have great fear that my implementing it this way will result in the temptation for clients to start using it and for ICU4X to graduate it. I am setting aside these fears in order to make engineering progress.

@Manishearth
Copy link
Copy Markdown
Member

@Manishearth attempted to find compromise, saying that we can start with single-display-name and ask for feedback if multi-display-name is motivated based on user feedback.

(This was not my suggestion. I shall write more later)

@Manishearth Manishearth reopened this Feb 26, 2026
@Manishearth
Copy link
Copy Markdown
Member

Manishearth commented Feb 26, 2026

My suggestion was that you move forward with Option 4 (the iterator API), implemented internally the slow way. We document it as "this is somewhat slow, we hope to optimize it". We could early optimize it if it can be done in a purely internal way.

We then link people to one of the issues here and encourage them to talk to us about their use cases.

This is a good use of the fanfare around experimental new features in release notes. We get to see what people care about, and this can inform API shape and where to spend time optimizing.

Another argument that was raised was that it is easy to refactor single-display-name into multi-display-name. I consider this half-true. On the one hand, the data markers are the same, and we only need to rewrite the traits and the implementation. On the other hand, the whole user-facing API is subject to change, with all its docs tests, FFI, etc., as well as the implementation. It believe it to be substantially less work to implement multi-display-name out of the barn than to implement single-display-name followed later by a refactor to multi-display-name.

This is why I would prefer to do this all at once. But I like the sound of exposing the iterator API as an exploratory path for 2.2.

@sffc
Copy link
Copy Markdown
Member Author

sffc commented Feb 27, 2026

Nota bene: For DisplayNames, I feel that we do not need to look far for feedback on use cases. Intl.DisplayNames is one of the newer ECMA-402 APIs. You can read about its motivation here:

https://github.com/tc39/proposal-intl-displaynames

It was designed as new Intl.DisplayNames("en", { type: "region" }).of("UK") because it was expected that you would want to load many display names for the same language.

@Manishearth
Copy link
Copy Markdown
Member

I feel that we do not need to look far for feedback on use cases

Yes, but I think user feedback on API shape and performance is still useful. I don't think we need it to proceed, but a solution that gives us a concrete thing to put in front of users without necessarily finalizing it is nice IMO.

@sffc
Copy link
Copy Markdown
Member Author

sffc commented May 4, 2026

Following up from a conversation today with @robertbastian and @Manishearth, I wanted to show two more use cases for BoundLocaleDataProvider

Unit formatting with conversion

Since we may not know the output unit when formatting with a usage context, the formatter needs to have access to multiple potential output units, somewhere between 1 and ~10.

A unit formatter could leverage BoundLocaleDataProvider to retain access to the units it might need.

Each formatting operation would involve hitting the ZeroTrie of attributes, which is not free, but I expect it to be small relative to the overall formatting operation.

Instead of storing a BoundLocaleDataProvider, an alternative solution would be to store a SmallVec of payloads. This is probably how we would do this in a world without BLDP, but BLDP still has advantages including stack size, no-alloc. (SmallVec's advantage is no generic on the formatter.)

Getting display names as an iterator

It also occurred to me that a ZeroTrie also offers a fast iteration mechanism. After you've chosen your locale, a function with the following signature would be easy to implement:

impl RegionDisplayNames<B> {
    pub fn iter(&self) -> impl Iterator<(Region, &str)> + '_ { ... }
}

If loading most or all display names, this is even faster than the .of function in ECMA, since we don't need to look anything up in the trie; we just loop over it.


@robertbastian asked that if we implement this, that we also make a macro that adds a default "naive" implementation of the trait. This seems fine.


I also should follow up on how this gets implemented in the provider adapters. Some are easier than others.

The following are trivial: either, empty, filter, fixed.

The two interesting ones:

Forking Providers

I think this works fine by wrapping both children in another forking provider.

  1. BindLocaleDataProvider calls BindLocaleDataProvider on all children
  2. It stores the resulting providers in its fields, returning the result as a BoundLocaleDataProvider
  3. The BoundLocaleDataProvider simply delegates to one child or the other, as usual

Fallback Providers

This is a bit more tricky and is one of the reasons I started with postcard instead of baked. The most efficient solution for lookup speed would be for BoundLocaleFallbackProvider to have a smallvec of child providers, one for each locale in the fallback chain. In most cases we should end up with <=3 locales in the fallback chain that actually have data, so we should be able to usually fit this on the stack.

@Manishearth
Copy link
Copy Markdown
Member

The impact on other providers seems reasonable.

Aside from internal providers, who would have to care about this trait? People doing custom data loading impls, yes? (Which is a power user use case, so it's allowed to be complex)

@sffc
Copy link
Copy Markdown
Member Author

sffc commented May 5, 2026

I plan to implement this for LocaleFallbackProvider because that will be an interesting signal for the benchmarks. I'm not entirely sure what to expect: a higher startup cost but lower runtime cost.

@sffc sffc requested a review from zbraniecki as a code owner May 6, 2026 04:47
@sffc
Copy link
Copy Markdown
Member Author

sffc commented May 6, 2026

I copied the benchmarks into this PR and added ones that use a Box<dyn BoundLocaleDataProvider>.

Details
locale_names/region/baked/all/1by1
                        time:   [16.277 ms 16.392 ms 16.537 ms]
                        change: [-8.4275% -7.2284% -5.9933%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
locale_names/region/baked/10x10pct/1by1
                        time:   [175.21 µs 177.34 µs 179.89 µs]
                        change: [-12.197% -10.019% -7.5308%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe
locale_names/region/baked/all/old_map_struct
                        time:   [5.0773 ms 5.1166 ms 5.1610 ms]
                        change: [-5.8200% -4.2660% -2.8105%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
locale_names/region/baked/10x10pct/old_map_struct
                        time:   [53.893 µs 54.137 µs 54.413 µs]
                        change: [-4.6229% -3.3484% -2.1997%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) low mild
  7 (7.00%) high mild
  3 (3.00%) high severe
locale_names/region/postcard/all/1by1
                        time:   [19.675 ms 20.003 ms 20.395 ms]
                        change: [-8.6210% -6.4702% -4.2571%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  6 (6.00%) high severe
locale_names/region/postcard/10x10pct/1by1
                        time:   [178.16 µs 181.21 µs 184.71 µs]
                        change: [-3.9104% -1.6390% +0.7084%] (p = 0.18 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
locale_names/region/postcard/all/boundlocale/typed
                        time:   [10.255 ms 10.348 ms 10.449 ms]
                        change: [-7.1759% -5.5889% -4.0667%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) low mild
  3 (3.00%) high mild
  9 (9.00%) high severe
locale_names/region/postcard/10x10pct/boundlocale/typed
                        time:   [79.610 µs 80.414 µs 81.480 µs]
                        change: [-5.2711% -3.8400% -2.4071%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe
locale_names/region/postcard/all/boundlocale/boxdyn
                        time:   [10.127 ms 10.181 ms 10.250 ms]
                        change: [-4.9644% -4.1112% -3.2546%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe
locale_names/region/postcard/10x10pct/boundlocale/boxdyn
                        time:   [81.931 µs 83.093 µs 84.454 µs]
Found 17 outliers among 100 measurements (17.00%)
  3 (3.00%) low mild
  6 (6.00%) high mild
  8 (8.00%) high severe

In table form:

Data Set Provider Storage Provider Trait Setup Time
all baked 1by1 16.392 ms [16.277 ms, 16.537 ms]
all baked old_map_struct 5.1166 ms [5.0773 ms, 5.1610 ms]
all postcard 1by1 20.003 ms [19.675 ms, 20.395 ms]
all postcard BoundLocale (Typed) 10.348 ms [10.255 ms, 10.449 ms]
all postcard BoundLocale (Dyn Box) 10.181 ms [10.127 ms, 10.250 ms]
10x10pct baked 1by1 177.34 µs [175.21 µs, 179.89 µs]
10x10pct baked old_map_struct 54.137 µs [53.893 µs, 54.413 µs]
10x10pct postcard 1by1 181.21 µs [178.16 µs, 184.71 µs]
10x10pct postcard BoundLocale (Typed) 80.414 µs [79.610 µs, 81.480 µs]
10x10pct postcard BoundLocale (Dyn Box) 83.093 µs [81.931 µs, 84.454 µs]

So it seems adding the Dyn creates only a small percentage change to the performance.

@sffc sffc mentioned this pull request May 6, 2026
15 tasks
@Manishearth

This comment was marked as resolved.

@sffc

This comment was marked as resolved.

@Manishearth

This comment was marked as resolved.

@sffc
Copy link
Copy Markdown
Member Author

sffc commented May 8, 2026

@robertbastian is okay with landing this behind an unstable feature in icu_provider. The displaynames crate will also have an unstable feature that enables the icu_provider unstable feature.

wdyt @Manishearth?

@Manishearth
Copy link
Copy Markdown
Member

I think that's fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants