Skip to content

Simplify test sources#7906

Merged
robertbastian merged 6 commits intounicode-org:mainfrom
robertbastian:simplify
Apr 23, 2026
Merged

Simplify test sources#7906
robertbastian merged 6 commits intounicode-org:mainfrom
robertbastian:simplify

Conversation

@robertbastian
Copy link
Copy Markdown
Member

Split from #7882

Changelog

N/A

@robertbastian robertbastian marked this pull request as ready for review April 23, 2026 10:47
@robertbastian robertbastian requested review from a team, Manishearth and sffc as code owners April 23, 2026 10:47
Copy link
Copy Markdown
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just looking at hte first commit, I'm not really sure what's going on. The include_files macro seems great (feel free to land separately if you'd like). I don't quite understand the rest of this "cleanup". What is being changed? Why? What's bad with what we have now?

.unwrap()
.read_and_parse_toml::<SegmenterRuleTable>(rules_file)
.expect("The data should be valid!");
let segmenter =
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: are we not caching anymore? don't we want to cache?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need caching because these are only loaded once, and even if not, the TOML parsing is only a tiny part of the work that's being done in the implementation.

These don't belong in the icuexportdata source, which I'm trying to slim down.

@robertbastian
Copy link
Copy Markdown
Member Author

I believe your question refers to the second commit, because I don't see what you wouldn't understand about the others. The second commit changes the download-repo-sources script to use the existing AbstractFs code, which deals with networking, unzipping, etc., so we don't have to maintain all of that twice.

@Manishearth
Copy link
Copy Markdown
Member

I mean, I was just reviewing the simplify commit in the commit-by-commit view; This PR has very little information on what it is trying to do; I have to piece it together from the code, which is fine but slower, and commit-by-commit view is nicer for that.

@robertbastian
Copy link
Copy Markdown
Member Author

simplify just simplifies the generated code, and makes some small changes here and there like introducing some internal constructors which really shouldn't be controversial

Copy link
Copy Markdown
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I think I see the pertinent changes.

This

  • Moves download-repo-sources into an icu_provider_source "test" so that it can use AbstractFS. I'm not a huge fan of using the test runner for binaries but I guess it works.
  • Uses AbstractFS
  • Adds a macro that makes filepahts betters
  • Stops using AbstractFS for one-off data loads. I'm still not fully sure of this but it's intentional.

Overall seems positive. Want to wait for @sffc's thoughts especially on turning download-repo-sources into a non-test. But I think this is fine.

@robertbastian
Copy link
Copy Markdown
Member Author

I'm not a huge fan of using the test runner for binaries but I guess it works.

We already do this for make-testdata.

@robertbastian
Copy link
Copy Markdown
Member Author

This is blocking other PRs, can you not just approve?

Copy link
Copy Markdown
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. For future reference, if you want reviews to be faster, please, please provide some context on what you're doing in the PR body. It was not easy to figure out what this PR was trying to do; and why.

@robertbastian robertbastian merged commit e85cdb2 into unicode-org:main Apr 23, 2026
34 checks passed
@robertbastian robertbastian deleted the simplify branch April 23, 2026 22:52
@Manishearth
Copy link
Copy Markdown
Member

Hmm, this PR removed haw/ca-gregory data, which I was using in #7905 for testing. Was that intentional?

@sffc
Copy link
Copy Markdown
Member

sffc commented Apr 25, 2026

Hmm, this PR removed haw/ca-gregory data, which I was using in #7905 for testing. Was that intentional?

#7915

Copy link
Copy Markdown
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the first commit for a while and am not really sure what it is doing but 🤷.

Moving the tool to be a unit test tool is fine with me. We did it with make-testdata already. I consider it a flaw in Rust that there isn't an easy way to export APIs for local users that aren't included in a crates.io release.

The deleting of the "unused" data files should not have been in this PR. They exist for reasons. Maybe those reasons are flawed, but that should be debated in a standalone PR, not in a "refactor to be merged asap" PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants