Revert "remove unused test files" #7915

Closed
sffc wants to merge 1 commit into unicode-org:main from sffc:restore-testdata

@sffc
Member

sffc commented Apr 25, 2026

This reverts commit 8862cd2.

This is a clean revert of the commit in #7906 that deleted the files @Manishearth was using.

Changelog

N/A

@Manishearth
Member

Note: I already updated my PR to not need those files.

Don't have a strong opinion on what we should include in test data. Generally I think our manually curated set was intentionally curated. Agree we should discuss it.

@sffc
Member Author

sffc commented Apr 25, 2026

I included haw a long time ago because it exercises the romanlow date format, and I think, at least at the time, it wasn't in the bakeddata, so the test needed this data in order to run in datagen.

I included cs for a similar reason. I feel fairly certain that we had a test for this data at some point, and it's possible that the test got dropped during one of the refactorings of legacy datetime.

The other ML model data files were included in order to exercise more code paths. I'm not sure whether they were actually being used to exercise those code paths.

Not sure about ethiopic-amete-alem.

Member

robertbastian left a comment

My cleanup removed 13MB of unused test text files from the repo. It didn't remove any tests, cause any tests to fail, or result in a diff in testdata. There is absolutely no reason to revert it.

> I included haw a long time ago because it exercises the romanlow date format, and I think, at least at the time, it wasn't in the bakeddata, so the test needed this data in order to run in datagen.

The test does not exist. If you need a specific file in an upcoming PR, just add it back.

> I included cs for a similar reason. I feel fairly certain that we had a test for this data at some point, and it's possible that the test got dropped during one of the refactorings of legacy datetime.

I didn't remove "cldr-dates-full/main/cs/ca-gregorian.json" because that test uses it. It just doesn't use "cldr-dates-full/main/cs/timeZoneNames.json".

> The other ML model data files were included in order to exercise more code paths. I'm not sure whether they were actually being used to exercise those code paths.

These have been unused since #3669, when we started generating testdata only for Thai models.

> Not sure about ethiopic-amete-alem.

Sure about what? It's obviously unused, otherwise there would have been a data diff. Is it a bug that it's unused, and it should be checked to match ethiopian? Maybe, in which case you're welcome for discovering that, and we should add it back when we fix this.

@Manishearth
Member

I'm okay with the cleanup overall. Occasionally checking why we have things, and deleting them if there's no reason, is good. It seems like we deleted a few things here that we shouldn't have, but "let's remove unused things and re-add a few important ones" is an okay way to do it. When reviewing I didn't notice haw was removed, and if things are unused, they're unused.

@Manishearth
Member

There is one straightforward reason to revert this, though. When reviewing I said I wanted Shane to look at it, you had an urgency on the PR, and I said fine.

My repeatedly-held position is that it is good to land things fast, but the corollary of that is we need to be okay with disagreements in followups. This is a pretty cleanly separable followup.

That said, this is internal code that doesn't affect the API, so it really doesn't matter much whether we land this PR first, discuss it, and then land something that does it partially, or discuss it on this PR, tweak it, and land some subset. The latter option is fewer PRs.

sffc deleted the restore-testdata branch April 30, 2026 22:45
@sffc
Member Author

sffc commented Apr 30, 2026

We discussed this and agreed to add back files when they are needed. Also, we opened #7925 to track the ethiopic-amete-alem data.
