
ptx: use char counts for before-chunk sizing in get_output_chunks #12685

Open
sylvestre wants to merge 1 commit into uutils:main from sylvestre:ptx-fix-panic-multibyte-before

@sylvestre

Contributor

The max_before_size assert compared against before.len() (byte length) while max_before_size is measured in chars, panicking on multibyte input like 'éé word'. The tail-chunk budget (max_tail_size) had the same byte/char mismatch, shrinking the tail too much and dropping a word that fits. Use char counts in both places, matching the after chunk.

Fixes #10893
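
The byte/char mismatch described above can be illustrated with a minimal standalone sketch. It is not the actual ptx code: the helpers `char_len`, `fits_in_chars`, and `remaining_chars`, and the budget values, are hypothetical and only demonstrate why a limit measured in chars must be compared against a char count rather than `str::len()`.

```rust
/// Hypothetical helper: length of a string in chars (Unicode scalar values),
/// as opposed to `str::len()`, which returns the length in bytes.
fn char_len(s: &str) -> usize {
    s.chars().count()
}

/// Hypothetical helper: does `chunk` fit a budget that is measured in chars?
fn fits_in_chars(chunk: &str, max_chars: usize) -> bool {
    char_len(chunk) <= max_chars
}

/// Hypothetical helper: chars left in `budget` after `used` is accounted for.
/// `saturating_sub` avoids an underflow panic when `used` exceeds the budget.
fn remaining_chars(budget: usize, used: &str) -> usize {
    budget.saturating_sub(char_len(used))
}

fn main() {
    // "éé" written with escapes so the encoding is unambiguous:
    // each U+00E9 takes 2 bytes in UTF-8.
    let before = "\u{e9}\u{e9}";
    let max_before_size = 3; // assumed budget, in chars

    // Byte length exceeds the budget even though the char count does not,
    // so an assert based on `before.len()` would fire on this input.
    assert!(before.len() > max_before_size);
    assert!(fits_in_chars(before, max_before_size));

    // The same mismatch shrinks a remaining budget: with 6 chars in total,
    // a byte-based count leaves 2, a char-based count leaves 4,
    // and only the latter has room for the 4-char word "word".
    let byte_based = 6usize.saturating_sub(before.len());
    let char_based = remaining_chars(6, before);
    assert!(char_len("word") > byte_based);
    assert!(char_len("word") <= char_based);

    println!("bytes: {}, chars: {}", before.len(), char_len(before));
}
```

Note that `chars().count()` counts Unicode scalar values, not display width or grapheme clusters; the sketch only covers the byte versus char distinction raised in this PR.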

@cakebaker

Contributor

Hm, the linked issue is already fixed. Is this PR still necessary?

@sylvestre

Contributor Author

It is a follow-up.

The max_before_size assert compared against before.len() (byte length)
while max_before_size is measured in chars, panicking on multibyte input
like 'éé word'. The tail-chunk budget (max_tail_size) had the same
byte/char mismatch, shrinking the tail too much and dropping a word that
fits. Use char counts in both places, matching the after chunk.

Fixes uutils#10893
@sylvestre force-pushed the ptx-fix-panic-multibyte-before branch from ced2a38 to 17ead12 on June 7, 2026 19:54

Development

Successfully merging this pull request may close these issues.

ptx <<< "🎉 34054698701234657890123456789 0" panics

2 participants