Improve performance of toUpperCase and toLowerCase #194

Merged
dain merged 2 commits into master from user/dain/update-slice-utf8-touppercase
Apr 3, 2026

@dain dain commented Apr 2, 2026

The prior ASCII fast-path optimization applied only to toLowerCase; this PR extends it to toUpperCase. Additionally, for input that mixes ASCII and non-ASCII, a similar optimization skips the expensive per-character decode for the ASCII sequences. For inputs like öhello, this results in a ~50% speedup.
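
For readers unfamiliar with the idea, here is a minimal sketch of an all-ASCII fast path for upper-casing UTF-8 bytes. It is not the code from this PR: the class name `AsciiUpperCaseSketch` and the method `toUpperCaseAsciiOrNull` are hypothetical, and the sketch works on a plain `byte[]` rather than the library's own types.

```java
public final class AsciiUpperCaseSketch {
    private AsciiUpperCaseSketch() {}

    // Hypothetical helper: returns an upper-cased copy when every byte is
    // ASCII, or null to tell the caller to fall back to the generic
    // code-point path.
    public static byte[] toUpperCaseAsciiOrNull(byte[] utf8) {
        byte[] out = new byte[utf8.length];
        for (int i = 0; i < utf8.length; i++) {
            byte b = utf8[i];
            if (b < 0) {
                // High bit set: this byte belongs to a multi-byte sequence.
                return null;
            }
            // ASCII 'a'..'z' and 'A'..'Z' differ by exactly 0x20.
            out[i] = (b >= 'a' && b <= 'z') ? (byte) (b - 0x20) : b;
        }
        return out;
    }
}
```

The point of the fast path is that no code point is ever decoded: each byte is inspected once and either copied or shifted by 0x20.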

dain added 2 commits April 2, 2026 14:50
Fast-path unchanged ASCII runs inside translateCodePoints so case
conversion no longer decodes every ASCII byte after crossing into the
generic non-ASCII path.

This specifically improves mixed ASCII/non-ASCII inputs such as
öhello/ÖHELLO, where a non-ASCII code point is followed by a long ASCII
tail that is already in the target case.

In the targeted JMH benchmark with repeatCount=1024,
mixed_non_ascii_ascii_noop improved from 13.9 us/op to 6.4 us/op for
both toLowerCase and toUpperCase, about a 53% reduction in time per
operation, while mixed_non_ascii_ascii_change stayed roughly flat at
around 20.6 us/op.
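
The run-skipping idea described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual translateCodePoints implementation: the class `MixedCaseSketch` and its methods are hypothetical, the input is assumed to be well-formed UTF-8, and decoding goes through `java.lang.String` for brevity rather than a hand-written decoder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public final class MixedCaseSketch {
    private MixedCaseSketch() {}

    // Hypothetical sketch: upper-cases well-formed UTF-8, bulk-copying runs
    // of ASCII bytes that are already in the target case.
    public static byte[] toUpperCaseUtf8(byte[] utf8) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(utf8.length);
        int i = 0;
        while (i < utf8.length) {
            // Fast path: extend over ASCII bytes that need no change.
            int runStart = i;
            while (i < utf8.length && utf8[i] >= 0 && !(utf8[i] >= 'a' && utf8[i] <= 'z')) {
                i++;
            }
            if (i > runStart) {
                out.write(utf8, runStart, i - runStart);
                continue;
            }
            // Generic path: decode one code point, convert, re-encode.
            int length = sequenceLength(utf8[i]);
            int codePoint = new String(utf8, i, length, StandardCharsets.UTF_8).codePointAt(0);
            int upper = Character.toUpperCase(codePoint);
            byte[] encoded = new String(Character.toChars(upper)).getBytes(StandardCharsets.UTF_8);
            out.write(encoded, 0, encoded.length);
            i += length;
        }
        return out.toByteArray();
    }

    // Length of a UTF-8 sequence from its lead byte (assumes valid input).
    private static int sequenceLength(byte lead) {
        if (lead >= 0) {
            return 1;
        }
        if ((lead & 0xE0) == 0xC0) {
            return 2;
        }
        if ((lead & 0xF0) == 0xE0) {
            return 3;
        }
        return 4;
    }
}
```

In this sketch an input such as ÖHELLO takes the generic path once, for Ö, and then copies HELLO in a single write, which mirrors the "noop" benchmark case. An input such as öhello still decodes every lowercase ASCII byte, which is in line with the "change" case staying roughly flat.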
@dain dain assigned wendigo and unassigned wendigo Apr 2, 2026
@dain dain requested review from electrum and wendigo April 2, 2026 22:51
@dain dain merged commit f72c712 into master Apr 3, 2026
2 checks passed
@dain dain deleted the user/dain/update-slice-utf8-touppercase branch April 3, 2026 23:38