Skip to content

Fix panic converting over-long Windows paths to UTF-16#31694

Merged
Jarred-Sumner merged 19 commits into
mainfrom
farm/b67ae5f9/fix-kernel32-path-overflow
Jun 3, 2026
Merged

Fix panic converting over-long Windows paths to UTF-16#31694
Jarred-Sumner merged 19 commits into
mainfrom
farm/b67ae5f9/fix-kernel32-path-overflow

Conversation

@robobun

@robobun robobun commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator

Fixes a Windows crash reported via crash telemetry (BUN-39DM): Panic: range end index 49151 out of range for slice of length 49150, in bun_paths::string_paths::to_kernel32_pathconvert_utf8_to_utf16_in_buffer, reached from node:fs (PathLike::os_path_kernel32).

This is the same bug as #20258 (fs.existsSync crashing with index out of bounds: index 49151, len 49151 for paths of 49151–98302 chars — the Zig-side manifestation of the identical overflow); its exact repro is included in the tests.

Fixes #20258

Repro

On Windows, any node:fs call with an ASCII path between 32767 and 98302 characters, e.g.:

fs.existsSync("C:\\" + "a".repeat(49200));        // panic (osPathKernel32)
fs.mkdirSync("C:\\" + "a".repeat(40000));          // panic (WPathBuffer)

Reproduced on Linux through the conversion itself (it's platform-independent slice math) with the Windows buffer shape:

panic: convert_utf8_to_utf16_in_buffer: buf too small (have 49146 u16 for 49203 input bytes)

(49146 = the 49151-u16 view of the 98302-byte PathBuffer, minus the \\?\ prefix and NUL — the same arithmetic as the 49150/49151 in the crash report, which predates the assert.)

Cause

Path validation at the JS boundary (Valid::path_string_length) bounds the UTF-8 byte length against MAX_PATH_BYTES (98302 on Windows), but the converted path is written into fixed-size UTF-16 buffers: WPathBuffer holds 32767 units and os_path_kernel32's u16 view of PathBuffer holds 49151. One UTF-16 unit can take up to 3 UTF-8 bytes, so the byte bound admits ASCII paths of up to 98302 units. simdutf's convert API takes only an output pointer — it writes however many units the input needs — so over-long input either trips the capacity assert or (in older builds) writes past the slice and panics slicing the result. The Zig original has the same latent bug (silent overflow); the port surfaced it.

Fix

  • strings::fits_in_wide_path_buffer (src/paths/string_paths.rs): can the path's UTF-16 form — plus the longest converter prefix (\??\UNC\), a trailing slash, and the NUL — fit PATH_MAX_WIDE (32767) units? No path that fails this can exist on NT, so nothing valid is rejected. O(1) for normal paths; the simdutf (SIMD) UTF-16 length is only computed when the byte length exceeds the limit. On malformed bytes that length is an estimate, so such over-long input can pass the check — the bounds-checked conversion below stays the backstop and fails safe.
  • PathLike::slice_w / os_path / os_path_kernel32 (src/runtime/node/types.rs) return Result<_, NameTooLong>, checking the normalized input before converting; call sites map it per-op the way the Zig-side fix for this crash (fix(windows): fs.existsSync/access/mkdir crash on paths of 49151-98302 bytes #27775) did: existsfalse ("over PATH_MAX_WIDE — can't exist on disk"), recursive mkdirENAMETOOLONG with syscall: "mkdir", copyFile/cpENAMETOOLONG with syscall: "copyfile", each with the offending path attached. Rejecting at the conversion call sites rather than generically at the PathLike parse boundary keeps per-op error attribution — in particular fs.watch's own over-long-path handling (syscall: "watch" from its cwd-join check) stays intact.
  • try_convert_utf8_to_utf16_in_buffer (src/bun_core/string/immutable.rs): checked conversion that never writes past the output buffer. simdutf is only entered when the output provably fits, and the WTF-8 fallback re-checks capacity per write — stray continuation bytes expand to one U+FFFD each, exceeding the simdutf length estimate, a hole the previous assert missed. The panicking convert_utf8_to_utf16_in_buffer now delegates to it, contract unchanged.
  • to_w_path_maybe_dir (src/paths/string_paths.rs, the chokepoint for all wide-path conversions) fails safe to an empty path for over-long input instead of overflowing, so internal non-PathLike callers (resolver, install, watcher, …) get a clean syscall error rather than a panic; same guard for the two copy-based normalizers. This mirrors the existing slice_z_with_force_copy handling of over-long POSIX paths.
  • Rust unit tests in src/paths/string_paths.rs exercising the Windows buffer shapes on every platform (cargo test -p bun_paths; they also run in the rust:miri CI lane — the simdutf externs are satisfied by pure-Rust scalar stubs in the test module since the C++ implementation only exists in the full binary link).

Cargo.lock: adds the bstr entry for bun_bin that is already declared in its Cargo.toml on main (any cargo invocation regenerates it).

Verification

Tests:

  • All platforms: cargo test -p bun_paths — exact-capacity fill, over-long ASCII and invalid WTF-8 failing safe to empty, multi-byte input longer in bytes than the buffer converting, kernel32 prefixing, the production crash shape (49151-u16 buffer → prefix-only fail-safe) and a just-under-the-limit conversion (guards against over-rejection), plus fits_in_wide_path_buffer's bounds. Fails on the unfixed conversion: with the chokepoint guard reverted, the three over-long tests panic (buf too small (have 49146 u16 for 49203 input bytes)).
  • Windows lane (test/js/node/fs/fs-path-length.test.ts, which already covers the sibling normalizePathWindows hardening): existsSync returns false instead of crashing (including the exact fs.existsSync crashes with specific path lengths #20258 boundary lengths 49150/49151/64503/98302/98303); recursive mkdirSync, copyFileSync (src and dest), cpSync, fs.promises.mkdir, and Buffer paths all throw ENAMETOOLONG; a multi-byte path long in bytes (90152) but within the UTF-16 bound (30152 units, split into components under NTFS's 255-unit limit) keeps working.
  • test/js/node/watch/fs.watch.test.ts — the over-long relative path test that caught the first iteration's parse-boundary approach in CI — passes unchanged with the call-site design.

cargo check passes for linux-gnu and windows-msvc targets; clippy clean on the four touched crates; bun bd test test/js/node/fs/fs-path-length.test.ts, fs-mkdir.test.ts, glob/path-length.test.ts, and test/js/node/path/ all pass.

CI status

The previous build (59909) was green on everything this diff touches — all build lanes, all 16 Windows x64/x64-baseline test shards (the lanes that run the fs-path-length.test.ts Windows suite), and cargo miri test — with six red jobs all traced to runner problems (a streams chunk-coalescing timing flake on debian-13; broken Docker daemon, legacy Docker CLI, a Squid container, and a full MinIO disk on the macOS agents). Details in the triage comment. A fresh build is running for the latest commit.

Path validation bounded only the UTF-8 byte length (MAX_PATH_BYTES,
98302 on Windows), but node:fs converts paths into fixed-size UTF-16
buffers: WPathBuffer holds 32767 units and osPathKernel32's view of
PathBuffer holds 49151. An ASCII path of 32767..98302 chars passed
validation and the conversion wrote past the wide buffer — simdutf's
convert API takes no output length — panicking with e.g. "range end
index 49151 out of range for slice of length 49150" instead of
returning an error.

- Valid::path_utf16_units rejects paths over 32757 UTF-16 units with
  ENAMETOOLONG at the PathLike boundary on Windows (strings, Buffers,
  and the async slice path), mirroring the existing byte-length check.
- try_convert_utf8_to_utf16_in_buffer: checked conversion that never
  writes past the output buffer, including in the WTF-8 fallback where
  stray continuation bytes expand to one U+FFFD each (a case the old
  assert missed). The panicking variant now delegates to it.
- to_w_path_maybe_dir and the copy-based normalizers fail safe to an
  empty path for over-long input, so internal (non-PathLike) callers
  get a clean syscall error instead of a panic.
- pathsInternals.toKernel32Path testing hook exercises the Windows
  buffer shape on every platform.

Cargo.lock: adds the bstr entry for bun_bin already declared in its
Cargo.toml.
@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@robobun, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 42 minutes and 48 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4d540e3b-a26d-4d5f-bcf1-04d3a9899124

📥 Commits

Reviewing files that changed from the base of the PR and between 1c0a402 and 6ca4ca9.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • src/bun_core/string/immutable.rs
  • src/paths/string_paths.rs
  • src/runtime/node/node_fs.rs
  • src/runtime/node/types.rs
  • src/sys/lib.rs

Walkthrough

Adds defensive UTF-8→UTF-16 conversion checks and a fallible NameTooLong contract, updates path normalization/resolution to fail safely on buffer overflow, and maps conversion failures to ENAMETOOLONG or benign results across filesystem APIs; includes Windows-focused tests for boundary behavior.

Changes

Windows UTF-16 Path Buffer Overflow Prevention

Layer / File(s) Summary
UTF-8→UTF-16 conversion buffer safety
src/bun_core/string/immutable.rs, src/bun_core/string/immutable/unicode.rs
Introduces try_convert_utf8_to_utf16_in_buffer with empty-input handling, pre-fit checks, bounds-safe SIMD and WTF-8 fallback paths, and element_length_wtf8_with_replacement_into_utf16; convert_utf8_to_utf16_in_buffer and convert_utf8_to_utf16_in_buffer_z use the checked helper and handle failures safely.
Path validation, normalization, and test stubs
src/paths/string_paths.rs, src/paths/Cargo.toml
Adds fits_in_wide_path_buffer, switches conversions to the checked API with fail-safe empty WStr on overflow, guards normalization helpers against oversized inputs, and includes test-only simdutf stubs plus unit tests validating boundary and multibyte behavior.
Windows path resolution buffer checks
src/paths/resolve_path.rs
Adds buffer-length guards to resolve_with_external_buf, resolve_with_external_buf_z, resolve_cwd_with_external_buf, and resolve_cwd_with_external_buf_z, returning original input or NameTooLong on overflow to avoid writing past caller buffers.
Error type and PathLikeExt contract
src/runtime/node/types.rs
Adds pub struct NameTooLong and makes slice_w, os_path, and os_path_kernel32 return Result<..., NameTooLong>, gating fast paths and propagating fit checks across cwd-resolution and normalization.
Filesystem operation error handling
src/runtime/node/node_fs.rs, src/sys/lib.rs
Filesystem call sites now catch NameTooLong and return ENAMETOOLONG (or Ok(false) for exists); access() performs a pre-conversion wide-fit check and rejects oversized paths before kernel calls.
Windows-only JS tests
test/js/node/fs/fs-path-length.test.ts
Adds tests that assert existsSync returns false across UTF-16 boundary lengths, multiple fs ops reject ENAMETOOLONG, drive-letter edge cases behave as expected, and multibyte paths within UTF-16 bounds reach syscall errors instead of buffer validation failures.

Possibly related PRs

  • oven-sh/bun#31339: Both PRs modify convert_utf8_to_utf16_in_buffer in src/bun_core/string/immutable.rs by tightening buffer-capacity handling and delegating to checked conversion paths.

Suggested reviewers

  • alii
  • RiskyMH
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Fix panic converting over-long Windows paths to UTF-16' accurately and concisely describes the primary change—addressing a crash when converting long UTF-8 paths to UTF-16 buffers on Windows.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description comprehensively documents the bug, root cause, fix strategy, and verification across all modified files with precise technical details.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added the claude label Jun 2, 2026
@robobun

robobun commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author
Updated 9:05 PM PT - Jun 2nd, 2026

@robobun, your commit 6ca4ca9 has some failures in Build #60036 (All Failures)


🧪   To try this PR locally:

bunx bun-pr 31694

That installs a local version of the PR into your bun-31694 executable, so you can run:

bun-31694 --bun

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Found 1 issue this PR may fix:

  1. fs.existsSync crashes with specific path lengths #20258 - fs.existsSync crashes with index out of bounds: index 49151, len 49151 on Windows when paths are 49151–98302 bytes, the exact UTF-16 buffer overflow this PR fixes

If this is helpful, copy the block below into the PR description to auto-close this issue on merge.

Fixes #20258

🤖 Generated with Claude Code

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

This PR may be a duplicate of:

  1. fix(windows): fs.existsSync/access/mkdir crash on paths of 49151-98302 bytes #27775 - Fixes the same BUN-39DM crash (fs.existsSync/access/mkdir panic on paths of 49151-98302 bytes) with equivalent guards, but targeting the old Zig codebase

🤖 Generated with Claude Code

@robobun

robobun commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

Re the duplicate-checker: #27775 is the pre-port fix for the same crash (BUN-39DM / #20258), written against the old Zig tree (src/bun.js/node/*.zig, src/string/immutable/paths.zig) — those files are no longer compiled after the Rust port, so it can't land as-is. This PR is the port-era equivalent and agrees with its design where they overlap:

  • same fit check (UTF-16 unit bound with a byte-length short-circuit, exact SIMD count only for >32K inputs, headroom for the \\?\ prefix + NUL);
  • same observable semantics: existsSyncfalse ("over PATH_MAX_WIDE — can't exist on disk"), other fs ops → ENAMETOOLONG.

It differs in where the check lives: #27775 made osPathKernel32 fallible and handled error.NameTooLong at its three call sites (access/exists/mkdirRecursive); this PR validates once at the PathLike boundary, which also covers the WPathBuffer routes (osPath/sliceW: cp, copyFile dest, non-recursive mkdir, …) that crash the same way at a lower threshold (32767 units), plus a fail-safe at the conversion chokepoint for non-PathLike internal callers.

If this merges, #27775 can be closed.

@Jarred-Sumner Jarred-Sumner left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you write a Rust test instead?

Comment thread test/js/node/fs/fs-path-length.test.ts Outdated
Comment thread test/js/node/fs/fs-path-length.test.ts Outdated
The generic UTF-16 length check at the PathLike parse boundary broke
fs.watch's error attribution: its over-long-path regression test
expects ENAMETOOLONG with syscall 'watch' from the watcher's cwd-join
check, which the parse-time throw pre-empted with a syscall-less error.

Move the rejection into the conversion helpers instead, mirroring the
approach of the Zig-side fix for the same crash (#27775):

- strings::fits_in_wide_path_buffer: can the path's UTF-16 form (plus
  worst-case prefix and NUL) fit PATH_MAX_WIDE? Byte-count fast path,
  exact SIMD count only for >32K inputs.
- PathLike::slice_w/os_path/os_path_kernel32 return Result<_,
  NameTooLong> and check the normalized input before converting.
- Call sites map it per-op: exists -> false (such a path can't exist
  on disk), mkdir (recursive) -> ENAMETOOLONG syscall mkdir, copyFile
  and cp -> ENAMETOOLONG syscall copyfile, with the offending path
  attached.
Comment thread test/js/node/fs/fs-path-length.test.ts Outdated
Comment thread src/paths/string_paths.rs
Per review: test the conversion bounds with cargo tests in bun_paths
instead of a bun:internal-for-testing binding + subprocess JS test.
simdutf's C++ implementation is only linked into the full binary, so
the two externs the conversion path uses are satisfied with faithful
pure-Rust scalar stubs, which also keeps the tests runnable under Miri
(bun_paths is in the rust:miri crate list).

The tests cover: exact-capacity fill, over-long ASCII and invalid
WTF-8 failing safe to empty (these panic without the chokepoint
guard), multi-byte input longer in bytes than the buffer converting,
kernel32 prefixing, the production crash shape (49151-u16 buffer,
prefix-only fail-safe, just-under-the-limit conversion), and
fits_in_wide_path_buffer's exact bounds.

Also per review: the multi-byte Windows acceptance test now uses a
90003-byte path (past the unit limit in bytes, so the exact UTF-16
length must be computed) asserted via non-swallowing statSync; the
#20258 boundary loop uses Buffer.alloc; a stale comment pointing at
the removed Valid-boundary design now references PathLikeExt.
@robobun

robobun commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

@Jarred-Sumner done — replaced the bun:internal-for-testing binding + subprocess JS test with Rust unit tests in src/paths/string_paths.rs (ce29ae5).

cargo test -p bun_paths couldn't link before because simdutf's C++ implementation only exists in the full binary link, so the test module satisfies the two externs the conversion path uses (simdutf__convert_utf8_to_utf16le_with_errors, simdutf__utf16_length_from_utf8) with faithful pure-Rust scalar stubs — which also keeps the tests runnable under Miri (bun_paths is in the rust:miri crate list, so they run in that CI lane).

Verified fail-before: with the chokepoint guard in to_w_path_maybe_dir reverted, to_w_path_overlong_yields_empty, to_w_path_overlong_invalid_utf8_yields_empty, and to_kernel32_path_overlong_windows_sized_buffer (the production crash shape: 49151-u16 buffer) all panic; with the fix they pass. The Windows-gated node:fs behavior tests in fs-path-length.test.ts are unchanged.

Comment thread test/js/node/fs/fs-path-length.test.ts
Comment thread src/paths/string_paths.rs
stat routes through slice_z (the u8 path), so statSync never reaches
fits_in_wide_path_buffer; copyFileSync checks both paths against it
and surfaces the syscall's ENOENT when the length guard accepts.
Comment thread test/js/node/fs/fs-path-length.test.ts
A single 30000-unit component can fail CopyFileW with
ERROR_FILENAME_EXCED_RANGE (ENAMETOOLONG) on some filesystems, making
the ENOENT assertion ambiguous with the over-rejection it rules out.
150 x 200-unit segments keep every component valid so non-existence is
the only possible syscall failure.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All prior feedback addressed — no further issues from me. Deferring to a human given the scope (trait-signature change on PathLikeExt, the #[no_mangle] simdutf test stubs, and core Windows path conversion), not because of any outstanding concern.

Extended reasoning...

Overview

Fixes a Windows panic (BUN-39DM / #20258) where node:fs paths between ~32k and ~98k ASCII chars overflowed fixed-size UTF-16 conversion buffers. Touches bun_core::string::immutable (new checked try_convert_utf8_to_utf16_in_buffer), bun_paths::string_paths (new fits_in_wide_path_buffer, fail-safe guards in three normalizers, ~220 lines of Rust unit tests with pure-Rust simdutf stubs), runtime/node/types.rs (PathLikeExt::{slice_w, os_path, os_path_kernel32} now return Result<_, NameTooLong>), runtime/node/node_fs.rs (5 call sites map the new error), a Windows-gated JS test block, and an incidental Cargo.lock regen.

Security risks

None identified. The change strictly narrows accepted input (over-long paths now fail cleanly with ENAMETOOLONG/false instead of overflowing a stack buffer) and the conversion fail-safes degrade to empty strings that the consuming syscall rejects. No new unsafe blocks beyond the test-only #[no_mangle] extern stubs.

Level of scrutiny

Medium-high. This is a real crash fix with good test coverage (Rust unit tests run on all platforms + Miri; Windows-gated JS integration tests), but it (a) changes a trait signature with multiple call sites, (b) refactors a core string-conversion primitive used beyond paths, (c) introduces #[unsafe(no_mangle)] linker-symbol stubs in a #[cfg(test)] module to satisfy simdutf externs — an unusual pattern whose interaction with build configurations a maintainer should sanity-check, and (d) went through a maintainer-requested redesign mid-PR (testing hook → Rust unit tests). None of this is concerning, just beyond what I'd auto-approve.

Other factors

I reviewed this across four prior rounds; every comment is now resolved (the final commit 6026f94 addresses the per-component NTFS limit point on the multi-byte acceptance test). The bug-hunting system found nothing on the current revision. A maintainer (referenced in the thread re: the testing-hook replacement) has already been engaged, so this is ready for their sign-off.

Miri rejects no_mangle calls whose declared and defined signatures
differ nominally, not just in ABI layout (the CI miri lane failed with
'calling a function with return type StubResult passing return place
of type SIMDUTFResult'). Take bun_simdutf_sys as a dev-dependency and
return the real type. Verified with bun run rust:miri -p bun_paths.
Comment thread src/runtime/node/node_fs.rs
Comment thread src/runtime/node/types.rs
sys::access on Windows converts through to_kernel32_path, so over-long
paths fell through to the conversion fail-safe and surfaced as the
wrong error; reject them with ENAMETOOLONG up front like the other
call sites (#27775 covered access too).

slice_z_with_force_copy's Windows absolute branches append a \?prefix or a cwd drive root plus a NUL without bounds checks, so
drive-letter paths in the last few byte lengths below MAX_PATH_BYTES
wrote past the 98302-byte PathBuffer. Skip the fast path when the
appended form cannot fit and let the plain copy (which fits without
the prefix) or the existing too-long fallback handle it.
Comment thread src/runtime/node/types.rs Outdated
The 4-byte budget didn't cover resolve_cwd_with_external_buf_z's
cwd-root prepend when the cwd is a UNC path (roots are arbitrarily
long). Paths that can't exist on NT (> ~32757 UTF-16 units) now skip
the fast path entirely — far below any overflow window — while long
multi-byte paths still qualify via the exact-length arm.
Comment thread src/runtime/node/types.rs Outdated
The cwd filesystem root prepended for missing-drive-letter paths is
arbitrarily long for UNC cwds, so no static budget at the call sites
can bound the copy — a near-MAX_PATH_BYTES single-separator path with
a UNC cwd still wrote past the PathBuffer (including through
os_path_kernel32's separator branch, where the resolver runs before
the fit check). Check inside the resolvers and surface NameTooLong;
slice_z falls through to its plain copy, os_path_kernel32 maps it to
the existing NameTooLong error. Also bound the Z variant's plain
copy + NUL for completeness.
Comment thread src/runtime/node/types.rs
Comment thread src/sys/lib.rs
Comment thread src/paths/resolve_path.rs
- os_path_kernel32's separator branch checks the input before the cwd
  resolution + normalization: UNC-shaped inputs pass through the
  resolver untouched and reached normalize_buf at full MAX_PATH_BYTES
  length, whose root handling writes one past the input.
- sys::access strips an existing \?\ / \??\ prefix before the fit
  check — NodeFS::access routes through slice_z which already
  prepended it, so the budget double-counted the prefix and rejected
  paths just under the limit that previously worked.
- resolve_with_external_buf{,_z} (the non-cwd siblings hardened in the
  previous commit) get the same join bounds check, failing safe to the
  un-joined input since their signatures are infallible.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/js/node/fs/fs-path-length.test.ts (1)

31-37: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Async assertions are incorrect and will not catch regressions.

Lines 32 and 36 use expect(async () => await ...).toThrow(), which does not work for async functions. Async functions return promises that reject rather than throwing synchronously, so .toThrow() will never fail—the test will always pass even if the promise rejects with the wrong error or doesn't reject at all.

🐛 Correct async assertion pattern
   it("rejects overly long multi-byte paths in async readFile", async () => {
-    expect(async () => await fs.promises.readFile(cjkPath)).toThrow("ENAMETOOLONG");
+    await expect(fs.promises.readFile(cjkPath)).rejects.toThrow("ENAMETOOLONG");
   });
 
   it("rejects overly long multi-byte paths in async stat", async () => {
-    expect(async () => await fs.promises.stat(cjkPath)).toThrow("ENAMETOOLONG");
+    await expect(fs.promises.stat(cjkPath)).rejects.toThrow("ENAMETOOLONG");
   });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/js/node/fs/fs-path-length.test.ts` around lines 31 - 37, The async tests
use expect(async () => await fs.promises.readFile(cjkPath)).toThrow(...) which
never properly asserts promise rejections; update both assertions that reference
fs.promises.readFile(cjkPath) and fs.promises.stat(cjkPath) to use the
async-aware pattern—e.g. await
expect(fs.promises.readFile(cjkPath)).rejects.toThrow("ENAMETOOLONG") (or await
expect(...).rejects.toHaveProperty("code", "ENAMETOOLONG") for a more precise
check) so the promise rejection is correctly awaited and asserted against.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/runtime/node/types.rs`:
- Around line 1099-1113: The panic branch in the match inside
PosixToWinNormalizer::resolve_cwd_with_external_buf_z handling currently
discards the error; change the Err(_) arm to capture the error (e.g., Err(e))
and include that error in the panic message so unexpected errors are surfaced
(for example: panic!("Error while resolving path: {:?}", e)), leaving the
NameTooLong branch and the subsequent ZStr::from_buf usage unchanged.

In `@test/js/node/fs/fs-path-length.test.ts`:
- Around line 167-169: The test uses expect(async () => await
fs.promises.mkdir(kernel32Long, { recursive: true })).toThrow("ENAMETOOLONG")
which won't catch async rejections; change the assertion to the async Jest
pattern by awaiting expect on the promise returned by fs.promises.mkdir and
using .rejects.toThrow (e.g., await expect(fs.promises.mkdir(kernel32Long, {
recursive: true })).rejects.toThrow("ENAMETOOLONG")) so the rejection from
fs.promises.mkdir is properly asserted; update the spec named "rejects over-long
paths in async fs.promises.mkdir" accordingly.
- Around line 132-136: The single test "existsSync handles every path length
across the buffer boundaries (`#20258`)" currently loops over multiple lengths and
should be parameterized so each length runs as its own test; replace the
for-loop with test.each (or describe.each) over the array [49150, 49151, 64503,
98302, 98303], creating a separate test case that calls
fs.existsSync(Buffer.alloc(len, "A").toString()) and asserts toBe(false) for
each length, and include the length in the test name for clear failure
diagnostics.

---

Outside diff comments:
In `@test/js/node/fs/fs-path-length.test.ts`:
- Around line 31-37: The async tests use expect(async () => await
fs.promises.readFile(cjkPath)).toThrow(...) which never properly asserts promise
rejections; update both assertions that reference fs.promises.readFile(cjkPath)
and fs.promises.stat(cjkPath) to use the async-aware pattern—e.g. await
expect(fs.promises.readFile(cjkPath)).rejects.toThrow("ENAMETOOLONG") (or await
expect(...).rejects.toHaveProperty("code", "ENAMETOOLONG") for a more precise
check) so the promise rejection is correctly awaited and asserted against.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 622293f3-93e1-4c36-a4b9-a453aaf8aa39

📥 Commits

Reviewing files that changed from the base of the PR and between 8cf3737 and f419002.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • src/bun_core/string/immutable.rs
  • src/paths/Cargo.toml
  • src/paths/resolve_path.rs
  • src/paths/string_paths.rs
  • src/runtime/node/node_fs.rs
  • src/runtime/node/types.rs
  • src/sys/lib.rs
  • test/js/node/fs/fs-path-length.test.ts

Comment thread src/runtime/node/types.rs
Comment thread test/js/node/fs/fs-path-length.test.ts Outdated
Comment thread test/js/node/fs/fs-path-length.test.ts
Comment thread test/js/node/fs/fs-path-length.test.ts
Comment thread src/runtime/node/types.rs
The exact UTF-16 length estimate undercounts invalid input (stray
continuation bytes count zero units but become one U+FFFD each in the
WTF-8 fallback), so arbitrary-length garbage Buffer paths passed the
check and could overflow the u8-space copies it guards in slice_z and
os_path_kernel32. A fitting valid path takes at most 3 bytes per unit,
so capping at 3x the unit bound rejects the garbage without
over-rejecting anything valid.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/paths/string_paths.rs`:
- Around line 333-338: The UTF-16 length helper can undercount malformed WTF-8
and let oversized invalid buffers slip through fits_in_wide_path_buffer; update
fits_in_wide_path_buffer to first validate the input UTF-8 (or compute a
fallback-aware unit count that treats each invalid byte as U+FFFD) before
trusting strings::element_length_utf8_into_utf16(utf8), so the slow-path check
correctly detects lengths > MAX_UNITS and returns false for malformed buffers;
adjust logic used by PathLikeExt::slice_w() / os_path_kernel32() accordingly and
add a regression test that constructs vec![0x80u8; crate::PATH_MAX_WIDE -
windows::NT_UNC_OBJECT_PREFIX.len() - 1] (i.e. MAX_UNITS+1 invalid bytes) to
assert fits_in_wide_path_buffer returns false and ENAMETOOLONG behavior is
preserved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9f794ce9-0072-4ac6-a54d-effb82897902

📥 Commits

Reviewing files that changed from the base of the PR and between f419002 and 7116ec1.

📒 Files selected for processing (3)
  • src/paths/string_paths.rs
  • src/runtime/node/types.rs
  • test/js/node/fs/fs-path-length.test.ts

Comment thread src/paths/string_paths.rs
robobun added 2 commits June 2, 2026 10:44
fits_in_wide_path_buffer trusted simdutf's UTF-16 length estimate,
which undercounts malformed input (stray continuation bytes count zero
units but convert to one U+FFFD each), so invalid Buffer paths in the
32758..98271-byte window slipped past the unit bound and degraded to
the conversion fail-safe's sentinel error instead of ENAMETOOLONG.
Count with the conversion's own WTF-8-replacement semantics instead —
exact for valid WTF-8 (lone surrogates included) and for garbage.
Comment thread src/paths/resolve_path.rs Outdated
The joined result of resolve{,_cwd}_with_external_buf feeds
normalize_buf, whose UNC-root handling writes one past the input when
the root has no trailing separator (bare \\server\share cwds). The
join bounds allowed the result to fill the buffer exactly; require one
byte of headroom — a join that long can't exist on NT anyway.
Comment thread src/bun_core/string/immutable.rs
The NUL-terminating variant (used by the Windows profilers' path
widening) kept the pre-fix assert + unbounded non-validating simdutf
write; its TODO pointed at the sibling this PR already rewrote.
Over-long input now fails safe to an empty result instead of writing
past the buffer.
@robobun

robobun commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

CI status for build 60036 (head 6ca4ca91, the merge of main) — final: 284 of 286 jobs passed, zero test failures. The binary-size gate passes now that the branch includes main's WebKit upgrade (the earlier "+1 MB" was the stale-base comparison). Every debian (incl. ASAN), Windows, macOS-x64, and darwin-26 lane passed, along with cargo miri test over the new bun_paths unit tests.

The two darwin-14 aarch64 — test-bun shards ended expired — no darwin-14-aarch64 agent ever picked them up through four reschedule cycles, so they never executed (the same suites passed on that lane in this PR's previous run). They need a maintainer to hit retry once that agent pool has capacity, or an override; there is no test output to act on.

Earlier triage of runner problems (Docker daemons, MinIO disk, streams timing flake) is in this comment's history; all of those lanes are green on this build. Review-wise: all threads resolved, and both changes-requested items (Rust tests instead of the JS hook; no scalar Unicode counting — fits_in_wide_path_buffer uses simdutf's SIMD length) are addressed.

@Jarred-Sumner Jarred-Sumner left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not add this non-simd fits_in_wide_path_buffer code. Find a better approach.

Replaces the scalar WTF-8 unit counter with the existing SIMD
element_length_utf8_into_utf16 and deletes the counter. simdutf's length
is exact for valid WTF-8; on malformed bytes it only estimates, so a
malformed over-long path can now pass the fit check — the bounds-checked
conversion downstream still never overflows and fails safe to an empty
path, so such input gets a generic syscall error instead of the precise
ENAMETOOLONG. Valid paths keep identical behavior. The 3x byte cap stays:
no fitting path can exceed it regardless of content.
@robobun

robobun commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

Done in 4d00960 — the scalar WTF-8 unit counter is deleted, and fits_in_wide_path_buffer now uses the existing SIMD length (element_length_utf8_into_utf16 → simdutf's utf16_length_from_utf8):

utf8.len() <= MAX_UNITS
    || (utf8.len() <= 3 * MAX_UNITS
        && strings::element_length_utf8_into_utf16(utf8) <= MAX_UNITS)

Most paths never reach the SIMD call (the byte-length arm short-circuits anything ≤ 32757 bytes, and the 3× cap rejects anything no fitting path could occupy without scanning).

The scalar counter had existed only to make the check exact for malformed bytes, where simdutf's length is an estimate (stray continuation bytes count zero but convert to one U+FFFD each). That exactness isn't needed for safety: the conversion itself is bounds-checked and fails safe to an empty path, which was already the backstop for every internal caller. So a malformed over-long path now surfaces as a generic syscall error instead of a precise ENAMETOOLONG — valid paths (the only ones real code produces) keep identical behavior, net −33/+20 lines.

cargo test -p bun_paths (12/12, also under Miri), clippy/fmt, both-target cargo check, and the fs JS suites all pass.

@Jarred-Sumner Jarred-Sumner merged commit 43fff14 into main Jun 3, 2026
76 of 78 checks passed
@Jarred-Sumner Jarred-Sumner deleted the farm/b67ae5f9/fix-kernel32-path-overflow branch June 3, 2026 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fs.existsSync crashes with specific path lengths

2 participants