resolver: reuse BSSMap slots and PackageJSON/TSConfigJSON on bustDirCache #29922

Open
robobun wants to merge 11 commits into main from farm/533bc3a0/fix-bustdircache-leak

Conversation

@robobun

@robobun robobun commented Apr 29, 2026

Collaborator

What

Resolver.bustDirCache() leaked one DirInfo BSSMap slot (and the heap-allocated *PackageJSON / *TSConfigJSON it owned) per call. The call is triggered on every file change in --hot / --watch, every DevServer hot-reload event, every FileSystemRouter.reload(), and the runtime ENOENT-retry path, so growth was unbounded over a dev session.

Cause

BSSMap.remove() only does self.index.remove(hash); the backing slot stays in backing_buf at its old index with live *PackageJSON / *TSConfigJSON pointers. The next getOrPut() for the same key finds nothing, returns Unassigned, and put() takes a fresh slot (backing_buf_used += 1). dirInfoUncached() then heap-allocates a brand-new PackageJSON / TSConfigJSON. The old slot and its heap objects are never reclaimed.

The same applies to the EntriesOption map (bustEntriesCache goes through the same BSSMap.remove()).
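
As a rough illustration of the cause, here is a toy TypeScript model (not the Zig implementation). The class name `LeakySlotMap`, the `simulateBusts` helper, and the `/app/src` key are invented for this sketch. It only shows how an index-only remove() makes every bust-and-repopulate cycle consume a new backing slot:

```typescript
// Toy model of the pre-fix behaviour. Names are illustrative and do not
// mirror the real Zig BSSMap API.
class LeakySlotMap<V> {
  private index = new Map<string, number>(); // key -> slot index
  private backing: V[] = [];                 // append-only slot storage

  // Number of backing slots ever taken (stands in for backing_buf_used).
  get used(): number {
    return this.backing.length;
  }

  // Overwrites the slot if the key is indexed, otherwise takes a fresh slot.
  put(key: string, value: V): number {
    const existing = this.index.get(key);
    if (existing !== undefined) {
      this.backing[existing] = value;
      return existing;
    }
    const slot = this.backing.length;
    this.backing.push(value);
    this.index.set(key, slot);
    return slot;
  }

  // Only forgets the key; the old slot stays in `backing` and is never reused.
  remove(key: string): boolean {
    return this.index.delete(key);
  }
}

// Simulates `cycles` bust/re-resolve rounds for a single directory.
function simulateBusts(cycles: number): number {
  const map = new LeakySlotMap<{ packageJson: object }>();
  map.put("/app/src", { packageJson: {} });
  for (let i = 0; i < cycles; i++) {
    map.remove("/app/src");                   // stands in for bustDirCache()
    map.put("/app/src", { packageJson: {} }); // re-resolve takes a new slot
  }
  return map.used;
}

console.log(simulateBusts(100)); // 101: one slot per bust, plus the original
```

In the real resolver each orphaned slot can also pin a heap-allocated PackageJSON / TSConfigJSON, which this model represents only as a nested object.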

Fix

  • BSSMap.remove() now stashes the old slot in a per-key reclaimable map instead of dropping it on the floor.
  • getOrPut() returns a reclaimed slot with status = .unknown so callers still re-resolve, but put() writes back to the same slot instead of incrementing backing_buf_used.
  • Before overwriting a reclaimed DirInfo, the resolver captures the old package_json / tsconfig_json heap pointers and threads them through to parsePackageJSON / parseTSConfig, which overwrite the existing allocation in place rather than calling .new(). Child DirInfo.enclosing_package_json / enclosing_tsconfig_json pointers keep working because the address is preserved.
  • The two rfs.entries.atIndex(...) checks in resolver.zig now gate on status == .exists so a reclaimed slot is treated as stale (re-iterate the directory, reuse the DirEntry via the in_place path) rather than returning the old listing.
  • hasCheckedIfExists() now tests status != .unknown so a reclaimed-but-unresolved slot reads as "never looked up".
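
The address-preservation point in the third bullet can be sketched in TypeScript, with object identity standing in for a raw pointer. The `PackageJson` / `DirInfo` shapes, `parsePackageJson`, and `bustAndReparse` below are simplified stand-ins invented for this example, not the resolver's actual types or signatures:

```typescript
// Toy stand-ins for the resolver types; field names are illustrative only.
interface PackageJson {
  name: string;
  version: string;
}

interface DirInfo {
  packageJson: PackageJson | null;          // owned by this directory
  enclosingPackageJson: PackageJson | null; // alias into an ancestor's object
}

// With `reuse`, overwrite the existing object in place and return it, so every
// alias keeps pointing at up-to-date data. Without it, allocate a new object.
function parsePackageJson(text: string, reuse: PackageJson | null): PackageJson {
  const parsed = JSON.parse(text) as PackageJson;
  if (reuse === null) {
    return { name: parsed.name, version: parsed.version };
  }
  reuse.name = parsed.name;
  reuse.version = parsed.version;
  return reuse;
}

function bustAndReparse(passReuse: boolean): { sameObject: boolean; childSees: string } {
  const parent: DirInfo = {
    packageJson: parsePackageJson('{"name":"app","version":"1.0.0"}', null),
    enclosingPackageJson: null,
  };
  const child: DirInfo = { packageJson: null, enclosingPackageJson: parent.packageJson };

  // Bust: capture the old object before the parent's slot is overwritten.
  const old = parent.packageJson;
  parent.packageJson = parsePackageJson(
    '{"name":"app","version":"2.0.0"}',
    passReuse ? old : null,
  );

  return {
    sameObject: parent.packageJson === old,
    childSees: child.enclosingPackageJson!.version,
  };
}

console.log(bustAndReparse(true));  // same object; child sees version 2.0.0
console.log(bustAndReparse(false)); // new object; child still sees 1.0.0
```

In a garbage-collected language the stale object merely lingers. In Zig the equivalent choice is between leaking the old allocation and freeing memory that children still alias, which is why the in-place overwrite matters here.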

Because reclaim is per-key (not a generic free list), a directory always goes back to the slot it came from. Child DirInfo.parent indices therefore remain valid — this is the concern that #29919 raised when it skipped the dir_cache half of this leak.
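
A minimal TypeScript sketch of the per-key reclaim idea follows, under the assumption that a slot is parked under its own key on remove() and handed back to that key on the next put(). `ReclaimingSlotMap`, `LookupResult`, and `runCycles` are hypothetical names, and the NotFound / markNotFound handling of the real BSSMap is left out:

```typescript
type Status = "unknown" | "exists";

interface LookupResult {
  index: number | null; // slot to reuse, or null if none is known
  status: Status;
}

class ReclaimingSlotMap<V> {
  private index = new Map<string, number>();       // live key -> slot
  private reclaimable = new Map<string, number>(); // removed key -> parked slot
  private backing: V[] = [];

  get used(): number {
    return this.backing.length;
  }

  // A parked slot is reported with status "unknown" so callers re-resolve.
  getOrPut(key: string): LookupResult {
    const live = this.index.get(key);
    if (live !== undefined) return { index: live, status: "exists" };
    const parked = this.reclaimable.get(key);
    return { index: parked === undefined ? null : parked, status: "unknown" };
  }

  // Prefers the live slot, then the slot parked for this key, then a new one.
  put(key: string, value: V): number {
    let slot = this.index.get(key);
    if (slot === undefined) {
      slot = this.reclaimable.get(key);
      if (slot !== undefined) this.reclaimable.delete(key);
    }
    if (slot === undefined) {
      slot = this.backing.length;
      this.backing.push(value);
    } else {
      this.backing[slot] = value;
    }
    this.index.set(key, slot);
    return slot;
  }

  // Parks the slot under its own key instead of dropping it.
  remove(key: string): boolean {
    const slot = this.index.get(key);
    if (slot === undefined) return false;
    this.index.delete(key);
    this.reclaimable.set(key, slot);
    return true;
  }
}

function hasCheckedIfExists(result: LookupResult): boolean {
  return result.status !== "unknown";
}

// Busts "/a" repeatedly while "/b" (think: a child directory) stays live.
function runCycles(cycles: number): { used: number; slotA: number; checkedAfterBust: boolean } {
  const map = new ReclaimingSlotMap<string>();
  let slotA = map.put("/a", "v0");
  map.put("/b", "child");
  let checkedAfterBust = true;
  for (let i = 0; i < cycles; i++) {
    map.remove("/a");
    checkedAfterBust = hasCheckedIfExists(map.getOrPut("/a"));
    slotA = map.put("/a", `v${i + 1}`);
  }
  return { used: map.used, slotA, checkedAfterBust };
}

console.log(runCycles(100)); // used stays 2, "/a" stays in slot 0
```

Because "/a" always returns to slot 0, anything that recorded that index (such as a child's parent index) stays valid, which a shared free list would not guarantee.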

Verification

FileSystemRouter.reload() over a 9-directory tree, 4000 iterations:

| | per-iteration RSS growth |
| --- | --- |
| before | ~15 KB |
| after | ~2 KB |

The residual is unrelated DirnameStore appends along the re-resolve path.
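
For reference, the per-iteration figure is just the RSS delta divided by the iteration count. The helper below is a hedged sketch of that arithmetic, not the code used for the numbers above. `perIterationGrowthKB`, its parameters, and the fake sampler are assumptions made for illustration; in a real run the sampler would presumably read something like `process.memoryUsage().rss` after a forced GC.

```typescript
// Runs `iterate` `warmup` times unmeasured, then `iterations` times measured,
// and returns the average RSS growth per measured iteration in KB.
function perIterationGrowthKB(
  sampleRssBytes: () => number,
  iterate: () => void,
  iterations: number,
  warmup: number = 50,
): number {
  if (iterations <= 0) throw new RangeError("iterations must be positive");
  for (let i = 0; i < warmup; i++) iterate();
  const before = sampleRssBytes();
  for (let i = 0; i < iterations; i++) iterate();
  const after = sampleRssBytes();
  return (after - before) / iterations / 1024;
}

// Fake process that grows by a fixed number of bytes per reload, so the
// arithmetic can be checked without measuring real memory.
function fakeGrowth(bytesPerIteration: number): { sample: () => number; iterate: () => void } {
  let rss = 64 * 1024 * 1024;
  return {
    sample: () => rss,
    iterate: () => {
      rss += bytesPerIteration;
    },
  };
}

const leaky = fakeGrowth(15 * 1024);
console.log(perIterationGrowthKB(leaky.sample, leaky.iterate, 4000)); // 15
```

Warming up before the first sample keeps one-time allocations out of the measured delta.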

New test: test/js/bun/resolve/bust-dir-cache-leak.test.ts

Related: #29919 handles the *DirEntry / EntryStore side of the same bust cycle with a stale flag on DirEntry; this PR handles the DirInfo / PackageJSON / TSConfigJSON side at the BSSMap layer. They touch the same two generation checks in resolver.zig but are otherwise independent.

@coderabbitai

coderabbitai Bot commented Apr 29, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds reclaimable backing-store slots to the allocator; changes resolver cache repopulation to reuse prior PackageJSON/TSConfigJSON allocations in-place after a busted slot; and adds a regression test that asserts no RSS growth across repeated FileSystemRouter.reload() loops.

Changes

Allocator: reclaimable backing-store slots

| Layer / File(s) | Summary |
| --- | --- |
| Result semantics (src/allocators.zig) | Result.hasCheckedIfExists() now checks status != .unknown instead of inspecting the index sentinel. |
| State (src/allocators.zig) | BSSMap adds reclaimable: std.AutoHashMapUnmanaged(HashKeyType, IndexType), initialized in init() and deinitialized in deinit(). |
| Lookup (src/allocators.zig) | getOrPut() now sets Result.status based on stored index and consults reclaimable to return an .unknown Result with a reclaimed index when a parked slot exists. |
| Put semantics (src/allocators.zig) | put() consumes a parked slot via reclaimable.fetchRemove() when available before allocating a new backing index. |
| Remove semantics (src/allocators.zig) | remove() uses index.fetchRemove() and, when removing a real backing slot, best-effort parks the freed index in reclaimable instead of discarding it. |
| Docs/Notes (src/allocators.zig) | markNotFound() comment clarifies parked/reclaimable slots are intentionally preserved across remove/recreate cycles. |

Resolver: reuse parsed DirInfo JSON + regression test

| Layer / File(s) | Summary |
| --- | --- |
| DirEntry reuse policy (src/resolver/resolver.zig) | In-place reuse of a prior Fs.FileSystem.DirEntry after a busted rfs.entries slot now requires the busted slot’s status to be .exists and entries.generation >= r.generation. |
| Cache reuse capture (src/resolver/resolver.zig) | dirInfoForResolution and dirInfoCachedMaybeLog capture prior dir_cache slot pointers (reuse_package_json, reuse_tsconfig_json) from the existing DirInfo slot. |
| Uncached flow wiring (src/resolver/resolver.zig) | dirInfoUncached signature extended to accept reuse_package_json: ?*PackageJSON and reuse_tsconfig_json: ?*TSConfigJSON and these are forwarded into parsing calls during repopulation. |
| Parsing with reuse (src/resolver/resolver.zig) | parsePackageJSON(..., reuse: ?*PackageJSON) and parseTSConfig(..., reuse: ?*TSConfigJSON) added; when reuse is provided, selected owned heap fields of the prior object are freed, the reused struct is overwritten in place with newly parsed contents, the temporary parse result is destroyed, and the reused pointer is returned to preserve identity. |
| TSConfig extends handling (src/resolver/resolver.zig) | tsconfig.json extends merge logic preserves the reused TSConfigJSON allocation: parent configs are not destroyed if they are the reuse pointer, and merged results are moved into the reused allocation when applicable. |
| Test: regression (test/js/bun/resolve/bust-dir-cache-leak.test.ts) | Adds a test that spawns a subprocess running a Bun.FileSystemRouter reload loop (warmup + 1000 iterations with explicit Bun.gc(true)), measures RSS delta per iteration, and asserts no stderr, exit code 0, and per-iteration growth < 7 KB. |

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: reusing BSSMap slots and PackageJSON/TSConfigJSON allocations to fix the bustDirCache memory leak. |
| Description check | ✅ Passed | The description provides comprehensive coverage of what the PR does, why it was needed (the cause), how it fixes the issue, and verification results, exceeding the template requirements. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@robobun

robobun commented Apr 29, 2026

Collaborator Author
Updated 9:51 AM PT - May 4th, 2026

@robobun, your commit 3baec36 has some failures in Build #51035 (All Failures)


🧪   To try this PR locally:

bunx bun-pr 29922

That installs a local version of the PR into your bun-29922 executable, so you can run:

bun-29922 --bun

@github-actions

Contributor

Found 3 issues this PR may fix:

  1. PANIC on hot-reload when modifying imported modules (release ,debug) and HUGE memory leak (debug). #8513 - Reports crashes and memory leaks during --hot mode when modifying imported modules; bustDirCache is called on every file change, and leaked BSSMap slots accumulate with each reload
  2. Memory Leak running Typescript server on file reload #15857 - Memory grows ~2-20MB per file save in watch/reload scenarios; each save triggers re-resolution via bustDirCache, leaking resolver cache entries
  3. Newly created files not resolvable by Bun.resolveSync during HMR #27864 - Newly created files not resolvable by Bun.resolveSync during HMR; stale BSSMap slots from improper cache invalidation could prevent new files from being found

If this is helpful, copy the block below into the PR description to auto-close these issues on merge.

Fixes #8513
Fixes #15857
Fixes #27864

🤖 Generated with Claude Code

Comment thread src/resolver/resolver.zig
Comment thread src/bun_alloc/bun_alloc.zig
Comment thread src/resolver/resolver.zig
Comment thread test/js/bun/resolve/bust-dir-cache-leak.test.ts Outdated
Comment thread src/resolver/resolver.zig
Comment thread src/resolver/resolver.zig
Comment thread src/bun_alloc/bun_alloc.zig
Comment thread src/allocators.zig Outdated

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/allocators.zig`:
- Around line 697-700: The code is silently dropping errors from
reclaimable.put() (using catch {}), which can leak orphaned slots; replace the
silent catch with explicit error handling: call bun.handleOom() if the error is
error.OutOfMemory, otherwise propagate the error (or convert to a fast-fail)
instead of swallowing it. Update the block around self.index.fetchRemove(...)
and the reclaimable.put(self.allocator, _key, kv.value) call to handle the
returned error (e.g., catch |err| { if (err == error.OutOfMemory)
bun.handleOom(); else return err; }) so the function fails fast on allocation
failures and doesn’t ignore other errors.

In `@src/resolver/resolver.zig`:
- Around line 2702-2729: The reuse branch currently frees some PackageJSON
fields but skips releasing the exports/imports trees and scripts/config, causing
leaks; implement and call a recursive deallocator (e.g.,
PackageJSON.deinitRecursive or freePackageJSON(prev, alloc)) that walks and
frees all owned substructures (prev.exports, prev.imports, prev.scripts,
prev.config and any nodes in their tree/map types, plus the already-handled
main_fields, browser_map, dependencies.map and side_effects variants) before
doing prev.* = pkg; accept an allocator parameter (use bun.default_allocator)
and use defer inside the helper where appropriate to guarantee cleanup on error.

In `@test/js/bun/resolve/bust-dir-cache-leak.test.ts`:
- Around line 1-72: Move the new "bustDirCache reuses DirInfo and EntriesOption
slots across repeated reloads" test into the existing resolve test file instead
of creating a new standalone file; copy the entire test body (including imports
expect/test and bunEnv, bunExe, tempDir usage and the spawn/run.ts logic that
exercises Bun.FileSystemRouter.reload()) into the existing resolve tests, remove
the new file, ensure the test name and timeout (30_000) remain unchanged, and
verify imports don't duplicate existing ones in that file so the test compiles
and runs as part of the existing resolve test suite.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9c9e3355-6d39-4a0c-a38c-5d1b21b4a00e

📥 Commits

Reviewing files that changed from the base of the PR and between d4cd11c and 9628720.

📒 Files selected for processing (3)
  • src/allocators.zig
  • src/resolver/resolver.zig
  • test/js/bun/resolve/bust-dir-cache-leak.test.ts

Comment thread src/bun_alloc/bun_alloc.zig
Comment thread src/resolver/resolver.zig
Comment thread test/js/bun/resolve/bust-dir-cache-leak.test.ts
Comment thread src/bun_alloc/bun_alloc.zig
Comment thread src/resolver/resolver.zig
@robobun robobun force-pushed the farm/533bc3a0/fix-bustdircache-leak branch from fb17145 to f221d3a Compare May 3, 2026 07:53
Comment thread src/resolver/resolver.zig
Comment thread src/resolver/resolver.zig
Comment thread src/resolver/resolver.zig
@robobun robobun force-pushed the farm/533bc3a0/fix-bustdircache-leak branch from 786f3db to a2f2234 Compare May 3, 2026 09:26

@claude claude Bot left a comment

Contributor


All prior feedback looks addressed and the latest pass found nothing new, but this is a non-trivial memory-lifecycle change in the resolver's core BSSMap/DirInfo caching with cross-entry pointer aliasing invariants (and CI is showing build-zig failures on a2f2234), so it warrants a human sign-off.

Extended reasoning...

Overview

This PR changes BSSMap in src/allocators.zig to park freed slots in a per-key reclaimable map so remove() → getOrPut() → put() reuses the same backing slot, and threads previous *PackageJSON / *TSConfigJSON heap pointers through dirInfoUncached / parsePackageJSON / parseTSConfig in src/resolver/resolver.zig so they are overwritten in place instead of leaked on every bustDirCache(). It also adjusts the tsconfig extends merge loop to preserve the reused allocation's address, gates two rfs.entries cache-hit checks on status == .exists, and adds an RSS-growth regression test.

Security risks

None identified. No auth, crypto, network, or untrusted-input handling is touched. The risk class here is memory safety (UAF via enclosing_package_json / enclosing_tsconfig_json aliasing), not security exposure — and one such UAF regression (tsconfig extends + reuse) was caught and fixed during review in bd4dfed.

Level of scrutiny

High. BSSMap is a foundational data structure backing both dir_cache and fs.entries, and the resolver is on the critical path for every import resolution in --hot / --watch / DevServer / bundler. The change relies on a subtle invariant — child DirInfo entries hold raw pointers into parent slots, so reuse must preserve addresses and never free aliased allocations. The PR went through eight rounds of inline review that surfaced and fixed a real UAF regression, several incomplete-cleanup leaks, and a put()-after-markNotFound() slot-orphaning edge case. That iteration history alone suggests this is not a mechanical change.

Other factors

  • The robobun CI status comment reports build-zig failures on six platforms for commit a2f2234 (the most recent substantive commit before the retrigger gate push). I can't verify whether a3a600d cleared them.
  • All prior inline comments (mine and CodeRabbit's) are marked resolved, and the author has documented the accepted residual gaps (ExportsMap nodes, tsconfig file-text buffer, jsx.import_source, scripts/config) as follow-ups.
  • The new test exercises the slot-reuse path but does not cover the package.json / tsconfig.json reuse branches or the extends merge fix — the author noted ASAN verification for the latter, but a maintainer should confirm that's sufficient.
  • The change interacts with companion PR #29919 (DirEntry/EntryStore side); a human should confirm the two land in a compatible order.

robobun added 11 commits May 4, 2026 10:28
…ache

bustDirCache() calls BSSMap.remove() which only dropped the hash→index
mapping; the backing_buf slot (and its *PackageJSON / *TSConfigJSON heap
pointers) stayed orphaned, and the next getOrPut()+put() for the same key
allocated a fresh slot plus fresh heap objects. Every --hot/--watch file
change, DevServer hot-reload, and FileSystemRouter.reload() burned one
DirInfo slot and one EntriesOption slot per affected directory, plus a new
PackageJSON/TSConfigJSON when the directory had one — unbounded growth
over a dev session.

Fix: BSSMap.remove() now stashes the old slot in a per-key reclaimable map.
getOrPut() returns that slot with status=.unknown so the caller re-resolves,
and put() writes back to the same slot instead of incrementing
backing_buf_used. Before overwriting a reclaimed DirInfo, the resolver
captures the old package_json/tsconfig_json and passes them through to
parsePackageJSON/parseTSConfig, which overwrite the existing heap struct in
place rather than calling .new(). Child DirInfo.enclosing_* pointers keep
working because the address is preserved.

The rfs.entries read path now checks result.status so a reclaimed slot is
treated as stale (re-iterate the directory, reuse the DirEntry in place)
rather than returning the old listing.

… on markNotFound

- parsePackageJSON now frees the previous value's source.contents,
  name/version, main_fields/browser_map backing, dependencies map, and
  side_effects map/glob before overwriting the reused struct, so each
  hot-reload of a directory containing package.json no longer leaks the
  prior file text and hash-map arrays. Matches the paths.deinit() already
  done in parseTSConfig's reuse path.
- markNotFound no longer drops the reclaimable entry: getOrPut
  early-returns on NotFound before consulting reclaimable, so keeping it
  lets a delete→recreate→bust cycle reuse the original slot instead of
permanently orphaning it.

…st timeout

The captured reuse_package_json/reuse_tsconfig_json is dropped when the
underlying file was deleted (or fails to parse) between busts. That's a
one-off leak per delete event; freeing would UAF through child enclosing_*
aliases, and parking it back on the DirInfo would make the resolver think
the file still exists. Documented the tradeoff instead.

Spawning a debug+ASAN subprocess has several seconds of startup overhead
before the reload loop runs; the default 5s per-test budget wasn't enough,
so the test timed out on both with- and without-fix lanes. 50+1000 reloads
still gives ~14-18 KB/iter without the fix vs ~1-2 KB/iter with it.

When a tsconfig has "extends", the merge loop sets info.tsconfig_json to
the deepest-base allocation and bun.destroy()s every intermediate — which
now includes the reused pointer that child DirInfo.enclosing_tsconfig_json
still aliases. Skip destroying the reused allocation in the loop and copy
the merged result back into it afterward so the address invariant holds
for the extends case too.

Also free the []string value slices in prev.paths before deinit(), matching
the extends-merge cleanup pattern.

getOrPut() early-returns on NotFound without consulting reclaimable, so a
caller that proceeds past .not_found (dirInfoForResolution does) reaches
put() with result.index == NotFound. put() would then allocate a fresh
slot and discard the parked one that markNotFound intentionally kept.
Pull the reclaimable.fetchRemove up front so the parked slot is used
instead.

Each pattern in SideEffects.glob/.mixed.globs is separately duped by
normalizePathForGlob(); ArrayListUnmanaged.deinit() only frees the list
backing, not the item strings. Matches the paths.values() fix already in
parseTSConfig.

browser_map has dupe_keys=false but each key is caller-duped via
allocator.dupe(u8, fs.normalize(...)) in PackageJSON.parse, so
StringMap.deinit() skips them. main_fields keys are static slices from
r.opts.main_fields and must not be freed.

…alloc gaps

parseMemberExpressionForJSX heap-allocates factory/fragment as []string
when set in JSON; guard on jsx_flags since defaults are static slices.
Remaining per-field sub-allocations (jsx.import_source, tsconfig file
text, ExportsMap nodes, scripts/config) need a centralized
deinitOwnedFields() helper and are tracked as follow-ups.
@Jarred-Sumner Jarred-Sumner force-pushed the farm/533bc3a0/fix-bustdircache-leak branch from a3a600d to 3baec36 Compare May 4, 2026 10:28

@claude claude Bot left a comment

Contributor


No new issues found in this revision, but given the pointer-aliasing invariants this depends on (child enclosing_package_json/enclosing_tsconfig_json must stay valid across in-place overwrites, the extends-merge UAF that was fixed mid-review, the changed hasCheckedIfExists() semantics affecting all BSSMap callers) and the acknowledged residual gaps, this warrants a human pass before merge.

Extended reasoning...

Overview

This PR reworks resolver cache lifecycle to stop leaking one DirInfo BSSMap slot (plus its heap *PackageJSON/*TSConfigJSON) per bustDirCache() call. It touches three areas: src/bun_alloc/bun_alloc.zig (adds a per-key reclaimable map to BSSMap and changes getOrPut/put/remove/hasCheckedIfExists semantics), src/resolver/resolver.zig (threads reuse_package_json/reuse_tsconfig_json through dirInfoUncached → parsePackageJSON/parseTSConfig to overwrite allocations in place, with manual cleanup of ~10 owned sub-allocations, plus reworks the tsconfig extends-merge loop to preserve the reused address), and a new RSS-growth regression test.

Security risks

None identified. This is internal memory management; no parsing of untrusted input was added, no auth/crypto/permissions surface. The risk class here is correctness (UAF / heap corruption) rather than security per se — and one real UAF regression (tsconfig extends + reused allocation being bun.destroy()'d while child DirInfos still aliased it) was caught and fixed during review.

Level of scrutiny

High. This is core resolver infrastructure on the hot path of every --hot/--watch/DevServer reload. The correctness of the fix rests on a subtle invariant — that overwriting a heap struct in place keeps child DirInfo.enclosing_* raw-pointer aliases valid — and that invariant interacts non-trivially with the tsconfig extends-merge loop, the markNotFound → put path, and the changed meaning of hasCheckedIfExists() (which has a third caller in fs.zig that was deliberately left for #29919). The PR went through ~9 rounds of fixes during review (UAF, several incomplete-cleanup leaks, slot orphaning on the NotFound path), which is a signal that the surface is delicate.

Other factors

  • All inline review comments are resolved; the current bug-hunting pass found nothing new.
  • There are acknowledged, documented residual gaps left for follow-up (ExportsMap tree nodes, tsconfig file-text buffer, jsx.import_source, scripts/config) — a human should sign off on that scoping decision.
  • The latest CI build (#50592) shows a build-zig failure on Windows x64 plus three test failures; even if unrelated/flaky, that should be confirmed before merge.
  • This PR is designed to compose with #29919 (the *DirEntry side); a reviewer familiar with both should confirm the interaction at the two shared generation checks in resolver.zig and the untouched fs.zig site.

This is well outside the "simple/mechanical" bar for auto-approval.
