node:zlib: copy dictionary into owned buffer to prevent use-after-free on detach#30120
node:zlib: copy dictionary into owned buffer to prevent use-after-free on detach#30120robobun wants to merge 2 commits into
Conversation
WalkthroughNative zlib Context now copies and owns user dictionaries (instead of referencing JS buffers); size accounting, init, and teardown were updated accordingly. The JS binding no longer includes ChangesDictionary Ownership & Safety
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
Updated 5:53 AM PT - May 25th, 2026
❌ @robobun, your commit e7203f1 has 2 failures in
🧪 To try this PR locally: bunx bun-pr 30120That installs a local version of the PR into your bun-30120 --bun |
There was a problem hiding this comment.
I didn't find any correctness issues — the dupe-on-init / free-on-close pattern is straightforward and matches Node — but since this changes native allocator ownership for a buffer read from the threadpool and drops a codegen values[] slot shared by NativeBrotli/NativeZstd, it's worth a quick human look.
Extended reasoning...
Overview
This PR fixes a use-after-free in node:zlib where NativeZlib.init() stored a borrowed slice into the user's dictionary ArrayBuffer and later read it from the threadpool (on Z_NEED_DICT) or on reset(). Detaching the ArrayBuffer (e.g. .transfer(n)) frees the backing store synchronously, leaving a dangling pointer. The fix copies the dictionary into a bun.default_allocator.dupe()'d buffer in Context.init(), frees it in Context.close() (and defensively at the top of init() for re-init), and removes the now-redundant dictionary entry from the codegen values[] array. Three new subprocess regression tests run under Malloc=1 so ASAN poisons freed ArrayBuffer storage.
Security risks
This is a fix for a memory-safety bug (heap-use-after-free reachable from JS via ArrayBuffer.prototype.transfer). The new code introduces an owned allocation; I verified the only place it's freed is freeDictionary(), which is called from close() (reached via deinit()) and at the start of init(), and that len > 0 is consistently used to distinguish the "" sentinel from a real allocation so the string literal is never freed. The @constCast is safe because dupe() returns mutable memory stored in a []const u8 field. No new attack surface.
Level of scrutiny
Medium-high. The diff is small (~30 logic lines) and the pattern is textbook, but it's native memory management for a buffer that is read off-thread, and it touches the shared generate() codegen helper in zlib.classes.ts (so the dictionary cached-value slot is also removed from NativeBrotli and NativeZstd). I grepped and confirmed nothing references dictionarySetCached/dictionaryGetCached anywhere, so the removal looks safe, but someone familiar with the class codegen should sanity-check.
Other factors
- Not covered by CODEOWNERS.
- The two inline findings are nits (GC
estimatedSize()accounting,test.concurrentconvention) and don't block. - Existing
pending_close/write_in_progressmachinery already serializesclose()against in-flight threadpool work, so freeing the owned dictionary inclose()doesn't introduce a new race relative to the old borrowed-pointer behavior — it strictly removes one.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/js/node/zlib/zlib-dictionary-detach.test.ts`:
- Around line 124-132: The run function currently returns raw stderr which
causes spurious failures on ASAN shards; after awaiting proc.stderr.text() (the
stderr variable derived from proc), strip or filter out the known ASAN startup
banner (e.g., lines that match /WARNING: ASAN.*interferes/ or the single-line
"WARNING: ASAN interferes..." pattern) before returning, so update run to
compute a filteredStderr from stderr and return that filtered value (keeping
function name run and existing proc/bunExe/env usage intact).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: daca7867-c12e-44e4-a849-6ea1fa4033b1
📒 Files selected for processing (3)
src/bun.js/api/zlib.classes.tssrc/bun.js/node/zlib/NativeZlib.zigtest/js/node/zlib/zlib-dictionary-detach.test.ts
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/js/node/zlib/zlib-dictionary-detach.test.ts`:
- Around line 1-165: The new Bun-only test cases (inflateFixture, resetFixture,
deflateResetFixture, run, env/asanOptions and the three test.concurrent blocks)
should be folded into the existing zlib dictionary test file rather than added
as a new file: copy the constants (asanOptions, env), the three fixture strings
(inflateFixture, resetFixture, deflateResetFixture), the run helper function,
and the three test.concurrent invocations into the existing zlib dictionary test
suite, remove duplicate imports (expect/test, bunEnv, bunExe) if already
present, and ensure names like run(), inflateFixture, resetFixture,
deflateResetFixture and the test titles remain unchanged so the assertions and
environment setup work as in the diff; delete the new standalone file after
merging.
- Around line 143-165: Replace the three nearly-identical test.concurrent blocks
with a single parameterized suite using describe.each to iterate the cases;
create an array of cases containing the case name string and the corresponding
fixture variable (inflateFixture, resetFixture, deflateResetFixture) and then
inside the describe.each callback call test.concurrent(name, async () => { const
{ stdout, stderr, exitCode } = await run(fixture); expect stderr/stdout/exitCode
assertions remain unchanged }); this removes duplication while keeping the
existing assertions and uses the existing run function and fixture identifiers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0c483aee-0357-452c-80d4-8acfba51889a
📒 Files selected for processing (1)
test/js/node/zlib/zlib-dictionary-detach.test.ts
There was a problem hiding this comment.
Thanks — all four of my earlier points are addressed and the ownership change looks correct to me. Deferring final sign-off to a human since this is native allocator-owned memory read from the zlib threadpool, and the robobun CI comment still shows build-zig failures on b1e0151 that a6f641b (test-only) wouldn't have changed.
Extended reasoning...
Overview
This PR fixes a use-after-free in NativeZlib where Context.dictionary held a borrowed slice into the user's ArrayBuffer, which can be synchronously freed via ArrayBuffer.prototype.transfer() while the threadpool still reads it in inflateSetDictionary() / reset(). The fix copies the dictionary into an allocator-owned buffer (bun.default_allocator.dupe) in Context.init(), frees it in a new freeDictionary() helper called from both init() (re-init) and close(), drops the now-redundant dictionary entry from the codegen values[], and updates estimatedSize() to include the owned bytes. A new ASAN-focused test file spawns three subprocesses under Malloc=1 to exercise the inflate/Z_NEED_DICT, inflateRaw.reset(), and deflate.reset() paths.
Security risks
The PR removes a heap-use-after-free; the replacement is a straightforward owned-copy pattern matching Node.js's std::vector<unsigned char>. I don't see new exposure: the only new allocation is dupe(u8, d) of a buffer whose length came from a JS ArrayBuffer, freed on close()/re-init(), guarded by len > 0 so the static "" sentinel is never passed to free. The @constCast is safe because the slice originates from dupe (mutable) and is only stored as []const u8 for read-side convenience.
Level of scrutiny
This is small (~20 Zig lines + codegen one-liner + tests) and conceptually simple, but it changes allocator ownership for memory that is read off the JS thread by doWorkInflate() on the workpool. Lifetime bugs here manifest as UAF/double-free, so it warrants a maintainer glance even though I believe it's correct. The pending_close/write_in_progress serialization in CompressionStream should prevent close() (and thus freeDictionary()) from racing with an in-flight doWork(), and init() is only reachable from JS during construction, so the freeDictionary() at the top of init() is defensive rather than a live race.
Other factors
All four of my earlier inline comments (estimatedSize accounting, test.concurrent, Windows Malloc=1 gating, detect_leaks=0 for the LSan-via-Malloc=1 interaction) were addressed in 917aecd / c5e19e4 / b1e0151 / a6f641b and are reflected in the current diff. CodeRabbit's remaining suggestions were reasonably declined with precedent. The bug-hunting pass on the latest revision found nothing. The one open question is CI: robobun's status comment still reports build-zig/build-bun failures on b1e0151, and a6f641b only touched the test file, so those need to go green (or be confirmed as infra flake) before merge.
26cbdc6 to
87295e6
Compare
There was a problem hiding this comment.
All earlier feedback (estimatedSize accounting, test.concurrent, Windows Malloc=1 gating, LSan detect_leaks=0) is addressed and the fix itself reads correctly to me — leaving final sign-off to a maintainer since this changes native allocator ownership for a buffer read from the threadpool.
Extended reasoning...
Overview
The PR fixes a heap-use-after-free in node:zlib by changing Context.dictionary from a borrowed slice into the user's ArrayBuffer to an allocator-owned copy (bun.default_allocator.dupe), freed in close() via a new freeDictionary() helper. The now-redundant dictionary entry is removed from the codegen values[] in zlib.classes.ts, and estimatedSize() is updated to account for the owned bytes. A new ASAN-targeted test file exercises the three affected paths (threadpool Z_NEED_DICT, inflateRaw.reset(), deflate.reset()) under Malloc=1.
Security risks
This is a memory-safety hardening — it eliminates a UAF rather than introducing attack surface. The new code path is a straightforward dupe-on-init / free-on-close with a len > 0 guard so the "" sentinel is never freed; close() is reached from deinit() so the copy is released even if zlib init fails (mode becomes .NONE but freeDictionary() still runs after the switch). I don't see new injection, double-free, or leak risks.
Level of scrutiny
Moderate-to-high: the diff is small and the approach mirrors Node.js (ZlibContext::dictionary_ as std::vector), but it is native Zig touching allocator lifecycle for data consumed on a worker thread, and it removes a codegen-cached value. That's the kind of ownership change a Bun maintainer should explicitly sign off on rather than have auto-approved.
Other factors
All four of my earlier inline comments and all CodeRabbit comments are resolved in the current diff (estimatedSize includes dictionary.len; tests use test.concurrent; Malloc=1 is gated off Windows; ASAN_OPTIONS appends allow_user_segv_handler=1:symbolize=0:detect_leaks=0). The bug-hunting system found no issues on this revision. CI has been retriggered after a Windows agent workspace race; no outstanding review threads remain.
There was a problem hiding this comment.
All earlier feedback is addressed and the ownership change looks correct to me, but since this alters native memory lifetime for a buffer read from the threadpool, I'd like a human to give it a final once-over before merge.
Extended reasoning...
Overview
This PR fixes a use-after-free in node:zlib where NativeZlib.init() stored a borrowed slice into the user's dictionary ArrayBuffer and later dereferenced it from the threadpool (on Z_NEED_DICT) and on reset(). Detaching the backing ArrayBuffer (e.g. transfer(n)) freed the storage out from under the native handle. The fix copies the dictionary into an allocator-owned buffer in Context.init(), frees it in Context.close() / on re-init via a new freeDictionary() helper, drops the now-redundant dictionary cached value from the codegen values[], and updates estimatedSize() to account for the owned copy. A new ASAN-focused test file spawns three subprocesses (inflate Z_NEED_DICT, inflateRaw.reset(), deflate.reset()) under Malloc=1 to lock in the regression.
Security risks
The change is itself a memory-safety hardening. I walked the new lifecycle: dupe() on init, freeDictionary() is idempotent (guards on len > 0 and resets to ""), close() always frees (including the mode = .NONE early-init-failure path via deinit() -> stream.close()), and the @constCast is safe because the slice originated from dupe(). I don't see a new UAF, double-free, or leak introduced. No auth/crypto/permission surface is touched.
Level of scrutiny
Moderate-to-high. node:zlib is a hot, widely-used module and the dictionary buffer is read off-thread, so ownership/lifetime changes here deserve a second pair of eyes even when the diff is small. The Zig change is only ~20 lines and mirrors Node's std::vector approach, which keeps the review surface manageable.
Other factors
All prior review threads (my estimatedSize and test.concurrent nits, my Windows Malloc=1 / LSan detect_leaks=0 blocker, and CodeRabbit's ASAN-banner / file-placement / .each suggestions) are resolved in the current revision, and the bug-hunting pass found nothing new. No CODEOWNERS cover these paths. CI has been retriggered a couple of times for infra reasons (agent-pool expirations, Windows workspace race) rather than test failures. Given the native-memory + threadpool combination I'm deferring rather than shadow-approving.
There was a problem hiding this comment.
All prior feedback is addressed and the fix looks correct to me, but since this changes native buffer ownership/lifetime on a path that's read from the threadpool, I'd like a human to sign off rather than auto-approving.
Extended reasoning...
Overview
This PR fixes a heap-use-after-free in NativeZlib: the user-supplied dictionary ArrayBuffer was previously borrowed as a raw slice and read lazily from the threadpool (on Z_NEED_DICT) and on reset(), but ArrayBuffer.prototype.transfer() can synchronously free the backing store. The fix copies the dictionary into an allocator-owned buffer in Context.init(), frees it in close() via a new idempotent freeDictionary() helper, removes the now-redundant dictionary entry from the codegen values[], and updates estimatedSize() to account for the owned bytes. A new ASAN-focused test file exercises the inflate/Z_NEED_DICT, inflateRaw reset, and deflate reset paths under Malloc=1.
Security risks
The change is itself a memory-safety hardening (eliminates a UAF reachable from user JS). I don't see new risks introduced: freeDictionary() guards on len > 0 and resets to "" so it's safe to call repeatedly; close() is already serialized against in-flight threadpool work via the existing write_in_progress/pending_close machinery in CompressionStream, so the new free inherits that protection; and init() is only invoked once from the JS wrapper, so the defensive pre-free there is harmless. The @constCast is sound because the slice originated from allocator.dupe(u8, ...).
Level of scrutiny
This is native Zig memory-ownership code on a path that is read concurrently from a worker thread. Even though the diff is small (~20 LoC of logic) and mirrors Node.js's std::vector approach exactly, lifetime changes around threadpool-accessed state are the kind of thing where a second pair of human eyes is cheap insurance. That pushes me to defer rather than auto-approve.
Other factors
All earlier review feedback (mine and CodeRabbit's) has been addressed and resolved: estimatedSize() now includes the dictionary length, tests use test.concurrent, Malloc=1 is gated off Windows, and detect_leaks=0 / allow_user_segv_handler=1 are appended to the subprocess ASAN_OPTIONS. The bug-hunting system found no issues on the current revision. The remaining commits since the last substantive change are CI retriggers only.
…e on detach Context.dictionary borrowed the user-supplied dictionary ArrayBuffer's backing store (RawSlice) and read it lazily from the threadpool (inflateSetDictionary on Z_NEED_DICT) and on reset(). Caching the JS view object does not prevent the underlying ArrayBuffer from being detached - e.g. ArrayBuffer.prototype.transfer(newLength) synchronously frees the old backing store, leaving the native handle with a dangling pointer. Copy the dictionary into an owned Box<[u8]> in Context::init() and drop it in close(), matching Node.js (ZlibContext::dictionary_ is a std::vector<unsigned char>). The JS-side cached reference is no longer load-bearing and is not set anymore.
0838d6e to
066289e
Compare
…onary SAFETY comments estimated_size() is invoked from JSC's concurrent GC marker threads, so reading Context through JsCell there would violate the single-JS-thread aliasing invariant. Revert to the constant estimate. The setDictionary/inflateSetDictionary SAFETY comments now describe the owned Box<[u8]> instead of a rooted ArrayBuffer borrow.
There was a problem hiding this comment.
All prior feedback is addressed and the fix looks correct to me, but since this changes buffer ownership across an FFI/threadpool boundary in native code, it's worth a human sign-off.
Extended reasoning...
Overview
This PR fixes a use-after-free in node:zlib by changing Context.dictionary from a borrowed bun_ptr::RawSlice<u8> (pointing into the user's JS ArrayBuffer) to an owned Box<[u8]> copied at init() time. The change touches src/runtime/node/zlib/NativeZlib.rs (~30 lines of actual logic, plus updated field/SAFETY comments and removal of the now-unneeded js::dictionary_set_cached call) and adds test/js/node/zlib/zlib-dictionary-detach.test.ts with three ASAN-focused subprocess fixtures covering the threadpool Z_NEED_DICT path, inflateRaw.reset(), and deflate.reset().
Security risks
The PR removes a memory-safety hazard rather than introducing one — the previous borrowed-slice design could be turned into a heap-use-after-free via ArrayBuffer.prototype.transfer(). The new owned-copy model matches Node.js (ZlibContext::dictionary_ is a std::vector<unsigned char>) and is strictly safer. No auth/crypto/permissions surface is touched. The owned buffer is freed in close() and on re-init() (via assignment drop), so there's no leak.
Level of scrutiny
This is native Rust code that hands raw pointers across FFI to zlib and is read from the WorkPool thread, so it warrants careful review even though the diff is small. The change itself is mechanical (RawSlice → Box<[u8]>, RawSlice::new(d) → Box::from(d), .slice() → deref), and the PR has been through three weeks of iteration with all review feedback addressed — including reverting the estimated_size() change for GC-marker-thread safety and updating the three stale SAFETY comments in e7203f1. I'm confident it's correct, but memory-ownership changes in FFI/threadpool-adjacent runtime code are exactly where a second pair of human eyes is cheap insurance.
Other factors
The bug-hunting system found no issues in the current revision. All inline review threads (mine and CodeRabbit's) are resolved. The CI failures reported against 066289e (musl build-bun, Windows transpiler.test.js, asan complex-workspace.test.ts) are unrelated to zlib and appear to be pre-existing flakes. No CODEOWNERS cover these paths.
|
Status for reviewers: the fix is verified green on the lanes that exercise it.
All review threads are addressed and resolved. Ready for a maintainer to merge. |
|
Closing as superseded: main commit cd1ad59 (part of #31339, "Hardening round 4") independently applies the same fix — That makes this PR's src change a no-op rebase conflict against an identical fix, so there's nothing left for it to deliver. |
Problem
Context.dictionaryborrows the user-supplied dictionary ArrayBuffer's backing store (bun_ptr::RawSlice<u8>insrc/runtime/node/zlib/NativeZlib.rs):and reads it later from the threadpool in
do_work_inflate()(wheninflate()returnsZ_NEED_DICT) and onreset()(viaset_dictionary()).Caching the JS view object (
js::dictionary_set_cached) keeps the view alive but does not prevent the underlying ArrayBuffer from being detached.ArrayBuffer.prototype.transfer(newLength)with a different length reallocates and synchronously frees the old backing store (viaGigacage::freeinJSC::arrayBufferCopyAndDetach), leaving the native handle with a dangling pointer.Repro
Under ASAN with
Malloc=1(routes JSC ArrayBuffer allocations through system malloc so freed regions are poisoned):The same stale pointer is read on
reset()for DEFLATE/DEFLATERAW/INFLATERAW (viadeflateSetDictionary/inflateSetDictionary), and on the GUNZIP multi-member path which callsreset()from the threadpool.Fix
Copy the dictionary into an owned
Box<[u8]>inContext::init()and drop it inclose(), matching Node.js (whereZlibContext::dictionary_is astd::vector<unsigned char>). The JS-side cached reference is no longer load-bearing and is no longer set.estimated_size()now includes the owned copy.Verification
test/js/node/zlib/zlib-dictionary-detach.test.tsspawns subprocesses withMalloc=1and exercises:Z_NEED_DICTon the threadpoolinflateRaw.reset()after detachdeflate.reset()after detach (also verifies the compressed output round-trips with a clean copy of the dictionary)Before fix (
bun bd, ASAN, at current main): all three subprocesses abort withAddressSanitizer: heap-use-after-free.After fix: all pass. Existing
test-zlib-dictionary.js,test-zlib-dictionary-fail.js, and the zlib test suite (minus two pre-existing brotli/zstd streaming timeouts on debug-ASAN that also fail on main) pass.cargo clippy -p bun_runtimeandcargo fmtare clean.Related but separate: #28250 (in/out buffers in
write(), since superseded by thependingInput/pendingOutputcached values), #30118 (writeStatebuffer ininit()).