Fix use-after-free in AggregateError creation for multi-error module loads#31399
Fix use-after-free in AggregateError creation for multi-error module loads#31399robobun wants to merge 1 commit into
Conversation
…eated process_fetch_log collected the per-message BuildMessage/ResolveMessage wrappers in a heap Vec<JSValue> before creating the AggregateError. The conservative GC scan only covers the stack and registers, so a collection triggered while later wrappers were still being created could sweep the earlier ones and free their native payloads; printing or inspecting the resulting AggregateError then read freed memory. Use a stack array for the staging buffer, matching VirtualMachine.zig's errors_stack, so the wrappers stay rooted until createAggregateError runs.
|
Warning Review limit reached
Your plan includes 5 reviews of capacity. Refill in 10 minutes and 56 seconds. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more review capacity refills, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
Comment |
|
Updated 12:57 PM PT - May 25th, 2026
❌ @robobun, your commit 2f3bf67 has 2 failures in
🧪 To try this PR locally: bunx bun-pr 31399That installs a local version of the PR into your bun-31399 --bun |
|
Found 1 issue this PR may fix:
🤖 Generated with Claude Code |
|
Checked #23181 against this fix: it's most likely a different bug, so I'm not adding a
The symptom overlaps (dynamic imports of modules with errors), but the root cause in #23181 would need its own investigation. |
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
|
Confirmed — this is a duplicate of #30671, which contains the identical stack-array fix for Closing this one in favor of #30671 to keep a single PR for the fix. |
|
|
||
| const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]); | ||
|
|
||
| expect(stderr).not.toContain("AddressSanitizer"); |
There was a problem hiding this comment.
🟡 nit: expect(stderr).not.toContain("AddressSanitizer") is the kind of assertion CLAUDE.md says to avoid ("NEVER write tests that check for no 'panic' or 'uncaught exception' or similar in the test output. These tests will never fail in CI."). Non-ASAN CI builds will never emit this string, and on ASAN builds a UAF aborts the subprocess so expect(exitCode).toBe(0) and expect(stdout).toContain("OK 250") already catch it — this line can be dropped.
Extended reasoning...
What this is
The new regression test asserts expect(stderr).not.toContain("AddressSanitizer") against the subprocess output. The repo's root CLAUDE.md (line 126) explicitly prohibits this style of assertion:
NEVER write tests that check for no "panic" or "uncaught exception" or similar in the test output. These tests will never fail in CI.
Checking for the absence of an AddressSanitizer diagnostic falls squarely into the "or similar" bucket — it is a crash-diagnostic-absence check, just like checking for the absence of panic: or uncaught exception.
Why it is redundant
Walk through the two CI configurations:
-
Non-ASAN build (most CI jobs). The subprocess binary was not compiled with
-fsanitize=address, so it is physically incapable of writingAddressSanitizerto stderr regardless of whether the bug is present. The assertion is a tautology and can never fail. This is exactly the failure mode the CLAUDE.md rule warns about. -
ASAN build with the bug present. ASAN detects the heap-use-after-free, prints its report to stderr, and aborts the process with a non-zero exit code before
console.log("OK", aggregate.errors.length)runs. So:expect(exitCode).toBe(0)fails (process aborted), andexpect(stdout).toContain("OK 250")fails (theOKline was never printed).
The regression is already detected by the two assertions that follow; the stderr check adds no coverage.
A grep across test/ confirms no other test file uses .not.toContain("AddressSanitizer"), so this is not following an established local convention.
Impact
None functionally — the test still passes/fails correctly with this line present. It is purely dead weight that contradicts an explicit repo testing convention.
Fix
Delete line 69. The remaining expect(stdout).toContain("OK 250") and expect(exitCode).toBe(0) fully cover the regression. If a more descriptive failure message on ASAN runners is desired, prefer attaching stderr to the exit-code assertion (e.g. expect(exitCode, stderr).toBe(0)) rather than asserting on the absence of a sanitizer string.
| // Spec caps at 256 (`var errors_stack: [256]JSValue`). The array | ||
| // must stay on the stack like the Zig spec: until | ||
| // `create_aggregate_error` runs, these freshly created | ||
| // BuildMessage/ResolveMessage wrappers have no other GC-visible | ||
| // reference, and `create()` can trigger a collection mid-loop. | ||
| // JSC's conservative scan only covers the stack/registers, so a | ||
| // heap `Vec<JSValue>` would let earlier wrappers be swept (and | ||
| // their native payloads freed) before the AggregateError exists. | ||
| let mut errors_stack = [JSValue::ZERO; 256]; |
There was a problem hiding this comment.
🟣 Heads-up: the same heap-Vec<JSValue> UAF pattern this PR fixes still exists in LogJsc::to_js at src/jsc/lib.rs:2146-2151 — let mut errors_stack: Vec<JSValue> = … followed by a loop pushing msg_to_js() (which calls the same BuildMessage::create/ResolveMessage::create). That path is live via Bun.Transpiler (scanImports/transformSync), web_worker, TOMLObject, JSONCObject, css_internals, and filesystem_router for any input with 2+ log messages. This is pre-existing code the PR doesn't touch, but it's the identical root cause with the identical 3-line fix ([JSValue::ZERO; 256] like src/ast_jsc/lib.rs:50 already does), so it's probably worth applying here too.
Extended reasoning...
What the bug is
This PR fixes a use-after-free in process_fetch_log (src/jsc/VirtualMachine.rs) by replacing a heap Vec<JSValue> with a stack [JSValue; 256], so JSC's conservative stack/register scan keeps freshly-created BuildMessage/ResolveMessage wrappers rooted while later loop iterations call create() (which can trigger GC). The PR's new comment at lines 2561-2568 documents exactly why a heap Vec<JSValue> is unsafe in this pattern.
However, the identical buggy pattern remains in src/jsc/lib.rs:2146-2151, in impl LogJsc for bun_ast::Log → fn to_js:
_ => {
let mut errors_stack: Vec<JSValue> = Vec::with_capacity(count);
for msg in &msgs[0..count] {
errors_stack.push(msg_to_js(msg, global)?);
}
let out = bun_core::ZigString::init(message.as_bytes());
global.create_aggregate_error(&errors_stack, &out)
}The local is even named errors_stack (after the Zig spec's on-stack [256]JSValue), but it is a heap Vec. msg_to_js (lib.rs:2131-2135) dispatches to BuildMessage::create / ResolveMessage::create — the exact same GC-allocating calls that motivated this PR's fix.
Code path that triggers it
LogJsc::to_js is a live path. Verified callers (via use crate::LogJsc + .to_js(...)):
src/runtime/api/JSTranspiler.rs— lines 903, 1029, 1046, 1083, 1398, 1404, 1589, 1595, 1818, 1824 (e.g.transformSync/scanImportson input with multiple parse errors)src/jsc/web_worker.rs:1432, 1680src/runtime/api/TOMLObject.rs:25, 30, 49src/runtime/api/JSONCObject.rs:27src/css_jsc/css_internals.rs:226, 386src/runtime/api/filesystem_router.rs:262, 297, 311, 497, 523
Any of these producing 2+ log messages enters the _ => branch and stages the wrappers in a heap Vec.
Why existing code doesn't prevent it
Quoting this PR's own new comment: "JSC's conservative scan only covers the stack/registers, so a heap Vec<JSValue> would let earlier wrappers be swept (and their native payloads freed) before the AggregateError exists." That reasoning applies verbatim to lib.rs:2146. Until create_aggregate_error runs, the only references to the per-message wrappers live in the Vec's heap buffer; the Vec struct on the stack holds a pointer/len/cap, and conservative scanning does not chase heap pointers.
For contrast, the parallel implementation in src/ast_jsc/lib.rs:49-50 already does this correctly:
// On-stack array: conservative GC stack scan keeps these JSValues alive (see PORTING.md §JSC).
let mut errors_stack: [JSValue; 256] = [JSValue::default(); 256];and the Zig spec at src/ast_jsc/logger_jsc.zig:44 uses var errors_stack: [256]jsc.JSValue. Only the src/jsc/lib.rs port diverged.
Step-by-step proof
Concrete walk-through with new Bun.Transpiler().transformSync(src) where src has 3 parse errors, under BUN_JSC_collectContinuously=true:
- Transpiler hits 3 errors →
log.msgs.len() == 3→JSTranspiler.rscallslog.to_js(global, "..."). to_jsenters the_ =>arm, allocatesVec<JSValue>with capacity 3 — the 24-byte buffer lives on the heap.- Iteration 0:
msg_to_js→BuildMessage::createallocates a JS cell wrapping a 152-byteBox<BuildMessage>payload; the resultingJSValueis written tovec_buf[0]on the heap. No stack slot or register continues to hold it afterpushreturns. - Iteration 1:
msg_to_js→BuildMessage::createallocates again. WithcollectContinuously, this triggers a full GC. The conservative scan walks the native stack and registers; it sees theVec's(ptr, len, cap)triple but does not dereferenceptr, so the wrapper from iteration 0 has zero visible references. It is swept;BuildMessageClass__finalizeruns and frees the 152-byteBox<BuildMessage>. - Iteration 1's wrapper is pushed; iteration 2 likewise (potentially sweeping iteration 1's wrapper).
create_aggregate_error(&errors_stack, …)readserrors_stack[0]— a staleJSValuepointing at a swept cell — and stores it in the AggregateError'serrorsarray.- The caller throws/returns the AggregateError. When user code reads
e.errors[0].message/.position, or when the runtime prints the error, it dereferences the freed 152-byte payload → heap-use-after-free, identical fingerprint to the one this PR fixes.
Impact
Same as the bug this PR fixes: heap UAF reading freed BuildMessage/ResolveMessage payloads when constructing/printing an AggregateError from a multi-error log, reachable from several public APIs (Bun.Transpiler, TOML/JSONC parsing, CSS internals, FileSystemRouter, Workers). It requires a GC to land mid-loop, so it's rare in practice but deterministic under BUN_JSC_collectContinuously=true — exactly the conditions in this PR's new regression test.
How to fix
Apply the same change as this PR:
_ => {
let mut errors_stack = [JSValue::ZERO; 256];
for (i, msg) in msgs[..count].iter().enumerate() {
errors_stack[i] = msg_to_js(msg, global)?;
}
let out = bun_core::ZigString::init(message.as_bytes());
global.create_aggregate_error(&errors_stack[..count], &out)
}This is pre-existing code that the PR does not touch, so I've marked it as such — but since the PR's commit title is "Keep multi-error AggregateError children GC-visible while they are created" and this is a second instance of that exact class with the same 3-line fix, it seems worth folding in here.
What does this PR do?
Fixes a heap-use-after-free found by fuzzing (fingerprint
Address:heap-use-after-free:bun-debug+0x11afa5ac, READ of size 1 at offset 144 of a 152-byte region —BuildMessage.loggedread throughVirtualMachine::print_error_from_maybe_private_datawhile iterating anAggregateError).When a module fails to load with multiple transpiler/resolver errors,
process_fetch_logcreates oneBuildMessage/ResolveMessagewrapper per log message and then wraps them in anAggregateError. The Rust port staged those freshly createdJSValues in a heapVec<JSValue>:Vecare invisible to the GC.BuildMessage::create/ResolveMessage::createallocates JS cells and can trigger a collection, so a GC landing in the middle of the loop could sweep wrappers created in earlier iterations.Box<BuildMessage>payloads, and theAggregateErrorends up holding stale values. Printing it (uncaught exception / unhandled rejection /console.error) or touchingerror.errors[i]afterwards reads the freed payload.The Zig implementation (
VirtualMachine.zigvar errors_stack: [256]JSValue) deliberately kept this staging buffer on the stack so the conservative scan roots the wrappers untilcreateAggregateErrorruns. This PR restores that behavior in the Rust port by using a stack array instead of a heapVec.Reproduced with
BUN_JSC_collectContinuously=true BUN_JSC_forceRAMSize=1000000and a dynamic import of a module with ~250 parse errors: ASAN reports the UAF on the unfixed build (read inprint_error_from_maybe_private_data→Cell<bool>::get, freed byBuildMessageClass__finalize, allocated byprocess_fetch_log→BuildMessage::create); the fixed build is clean.How did you verify your code works?
AggregateError from a module with many build errors survives GC during its creationintest/js/bun/resolve/build-error.test.ts. It imports a module with 250 build errors underBUN_JSC_collectContinuously=true, then inspects and prints the resultingAggregateError. It fails on the unfixed ASAN debug build (AddressSanitizer abort in the subprocess) and passes with this fix.bun bd test test/js/bun/resolve/build-error.test.ts— 3 pass.