blob: avoid size overflow when allocating ReadFile buffer#29008
blob: avoid size overflow when allocating ReadFile buffer#29008robobun wants to merge 7 commits into
Conversation
|
Updated 1:34 PM PT - Apr 8th, 2026
❌ @robobun, your commit bba2e5d has 3 failures in
🧪 To try this PR locally: bunx bun-pr 29008That installs a local version of the PR into your bun-29008 --bun |
WalkthroughAdjusted size handling and buffer allocation in the blob file reader: stat.size is clamped with Changes
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
LGTM — the guard reordering and saturating add correctly fix the sentinel overflow.
Extended reasoning...
Overview
The PR modifies two lines in src/bun.js/webcore/blob/read_file.zig and adds a regression test. The change tightens the allocation guard in ReadFile.runAsyncWithFD so that this.size == Blob.max_size (the unknown-size sentinel) can no longer slip through to this.size + 16, and adds a saturating add (+|) as belt-and-braces.
Security risks
None. This is a crash fix for an integer overflow that causes SIGILL in debug builds. No auth, crypto, or permission-sensitive code is touched.
Level of scrutiny
Low. The diff is 2 logic lines plus a test. The fix is mechanically correct: reordering the boolean short-circuit to check size \!= max_size first prevents the sentinel from reaching the arithmetic, and +| prevents overflow even if the guard were somehow bypassed. Easy to verify by inspection.
Other factors
Two bugs were reported by the hunting system. The @intCast issue (pre-existing, unrelated to this PR's changes) is a valid separate concern that predates the PR and does not affect the correctness of this fix. The test weakness (nit — a normal file can never leave size as the sentinel) is accurate but does not undermine the safety of the Zig change itself; the fix is correct regardless of test coverage. No outstanding reviewer comments exist in the timeline.
|
|
||
| it("reading a Bun.file whose size is still the max_size sentinel doesn't crash", async () => { | ||
| const filename = tmpdir() + "/bun.test.max-size-sentinel.txt"; | ||
| await Bun.write(filename, "hello world"); | ||
| const file = Bun.file(filename); | ||
| // Do not touch file.size — leave the Blob.size as the max_size sentinel. | ||
| // This hits the ReadFile.runAsyncWithFD path that previously could | ||
| // integer-overflow `this.size + 16` when resolveSize left the size as | ||
| // the sentinel for certain fd cases. | ||
| expect(await file.text()).toBe("hello world"); | ||
| }); |
There was a problem hiding this comment.
🟡 The new regression test (, lines 12–22) does not exercise the sentinel overflow path it claims to guard against — it passes identically against the old unfixed code, providing zero regression protection. For the 11-byte file written by the test, sets (not the sentinel), so the old guard evaluates true and allocates bytes with no overflow. To meaningfully cover the overflow scenario the test would need a code path where remains after stat resolution, which requires a ~4 PB file or a special fd type. The Zig fix itself is correct; only the test is misleading.
Extended reasoning...
What the test claims vs what it actually tests
The test comment says it guards against the integer overflow of this.size + 16 in ReadFile.runAsyncWithFD when the size remains the Blob.max_size sentinel. The overflow is real — Blob.max_size = std.math.maxInt(u52) = 2^52-1 ≈ 4.5 PB, and this.size +| 16 in a u52 without saturating semantics would trap in debug mode at that value.
Why the test does not trigger the overflow path
The overflow is only possible when ReadFile.size == Blob.max_size after resolveSizeAndLastModified runs. Looking at that function, for a regular file with stat.size > 0:
this.size = @min(this.total_size, this.max_length);For the 11-byte test file: this.total_size = 11, this.max_length = Blob.max_size (sentinel from the Blob), so this.size = @min(11, Blob.max_size) = 11. The sentinel is never reached.
Step-by-step proof that the test passes against the old (unfixed) code
Old guard: if (\!this.could_block or (this.size > 0 and this.size \!= Blob.max_size))
For the test's 11-byte regular file after resolveSizeAndLastModified:
this.could_block = false(regular file)this.size = 11\!this.could_block = true→ entire OR is TRUE → condition is TRUE- Allocates
initCapacity(11 + 16)= 27 bytes — no overflow, no crash
New guard: if (this.size \!= Blob.max_size and (\!this.could_block or this.size > 0))
this.size = 11 \!= Blob.max_size→ true\!this.could_block = true→ condition is TRUE- Allocates
initCapacity(11 +| 16)= 27 bytes — same result
The test produces identical behavior on old and new code.
What a meaningful regression test would require
To exercise the overflow path, ReadFile.size must equal Blob.max_size after stat. The only way this can happen for regular files is @min(total_size, max_length) == Blob.max_size, which requires stat.size >= 2^52-1 ≈ 4 PB — not feasible in a test. For non-regular files (pipes, special fds) when stat.size == 0 and could_block, the code sets this.size = if (this.max_length == Blob.max_size) 4096 else this.max_length — also not Blob.max_size. Triggering the original crash likely required an unusual fd whose stat reported a very large size.
Severity and impact
The underlying Zig fix (reordering the guard to check size \!= Blob.max_size first, plus using saturating add +|) is correct and sound. The test itself is not wrong — it still provides basic smoke coverage for reading a file. However, the comment is actively misleading about what path is exercised, and the test provides no protection against a future regression of the specific overflow bug. Severity: nit.
171046a to
96e9f93
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/js/bun/util/bun-file-read.test.ts`:
- Around line 13-22: The test currently uses tmpdir() + filename for temp file
creation; switch to using tempDir from "harness" with its using helper to create
an isolated temporary directory and place the file there to ensure automatic
cleanup and test isolation; update the test to obtain a tempDir via tempDir(...)
(the harness helper) and write/read the file using Bun.write and Bun.file as
before (keep the existing assertions calling file.text()) so only the temp
directory creation/cleanup mechanism changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6d9785a3-b7e4-4b03-9158-109237e9d810
📥 Commits
Reviewing files that changed from the base of the PR and between 1bde3c51d9ded402c9923b5c5fb9d509466bbbfc and 96e9f9329028b4dede54a75adb0036031091c703.
📒 Files selected for processing (2)
src/bun.js/webcore/blob/read_file.zigtest/js/bun/util/bun-file-read.test.ts
| it("reading a Bun.file without touching .size first does not crash", async () => { | ||
| // Smoke test for the ReadFile.runAsyncWithFD path: reading a Bun.file | ||
| // without resolving its size first previously could hit an integer | ||
| // overflow at `this.size + 16` and a checked `@intCast` in | ||
| // `resolveSizeAndLastModified` for files whose stat size overflows u52. | ||
| const filename = tmpdir() + "/bun.test.max-size-sentinel.txt"; | ||
| await Bun.write(filename, "hello world"); | ||
| const file = Bun.file(filename); | ||
| expect(await file.text()).toBe("hello world"); | ||
| }); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Good regression test for the overflow fix.
The test correctly exercises the ReadFile.runAsyncWithFD path by calling .text() without first accessing .size. The inline comment clearly documents the regression scenario.
Consider using tempDir from "harness" with using for automatic cleanup and test isolation, though this follows the existing pattern in the file:
♻️ Optional: Use tempDir from harness
+import { tempDir } from "harness";
import { expect, it } from "bun:test";
-import { tmpdir } from "node:os";
it("reading a Bun.file without touching .size first does not crash", async () => {
- const filename = tmpdir() + "/bun.test.max-size-sentinel.txt";
+ using dir = tempDir();
+ const filename = dir + "/bun.test.max-size-sentinel.txt";
await Bun.write(filename, "hello world");
const file = Bun.file(filename);
expect(await file.text()).toBe("hello world");
});As per coding guidelines: "Use tempDir from "harness" to create temporary directories".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| it("reading a Bun.file without touching .size first does not crash", async () => { | |
| // Smoke test for the ReadFile.runAsyncWithFD path: reading a Bun.file | |
| // without resolving its size first previously could hit an integer | |
| // overflow at `this.size + 16` and a checked `@intCast` in | |
| // `resolveSizeAndLastModified` for files whose stat size overflows u52. | |
| const filename = tmpdir() + "/bun.test.max-size-sentinel.txt"; | |
| await Bun.write(filename, "hello world"); | |
| const file = Bun.file(filename); | |
| expect(await file.text()).toBe("hello world"); | |
| }); | |
| import { tempDir } from "harness"; | |
| import { expect, it } from "bun:test"; | |
| it("reading a Bun.file without touching .size first does not crash", async () => { | |
| // Smoke test for the ReadFile.runAsyncWithFD path: reading a Bun.file | |
| // without resolving its size first previously could hit an integer | |
| // overflow at `this.size + 16` and a checked `@intCast` in | |
| // `resolveSizeAndLastModified` for files whose stat size overflows u52. | |
| using dir = tempDir(); | |
| const filename = dir + "/bun.test.max-size-sentinel.txt"; | |
| await Bun.write(filename, "hello world"); | |
| const file = Bun.file(filename); | |
| expect(await file.text()).toBe("hello world"); | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/js/bun/util/bun-file-read.test.ts` around lines 13 - 22, The test
currently uses tmpdir() + filename for temp file creation; switch to using
tempDir from "harness" with its using helper to create an isolated temporary
directory and place the file there to ensure automatic cleanup and test
isolation; update the test to obtain a tempDir via tempDir(...) (the harness
helper) and write/read the file using Bun.write and Bun.file as before (keep the
existing assertions calling file.text()) so only the temp directory
creation/cleanup mechanism changes.
96e9f93 to
219b19e
Compare
There was a problem hiding this comment.
LGTM — the binary artifacts are gone, the widening is correct, and the allocation guard properly blocks the sentinel path before .
Extended reasoning...
Overview
The PR modifies two files: (core crash fix) and (regression test). Three changes land in the Zig file: (1) the stat-size cast chain is rewritten to widen before truncating to , eliminating the previously-noted panic for files with ; (2) the heap pre-allocation guard is reordered so is checked first, preventing the sentinel from reaching the expression; (3) an explicit cast ensures the addition is in pointer-width arithmetic.
Security risks
No security-sensitive code paths are touched. The change affects only local file I/O buffer allocation logic. There is no auth, crypto, or permission code involved.
Level of scrutiny
Low-to-medium. This is a targeted crash fix for a well-scoped arithmetic overflow. The sentinel guard reordering is mechanically correct (short-circuit evaluation ensures the overflow expression is never reached when size equals max_size). The cast widening follows the safe pattern already used elsewhere in the codebase.
Other factors
Previous concerns from my prior reviews have been addressed: the binary ELF test artifacts are no longer in the diff, and the widening issue is now fixed in this very PR. The only remaining nit — using from instead of from — was already noted in inline comments by both myself and coderabbitai and is a minor style issue that does not block correctness.
There was a problem hiding this comment.
LGTM — all prior feedback addressed.
Extended reasoning...
Overview
This PR fixes a Fuzzilli-found SIGILL in ReadFile.runAsyncWithFD where this.size + 16 could trap in debug builds when this.size == Blob.max_size (the unknown-size sentinel). Changes touch src/bun.js/webcore/blob/read_file.zig and a new test in test/js/bun/util/bun-file-read.test.ts.
Security risks
No security risks. This is a defensive fix: it prevents an integer overflow crash and tightens a size guard on heap allocation. No auth, crypto, or permissions code touched.
Level of scrutiny
Low-to-medium. The Zig changes are surgical and well-understood: reordering a boolean guard, using saturating add +|, widening i64→u64 before truncating to u52. All three approaches are standard Zig idioms for safe integer handling. Prior reviews raised binary artifacts, a pre-existing @intCast overflow, and a tempDir convention violation — all of which were addressed in follow-up commits.
Other factors
The one remaining open inline comment (test not perfectly exercising the sentinel path) was already characterized as a nit in the original comment — it's a comment accuracy issue, not a code correctness issue. The underlying fix is sound regardless of what the test exercises.
There was a problem hiding this comment.
LGTM — previous issues (accidental binary artifacts, tmpdir vs tempDir) addressed; the Zig fix is correct.
Extended reasoning...
Overview
Targeted fix in src/bun.js/webcore/blob/read_file.zig for a Fuzzilli-found SIGILL. Two changes: (1) resolveSizeAndLastModified and ReadFileUV.onFileInitialStat now compute total_size via @truncate(@as(u64, ...)) — widening i64→u64 first, then truncating to u52, eliminating the checked-cast panic for sizes above max(u52); (2) the allocation guard is reordered so size \!= Blob.max_size is evaluated before the allocation, preventing this.size + 16 from wrapping. The test adds basic smoke coverage and now correctly uses tempDir from harness.
Security risks
None. The fix closes a crash path, not a security boundary. No auth, crypto, or permission code is touched.
Level of scrutiny
Low. The change is a narrow arithmetic-safety fix with a clear causal chain from the Fuzzilli report. Both modified files have no CODEOWNER entries in the standard sense; the logic is self-contained.
Other factors
The two actionable issues I raised in earlier review rounds — accidentally committed ELF binaries and use of node:os tmpdir instead of the harness helper — are both resolved in the current revision. The remaining observations (test not exercising the exact 4-PB sentinel path, pre-existing @intCast on a separate code path) are either fundamentally untestable or pre-existing; neither blocks this fix.
| if (this.size != Blob.max_size and (!this.could_block or this.size > 0)) | ||
| this.buffer = std.ArrayListUnmanaged(u8).initCapacity(bun.default_allocator, this.size + 16) catch |err| { | ||
| this.errno = err; | ||
| this.onFinish(); |
There was a problem hiding this comment.
🟣 Pre-existing bug: when fails with at line 387, the catch block sets but never sets . Because exclusively checks to decide whether an error occurred, the success branch fires and delivers an empty to the caller — so resolves to an empty string instead of rejecting with an OOM error.
Extended reasoning...
What the bug is and how it manifests
In ReadFile.runAsyncWithFD (read_file.zig lines 387–391), when std.ArrayListUnmanaged(u8).initCapacity fails with OutOfMemory, the catch block is:
this.buffer = std.ArrayListUnmanaged(u8).initCapacity(bun.default_allocator, this.size + 16) catch |err| {
this.errno = err;
this.onFinish();
return;
};This sets this.errno but never sets this.system_error. this.buffer remains the zero-initialized .{} (items = []).
The specific code path that triggers it
The then() function (line ~265) exclusively checks this.system_error to determine whether to reject:
const system_error = this.system_error;
// ...
if (system_error) |err| {
cb(cb_ctx, ReadFileResultType{ .err = err });
return;
}
cb(cb_ctx, .{ .result = .{ .buf = buf, .total_size = total_size, .is_temporary = true } });It never checks this.errno. Since system_error is null, then() takes the success branch and calls cb with buf = [] — an empty buffer.
Why existing code doesn't prevent it
Every other error path that populates this.errno also sets this.system_error. The ReadFileUV counterpart sets both fields in every OOM catch block (e.g. lines 702–704, 731–733, 764–766, 791–793). The resolveSizeAndLastModified fstat-failure path (lines 315–318) also sets both. This is the only spot in the POSIX ReadFile path where errno is set without a matching system_error assignment.
What the impact would be
Under memory pressure (containers with strict limits, very large files), Bun.file(path).text() silently resolves to an empty string "" rather than rejecting. Callers have no way to distinguish a genuine empty file from a failed read.
Step-by-step proof
- Caller invokes
Bun.file(path).text()on a large file with the system near its memory limit. resolveSizeAndLastModifiedsetsthis.sizeto e.g. 1 GB.- Guard condition
this.size \!= Blob.max_sizeis true →initCapacity(1_000_000_016)is attempted. - Allocator returns
error.OutOfMemory; catch block executes:this.errno = error.OutOfMemory; this.onFinish(); return; this.system_errorremainsnull;this.buffer.itemsremains[].then()runs:const system_error = this.system_error;→null.if (system_error) |err|is false → success branch fires.- Caller receives
{ .result = .{ .buf = [], .total_size = ..., .is_temporary = true } }→ empty string.
How to fix it
Add this.system_error = bun.sys.Error.fromCode(bun.sys.E.NOMEM, .read).toSystemError(); in the catch block, mirroring the ReadFileUV pattern and the resolveSizeAndLastModified fstat-failure pattern. This is pre-existing and not introduced by this PR — the PR only changes the guard condition at line 386.
|
The fix at commit e85531b is: ```zig
``` Two changes:
This follows the same |
|
Real CI status on commit e85531b (my widening fix):
The widening pattern |
There was a problem hiding this comment.
LGTM — the u52 widening fix and guard reorder are both correct, and the prior issues are addressed.
Extended reasoning...
Overview
The PR modifies src/bun.js/webcore/blob/read_file.zig to fix a SIGILL in ReadFile.runAsyncWithFD triggered by Fuzzilli. Two changes are made: (1) the allocation guard is reordered so this.size \!= Blob.max_size is checked before dereferencing the size, and (2) the arithmetic this.size + 16 is widened to @as(usize, @intCast(this.size)) + 16 to prevent u52 overflow for values in [max_size-15, max_size-1]. A smoke test is added using tempDir from harness.
Security risks
No security-sensitive code paths (auth, crypto, permissions) are touched. The fix closes an integer overflow that could cause a crash in debug/ReleaseSafe builds — a stability issue, not a security escalation.
Level of scrutiny
Low-to-medium. The change is narrow and mechanical: one guard reorder and one type widening at a single allocation site. The pre-existing issues I flagged (the @intCast in resolveSizeAndLastModified and the OOM-does-not-set-system_error path) are pre-existing and out of scope for this fix; they are noted in inline comments for follow-up.
Other factors
The binary artifact issue I raised earlier is not present in the final diff. The u52 near-sentinel overflow I flagged in my 17:26 review is addressed by the @as(usize, @intCast(this.size)) + 16 widening in commit e85531b. The test now correctly uses tempDir from harness. No outstanding concerns remain on the in-scope changes.
|
Superseded by #29284, which fixes the root cause (caps the initial buffer for non-regular files at 4096 so the read actually succeeds instead of rejecting with ENOMEM) and also covers the Windows |
What does this PR do?
Fuzzilli hit a SIGILL in
ReadFile.runAsyncWithFDatthis.size + 16(src/bun.js/webcore/blob/read_file.zig:387).this.sizeis au52(Blob.SizeType) andBlob.max_sizeis the sentinel for an unknown size. The previous allocation guard allowed the regular-file branch (!this.could_block) to fall through even whenthis.size == Blob.max_size, sothis.size + 16could trap in debug builds.The fix tightens the guard so we don't allocate when the size is still the sentinel, and switches to a saturating add (
+|) as a belt-and-braces protection against any remaining near-overflow edge cases. The OOM path is still exercised via the existingcatch |err|.Fingerprint:
6c4f7e5905a658b6How did you verify your code works?
test/js/bun/util/bun-file-read.test.tsthat reads aBun.filewithout first touching.size, exercising themax_sizesentinel path throughReadFile.runAsyncWithFD.bun bd test test/js/bun/util/bun-file-read.test.ts(debug + ASAN) — both tests pass.bun bd test test/js/web/fetch/blob.test.tsandbun bd test test/js/bun/util/bun-file.test.ts— all existing blob/file tests still pass.