blob: clamp stat.size to max_size to avoid @intCast panic in ReadFile #29355
Changes from all commits
25bbc81
d3f16f6
fed93fe
31e2d63
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -340,7 +340,7 @@ pub const ReadFile = struct { | |
| } | ||
|
|
||
| this.could_block = !bun.isRegularFile(stat.mode); | ||
| - this.total_size = @truncate(@as(SizeType, @intCast(@max(@as(i64, @intCast(stat.size)), 0)))); | ||
| + this.total_size = @intCast(@min(@max(stat.size, 0), Blob.max_size)); | ||
|
|
||
| if (stat.size > 0 and !this.could_block) { | ||
| this.size = @min(this.total_size, this.max_length); | ||
|
|
@@ -383,6 +383,7 @@ pub const ReadFile = struct { | |
| if (!this.could_block or (this.size > 0 and this.size != Blob.max_size)) | ||
| this.buffer = std.ArrayListUnmanaged(u8).initCapacity(bun.default_allocator, this.size +| 16) catch |err| { | ||
| this.errno = err; | ||
| + this.system_error = bun.sys.Error.fromCode(bun.sys.E.NOMEM, .read).toSystemError(); | ||
| this.onFinish(); | ||
| return; | ||
| }; | ||
|
Comment on lines
383
to
389
Contributor
🟡 The new OOM system_error at line 386 is created via
Extended reasoning...
What the bug is and how it manifests
When
this.system_error = bun.sys.Error.fromCode(bun.sys.E.NOMEM, .read).toSystemError();
The specific code path that triggers it
Why existing code doesn't prevent it
The
Addressing the refutation
One verifier argued this is intentional because all NOMEM errors in
Impact
OOM errors are rare, and the PR already improves things significantly (previously OOM silently yielded an empty read with no error at all). Missing path makes OOM slightly harder to diagnose in production but is not a correctness issue.
Fix
After setting
this.system_error = bun.sys.Error.fromCode(bun.sys.E.NOMEM, .read).toSystemError();
this.system_error.?.path = if (this.file_store.pathlike == .path)
bun.String.cloneUTF8(this.file_store.pathlike.path.slice())
else
bun.String.empty;
Step-by-step proof
Collaborator
Author
Keeping this minimal and consistent with the existing |
||
|
|
@@ -656,7 +657,7 @@ pub const ReadFileUV = struct { | |
| this.onFinish(); | ||
| return; | ||
| } | ||
| - this.total_size = @truncate(@as(SizeType, @intCast(@max(@as(i64, @intCast(stat.size)), 0)))); | ||
| + this.total_size = @intCast(@min(@max(stat.size, 0), Blob.max_size)); | ||
| this.is_regular_file = bun.isRegularFile(stat.mode); | ||
|
|
||
| log("is_regular_file: {}", .{this.is_regular_file}); | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| import { describe, expect, test } from "bun:test"; | ||
| import { closeSync, openSync } from "fs"; | ||
| import { isWindows, tempDir } from "harness"; | ||
| import { join } from "path"; | ||
|
|
||
| // Reading a Bun.file() backed by a file descriptor goes through | ||
| // ReadFile.runAsync -> getFd (opened_fd already set) -> runAsyncWithFD -> | ||
| // resolveSizeAndLastModified, which derives total_size from fstat. That | ||
| // computation previously used @intCast to u52 guarded by a dead @truncate, | ||
| // so an abnormal fstat size could trip integerOutOfBounds. Triggering that | ||
| // directly requires fstat to report > 4.5 PB which is not achievable here, | ||
| // but these tests lock in the fd-backed ReadFile path that the fuzzer hit. | ||
| describe.skipIf(isWindows)("Bun.file(fd) read", () => { | ||
|
Check warning on line 13 in test/js/bun/util/bun-file-fd-read.test.ts
|
||
|
Contributor
🟡 The three tests in
Extended reasoning...
What the issue is
The three tests in
The specific code
Line 13 of
describe.skipIf(isWindows)("Bun.file(fd) read", () => {
It should be:
describe.concurrent.skipIf(isWindows)("Bun.file(fd) read", () => {
Why the fix is trivial and correct
Each test creates its own
Addressing the refutation
The refuter worried that
Step-by-step proof
|
||
| async function withFd<T>(path: string, fn: (fd: number) => Promise<T>): Promise<T> { | ||
| const fd = openSync(path, "r"); | ||
| try { | ||
| return await fn(fd); | ||
| } finally { | ||
| closeSync(fd); | ||
| } | ||
| } | ||
|
|
||
| test("text() and arrayBuffer() on a regular-file fd return file contents", async () => { | ||
| using dir = tempDir("bun-file-fd-read", { "fd-read.txt": "hello from fd" }); | ||
| const path = join(String(dir), "fd-read.txt"); | ||
|
|
||
| // Each read needs a fresh fd because Bun.file(fd) does not own or rewind | ||
| // the descriptor, and a completed read leaves it positioned at EOF. | ||
| expect(await withFd(path, fd => Bun.file(fd).text())).toBe("hello from fd"); | ||
|
|
||
| const buf = await withFd(path, fd => Bun.file(fd).arrayBuffer()); | ||
| expect(new Uint8Array(buf)).toEqual(new TextEncoder().encode("hello from fd")); | ||
| }); | ||
|
|
||
| test("slice() with an end beyond the real size reads the actual file contents", async () => { | ||
| using dir = tempDir("bun-file-fd-read", { "fd-slice.txt": "0123456789" }); | ||
| const path = join(String(dir), "fd-slice.txt"); | ||
|
|
||
| // total_size should come from fstat (10), not from the requested slice | ||
| // end, so the initial buffer allocation stays small. | ||
| expect(await withFd(path, fd => Bun.file(fd).slice(0, Number.MAX_SAFE_INTEGER).text())).toBe("0123456789"); | ||
| expect(await withFd(path, fd => Bun.file(fd).slice(2, 5).text())).toBe("234"); | ||
| }); | ||
|
|
||
| test("empty regular file via fd resolves with empty content", async () => { | ||
| using dir = tempDir("bun-file-fd-read", { "fd-empty.txt": "" }); | ||
| const path = join(String(dir), "fd-empty.txt"); | ||
|
|
||
| expect(await withFd(path, fd => Bun.file(fd).text())).toBe(""); | ||
| expect((await withFd(path, fd => Bun.file(fd).arrayBuffer())).byteLength).toBe(0); | ||
| }); | ||
| }); | ||
🟣 The PR fixes the @truncate/@intCast panic in read_file.zig by clamping stat.size to Blob.max_size, but the analogous resolveFileStat function in Blob.zig (lines ~3298 and 3312) still uses the old @truncate pattern. For a file with stat.size just above 2^52 (e.g. 2^52+1), @truncate silently discards the high bits, yielding max_size=1 instead of Blob.max_size. This is a pre-existing issue in adjacent code that the PR did not touch, but it belongs to the same class of silent size-corruption bug the PR was written to address.
Extended reasoning...
What the bug is and how it manifests
In Blob.zig, the resolveFileStat function sets store.data.file.max_size via two expressions that use @truncate to narrow stat.size (first cast to u64) to SizeType (u52):
Line 3298 (path-based):
store.data.file.max_size = @truncate(@as(u64, @intCast(@max(stat.size, 0))))
Line 3312 (fd-based):
store.data.file.max_size = @as(SizeType, @truncate(@as(u64, @intCast(@max(stat.size, 0)))))
Unlike the @intCast in read_file.zig (which panics on overflow), @truncate silently drops the high 12 bits when converting u64 to u52. For any stat.size greater than maxInt(u52) = 4503599627370495 (~4.5 PB), the resulting value is not Blob.max_size but rather the low 52 bits of the actual size.
The specific code path that triggers it
A file with stat.size = 2^52 + 1 (i.e., 4503599627370497) would produce @truncate result of 1 (the low 52 bits), setting store.data.file.max_size = 1. Downstream code in resolveSize checks max_size != Blob.max_size and uses it as the store_size, so allocation and routing decisions would use a 1-byte max_size for a multi-petabyte file.
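The wrap-around described here can be checked with a small TypeScript model. This is only a sketch: `BigInt.asUintN(52, x)` stands in for Zig's `@truncate` to a 52-bit unsigned integer, and the names `U52_BITS` and `truncateToU52` are made up for illustration, not taken from Bun.

```typescript
// Hypothetical model of Zig's @truncate from u64 to u52 (not Bun code).
const U52_BITS = 52;

// BigInt.asUintN keeps only the low `bits` bits of the value,
// which mirrors what @truncate does when narrowing an unsigned integer.
function truncateToU52(size: bigint): bigint {
  return BigInt.asUintN(U52_BITS, size);
}

const justOverLimit = 2n ** 52n + 1n; // 4503599627370497

console.log(truncateToU52(justOverLimit)); // prints 1n: the high bits are gone
console.log(truncateToU52(1024n)); // prints 1024n: small sizes are unchanged
```

Under this model, any size that exceeds the 52-bit range silently collapses to its low bits rather than saturating at the maximum.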
Why existing code doesn't prevent it
@truncate is intentionally not safety-checked in Zig: it runs in both release and debug modes without panicking. Its only effect is to drop the high bits. There is no clamping logic like the PR's @intCast(@min(@max(stat.size, 0), Blob.max_size)) pattern.
Addressing the refutation
One verifier correctly noted that Blob.zig is not modified by this PR, read_file.zig does not call resolveFileStat, and the two code paths share no callers. This is accurate — the bug is genuinely pre-existing and the PR does not increase its surface area. However, the PR's stated purpose is specifically to fix this class of issue (stat.size clamping to Blob.max_size), and the PR author clearly recognized the pattern and applied the fix in read_file.zig. The same fix pattern should be applied to resolveFileStat for completeness. The pre_existing severity designation is appropriate.
What the impact would be
Any file whose stat.size is just above a multiple of 2^52 (e.g. 2^52+1, 2^53+1, etc.) would get an incorrectly small max_size. Files above 4.5 PB are essentially nonexistent on real hardware, making this a theoretical issue for the foreseeable future. But as filesystems and storage technology evolve, and especially in virtualized/mocked environments, this silent truncation is a latent data-corruption bug.
How to fix it
Apply the same clamping pattern used in the PR fix:
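As a rough illustration of what that pattern computes, here is a TypeScript model of `@intCast(@min(@max(stat.size, 0), Blob.max_size))` from the PR's diff. It assumes `Blob.max_size` equals the largest u52 value (2^52 - 1), which this page does not state; `MAX_SIZE` and `clampSize` are illustrative names, not Bun identifiers.

```typescript
// Hypothetical model of the PR's clamp (not Bun code).
// Assumption: Blob.max_size is the largest value a u52 can hold.
const MAX_SIZE = 2n ** 52n - 1n;

function clampSize(statSize: bigint): bigint {
  // @max(stat.size, 0): negative sizes become 0.
  const nonNegative = statSize < 0n ? 0n : statSize;
  // @min(..., Blob.max_size): oversized values saturate at the maximum.
  return nonNegative > MAX_SIZE ? MAX_SIZE : nonNegative;
}

console.log(clampSize(-1n)); // prints 0n
console.log(clampSize(10n)); // prints 10n
console.log(clampSize(2n ** 52n + 1n) === MAX_SIZE); // prints true
```

Because the value is bounded before the narrowing cast, the cast can never see an out-of-range input, so an oversized stat.size saturates instead of wrapping or panicking.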
Step-by-step proof
Agreed that resolveFileStat in Blob.zig has a similar shape, but it uses @truncate (silent wrap, no panic) rather than @intCast, so it's not part of this crash. Keeping this PR scoped to the fuzzer-found panic in ReadFile; the Blob.zig clamping can be a separate follow-up if desired.