Skip to content

blob: make StoreRef::data_mut unsafe, drop Sync#30812

Open
robobun wants to merge 19 commits into
mainfrom
farm/4a1dfa50/storeref-not-sync
Open

blob: make StoreRef::data_mut unsafe, drop Sync#30812
robobun wants to merge 19 commits into
mainfrom
farm/4a1dfa50/storeref-not-sync

Conversation

@robobun

@robobun robobun commented May 15, 2026

Copy link
Copy Markdown
Collaborator

Closes #30800.

Bug

StoreRef::data_mut(&self) -> &mut Data minted a &mut from a shared
&self via a raw-pointer deref:

pub fn data_mut(&self) -> &mut Data {
    unsafe { &mut (*self.as_ptr()).data }
}

unsafe impl Sync for StoreRef {}

With Sync, two threads can share a &StoreRef and each call data_mut()
— two &mut Data to the same backing allocation, immediate UB through a
safe API. Even single-threaded, two back-to-back data_mut() calls on the
same &StoreRef alias two &mut Data.

Fix

Four complementary changes:

  1. data_mut is unsafe fn with a precondition of exclusivity (no
    other &/&mut to the pointee live for the borrow's duration, on
    this thread or any other). Every call site (~22) gets a SAFETY note
    describing its single-owner context.
  2. unsafe impl Sync for StoreRef removed. Send is retained: the
    intrusive ThreadSafeRefCount makes moving the handle between threads
    sound, and the threadpool tasks that carry a StoreRef require only
    Send, not Sync.
  3. File.last_modified is now AtomicU64 (closes the race flagged
    in review). do_read_file clones the backing StoreRef into each
    ReadFile task it spawns, so N concurrent file.bytes() calls
    schedule N workers that all observe the same Store allocation.
    Previously the workers all wrote file.last_modified on that shared
    Store concurrently — technically UB even though every writer stores
    the same fstat-derived mtime so the race is idempotent. The
    AtomicU64 + shared-borrow (&store.data) path on worker threads
    closes the observable data race; the JS-thread read in
    get_last_modified uses the same atomic load (Relaxed).
  4. Compile-time trip-wire. A __store_ref_not_sync module uses the
    auto-trait-ambiguity pattern (same trick as
    src/runtime/shell/subproc.rs __pipe_reader_thread_confined): if
    StoreRef ever regains Sync, two blanket NotSyncCheck impls
    apply and _NOT_SYNC fails to compile with "conflicting impls".

Remaining narrow aliasing (documented)

resolve_file_stat (called from JS-thread get_last_modified /
resolve_size / resolved_size only on the first access before a
worker has run) still materializes &mut File to write
max_size/mode/seekable. Two worker-pool &File borrowers can
overlap it: ReadFile tasks, which only touch the AtomicU64
last_modified (the atomic closes that observable race), and POSIX
WriteFile::run_with_fd, which reads the non-atomic seekable/mode
on fd-backed destinations — a pre-existing data-level race this PR
documents but does not close. Both overlaps are benign in practice:
every overlapping writer produces the same stat-derived value
(idempotent) and seekable/mode/max_size are small Copy types
written once per Store lifetime. Converting the rest of File to
interior-mutable fields would close both and is legitimate follow-up
work but out of scope for this soundness-audit PR.

Verification

  • cargo check --workspace — clean, no new warnings.
  • bun bd test test/js/web/fetch/blob*.test.ts test/js/web/streams/readable-stream-blob-consumed.test.ts test/js/bun/globals.test.js
    — 44 blob + 20 globals + 1 streams tests pass under ASAN.
  • New test Bun.file().bytes() is safe under high concurrency fans out
    16 files × 8 concurrent reads, each cloning the backing StoreRef
    into an independent ReadFile task — the primary
    worker-thread-reaches-into-shared-Store path — and asserts every
    task returns the full uncorrupted file bytes under bun bd (ASAN).

Regression guards

Like the sibling soundness PRs (#30773 etc.), this is a Rust-API-soundness
change with no user-visible behavioral delta — the existing call sites
produced identical output before and after (every writer stores the same
idempotent mtime, whether through a plain assignment or a Relaxed atomic
store), so a runtime fail→pass demonstration around the JS-visible API is
not possible. The regression guards are:

  • the compile-time __store_ref_not_sync trip-wire in webcore_types.rs
    (any future unsafe impl Sync for StoreRef fails the build with
    conflicting impls);
  • the unsafe fn data_mut signature (forces new call sites to justify
    their exclusivity claim in writing);
  • test/internal/storeref-not-sync.test.ts, a source-level guard in the
    style of test/internal/dead-code-escapes.test.ts that asserts the
    three invariants on the text of webcore_types.rs (no
    unsafe impl Sync for StoreRef, data_mut spelled pub unsafe fn,
    trip-wire present). All three assertions fail against the pre-fix
    source and pass with this PR, giving a readable suite-level failure
    alongside the trip-wire's rustc error.

Separately, bun bd bootstrap in the gate container trips on a
pre-existing bake-codegen failure related to #30679 (the bootstrap
release-bun is older than the JSON-lexer fix that lets define: accept
raw CSS). A tiny JSON.stringify wrapper on the OVERLAY_CSS value is
included as a standalone commit to unblock bun bd in stale-bootstrap
environments; it's semantically identical to the current form once the
lexer fix propagates but works with both old and new release-bun
snapshots.

Rebase on main (conflict resolution)

Rebased onto main; one conflict in src/runtime/webcore/Body.rs. Main
had refactored the two inlined body-consumer content-type sites into a
shared set_blob_content_type(blob, mime_type, allocated) helper (which
internally calls blob_store_mut). Took main's helper at both call sites
and re-applied this PR's core change — making blob_store_mut an
unsafe fn. set_blob_content_type itself is also now an unsafe fn
with the same exclusivity precondition (a safe wrapper can't discharge a
caller-context SAFETY claim), so its two call sites (Body.rs:1246, 2238)
write unsafe { set_blob_content_type(…) } + SAFETY, matching the two
standalone MimeType::TEXT sites. No behavioral change; body.test.ts
(346 pass), body-clone.test.ts (25 pass) and the concurrency guard all
green under bun bd.

Rebased again after main's Delete PathString (#31579): the
structured-clone deserialization site in Blob.rs
(SerializeTag::Bytes) conflicted because main switched
Bytes.stored_name assignment to name.into_boxed_slice() while this
PR had wrapped the data_mut() there in unsafe + used the now-removed
PathString::init_owned. Kept this PR's unsafe { store.data_mut() } +
SAFETY comment and took main's into_boxed_slice(). Verified with
structured-clone-blob-file.test.ts + blob-file-name-ownership.test.ts
(37 pass under bun bd/ASAN; the ownership test is a 4000-iteration
GC-stress round-trip that needs a generous timeout under ASAN).

@coderabbitai

coderabbitai Bot commented May 15, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

This PR makes StoreRef::data_mut unsafe, removes Sync, replaces File::last_modified with AtomicU64, updates blob/file call sites to use shared borrows or scoped unsafe with atomic stores, and adds a concurrency test; it also stringifies OVERLAY_CSS to avoid older build-cache JSON issues.

Changes

StoreRef Safety and Atomic Timestamp Fix

Layer / File(s) Summary
StoreRef safety contract and File atomic timestamp
src/jsc/webcore_types.rs
StoreRef::data_mut becomes unsafe fn with expanded exclusivity/aliasing preconditions. unsafe impl Sync is removed (keeping Send) with a compile-time trip-wire. File::last_modified changes from JSTimeType to AtomicU64; Default and Clone snapshot via Ordering::Relaxed.
Blob access: shared borrows, scoped unsafe, atomics
src/runtime/webcore/Blob.rs
Read-only paths switch to shared &store.data. Mutation paths snapshot tag/need from shared borrows, release before calling resolve_file_stat, then re-borrow. Sites that mutate bytes or files use tightly-scoped unsafe { store.data_mut() }. last_modified updates use atomic .store(..., Relaxed).
ReadFile and FileReader borrow-safety changes
src/runtime/webcore/FileReader.rs, src/runtime/webcore/blob/read_file.rs
FileReader::on_start clones File locally for Lazy::open_file_blob to avoid mutable aliasing. ReadFile::resolve_size_and_last_modified and ReadFileUV::on_file_initial_stat use shared &store.data and update last_modified via atomic .store(..., Relaxed).
CopyFileWindows unsafe borrowing for fd preparation
src/runtime/webcore/blob/copy_file.rs
prepare_read_write_loop wraps destination and source file_store.data_mut() in unsafe blocks with SAFETY comments, passing .as_file_mut().pathlike into prepare_pathlike while preserving existing error handling and mkdirp behavior.
Bytes-backed mutations via scoped unsafe
src/runtime/webcore/Blob.rs (multiple sites)
shared_view_raw, structured-clone deserialization, Blob__setAsFile, jsdom_file_construct_, S3BlobDownloadTask::on_s3_download_resolved, and Any::to_internal_blob_if_possible switch to narrowly-scoped unsafe { store.data_mut() } for stored_name/bytes installs.
resolve_file_stat and write_file_internal atomics
src/runtime/webcore/Blob.rs
resolve_file_stat obtains mutable File via unsafe { store.data_mut() }.as_file_mut() and updates last_modified using atomic store(stat_to_js_mtime(...), Relaxed). write_file_internal resets last_modified via atomic store.
Concurrent file bytes access safety test
test/js/web/fetch/blob-write.test.ts
New regression test Bun.file().bytes() is safe under high concurrency creates 16 temp files and runs 8 overlapping file.bytes() reads per file, asserting decoded bytes match expected content.
OVERLAY_CSS JSON stringification for build-cache compatibility
src/codegen/bake-codegen.ts
OVERLAY_CSS in the Bun build define mapping is wrapped with JSON.stringify() to avoid JSON lexing/build-cache failures in older Bun revisions when raw CSS begins with unquoted *{…}.

Suggested reviewers

  • Jarred-Sumner
  • RiskyMH
  • dylan-conway
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title "blob: make StoreRef::data_mut unsafe, drop Sync" directly and clearly describes the main changes: making data_mut unsafe and removing the Sync implementation.
Linked Issues check ✅ Passed The PR fully addresses all coding requirements from #30800: (1) makes data_mut unsafe with exclusivity precondition and ~22 call site SAFETY notes; (2) removes unsafe impl Sync while retaining Send; (3) converts File.last_modified to AtomicU64 with Relaxed atomic operations; (4) adds compile-time trip-wire to prevent Sync reintroduction.
Out of Scope Changes check ✅ Passed All code changes align with the PR objectives: core soundness fixes (StoreRef/File.last_modified), test additions for concurrency safety, and a minimal bake-codegen JSON.stringify wrapper to unblock bootstrap—no unrelated modifications detected.
Description check ✅ Passed The pull request description is comprehensive and well-structured, covering the bug, fix, verification, and regression guards.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@robobun

robobun commented May 15, 2026

Copy link
Copy Markdown
Collaborator Author

@github-actions

Copy link
Copy Markdown
Contributor

This PR may be a duplicate of:

  1. fix(StoreRef): remove unsound Sync impl #30805 - Also closes StoreRef Sync is unsound: data_mut(&self) enables mutable aliasing via safe API #30800 by removing the unsound Sync impl on StoreRef due to data_mut(&self) aliasing

🤖 Generated with Claude Code

Comment thread src/runtime/webcore/blob/read_file.rs Outdated
Comment thread src/runtime/webcore/blob/copy_file.rs Outdated
Comment thread src/jsc/webcore_types.rs Outdated
Comment thread src/runtime/webcore/Blob.rs Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/runtime/webcore/blob/copy_file.rs (1)

1376-1407: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Remove the unnecessary data_mut() borrows from this setup path.

Lines 1385 and 1405 mint &mut Data only to pass pathlike into prepare_pathlike(), but that helper never mutates the PathOrFileDescriptor. On a shared StoreRef, this unnecessary mutable borrow reopens the aliasing hazard the PR is trying to eliminate.

♻️ Suggested fix
     fn prepare_pathlike(
-        pathlike: &mut PathOrFileDescriptor,
+        pathlike: &PathOrFileDescriptor,
         must_close: &mut bool,
         is_reading: bool,
     ) -> bun_sys::Result<Fd> {
         if let PathOrFileDescriptor::Path(path) = pathlike {
             let fd = match bun_sys::openat_windows_a(

     self.read_write_loop.destination_fd = match Self::prepare_pathlike(
-            &mut unsafe { self.destination_file_store.data_mut() }
-                .as_file_mut()
+            &self.destination_file_store
+                .data
+                .as_file()
                 .pathlike,
             &mut self.read_write_loop.must_close_destination_fd,
             false,
         ) {

     self.read_write_loop.source_fd = match Self::prepare_pathlike(
-            &mut unsafe { self.source_file_store.data_mut() }
-                .as_file_mut()
+            &self.source_file_store
+                .data
+                .as_file()
                 .pathlike,
             &mut self.read_write_loop.must_close_source_fd,
             true,
         ) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/runtime/webcore/blob/copy_file.rs` around lines 1376 - 1407, The mutable
borrows from destination_file_store.data_mut() and source_file_store.data_mut()
are unnecessary and reintroduce aliasing risk; change these to immutable borrows
(use the non-mutating accessor, e.g., data()/as_file() instead of
data_mut()/as_file_mut()) when extracting .pathlike and call
Self::prepare_pathlike(...) with that shared reference while preserving the
mutable out-parameter read_write_loop.must_close_destination_fd (and the source
equivalent); update both assignments to self.read_write_loop.destination_fd and
self.read_write_loop.source_fd to use the immutable accessors and keep error
handling/throw/mkdirp logic unchanged for CopyFileWindows and prepare_pathlike.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/runtime/webcore/FileReader.rs`:
- Around line 360-367: The SAFETY comment is insufficient to justify calling
unsafe { store.data_mut() } because data_mut() requires proving no &Data/&mut
Data borrows exist on any thread (worker threads may hold &Data via
StoreRef/Blob clones); fix by removing the unsafe store.data_mut() usage in
on_start and instead clone the File/Blob instance and mutate the cloned File
(take &mut file on the cloned File) so you never call store.data_mut() on the
shared Store; update open_file_blob, ReadFile spawn points, and the on_start
path to operate on the cloned File (or otherwise ensure tasks that spawn
ReadFile run after on_start) to eliminate the aliasing risk and make the SAFETY
reasoning trivial for store.data_mut().

---

Outside diff comments:
In `@src/runtime/webcore/blob/copy_file.rs`:
- Around line 1376-1407: The mutable borrows from
destination_file_store.data_mut() and source_file_store.data_mut() are
unnecessary and reintroduce aliasing risk; change these to immutable borrows
(use the non-mutating accessor, e.g., data()/as_file() instead of
data_mut()/as_file_mut()) when extracting .pathlike and call
Self::prepare_pathlike(...) with that shared reference while preserving the
mutable out-parameter read_write_loop.must_close_destination_fd (and the source
equivalent); update both assignments to self.read_write_loop.destination_fd and
self.read_write_loop.source_fd to use the immutable accessors and keep error
handling/throw/mkdirp logic unchanged for CopyFileWindows and prepare_pathlike.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 53bfa435-5921-4484-a1bd-2bd2e8e97d10

📥 Commits

Reviewing files that changed from the base of the PR and between e4523d1 and 82e4873.

📒 Files selected for processing (7)
  • src/codegen/bake-codegen.ts
  • src/jsc/webcore_types.rs
  • src/runtime/webcore/Blob.rs
  • src/runtime/webcore/FileReader.rs
  • src/runtime/webcore/blob/copy_file.rs
  • src/runtime/webcore/blob/read_file.rs
  • test/js/web/fetch/blob-write.test.ts

Comment thread src/runtime/webcore/FileReader.rs Outdated
Comment thread src/jsc/webcore_types.rs Outdated
Comment thread src/runtime/webcore/FileReader.rs
@robobun

robobun commented May 15, 2026

Copy link
Copy Markdown
Collaborator Author

CI status

Build #61299 (sha b8db52c, the latest push) finished with 17 failed test lanes. Every one of them failed on test/cli/install/bunx.test.ts:518 "should handle package that requires node 24", which runs bunx --bun @angular/cli@latest --help against the live npm registry and currently exits 3 on every platform. It is pipeline-wide, not specific to this PR:

  • failed on every lane of this branch in builds #61295, #61296, and #61299, four attempts each
  • fails identically on unrelated branches, e.g. build #61287 (farm/6d382bbf/openineditor-plain-spawn) and #61291

So every build will be red until current @angular/cli@latest state is handled or the test is quarantined; nothing in this PR touches bunx or install.

Beyond that, two single-lane one-offs, each passing on all 16 other lanes at this sha:

  • test/js/web/streams/streams-leak.test.ts "reuses the pull buffer across small reads" on one alpine x64-baseline shard: chunks.length 13 vs the > 20 threshold. Pre-existing test (from NativeReadableStreamSource: reuse pull-buffer tail across reads instead of rotating every pull #30593) counting chunks of a fetch response over a live socket; its own comment notes "some chunks coalesce on the wire". This PR does not touch the fetch/network pull path (its streaming changes are in the file-blob reader).
  • vendor/elysia/test/bun/router.test.ts "handle trace url" on darwin 14 aarch64: a trace hook's captured url was "" when asserted immediately after the fetch, an async trace-dispatch race in the vendored Elysia suite. This PR has no serve/router/trace changes.

No failure involves a file this PR changes; the webcore/StoreRef shards, including the new guards (test/internal/storeref-not-sync.test.ts, test/js/web/fetch/blob-write.test.ts), are green everywhere. The one CI retrigger allowed by policy was already spent, so handing off for a maintainer merge.

@robobun robobun force-pushed the farm/4a1dfa50/storeref-not-sync branch from 958ecea to 66617cf Compare May 23, 2026 12:12

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new issues found, but this is a substantial Rust soundness audit (unsafe-fn migration across ~22 data_mut/blob_store_mut call sites, !Sync drop, AtomicU64 field conversion, plus an unrelated bake-codegen bootstrap workaround) that's gone through several rounds of correction — worth a human pass before merge. There's also one minor unaddressed inline note on FileReader.rs:382.

Extended reasoning...

Overview

This PR closes #30800 by making StoreRef::data_mut an unsafe fn, removing unsafe impl Sync for StoreRef, converting File.last_modified from JSTimeType (u64) to AtomicU64, and adding a compile-time !Sync trip-wire. It touches 8 files: src/jsc/webcore_types.rs (core type definitions), src/runtime/webcore/{Blob,Body,FileReader}.rs, src/runtime/webcore/blob/{copy_file,read_file}.rs, a new concurrency stress test, and an unrelated src/codegen/bake-codegen.ts bootstrap workaround. Roughly 22 call sites are wrapped in unsafe { } with per-site SAFETY justifications, and several read paths were refactored from data_mut() to shared &store.data borrows.

Security risks

None in the conventional sense — no auth, crypto, network, or input-parsing surface. The change is a Rust memory-model soundness audit. The risk profile is correctness of unsafe-code reasoning: each SAFETY comment is a claim about aliasing/threading that, if wrong, masks UB. Several such claims were already corrected during review (false "sole holder" / "JS thread blocked" justifications, the blob_store_mut sibling back-door, the last_modified data race), and the PR description itself documents a remaining "benign" &mut File vs &File overlap in resolve_file_stat as out-of-scope follow-up.

Level of scrutiny

High. This is production-critical unsafe Rust in the core Blob/Store types that back Bun.file(), fetch bodies, FileReader, and S3. The PR went through four rounds of substantive correction (three from the bug-hunting system, one from CodeRabbit), each of which found real inaccuracies in the SAFETY reasoning. The FileReader::on_start refactor introduces a small behavioral divergence from the Zig original (dropped is_atty write-back to the shared Store) that the author has not yet acknowledged in the comment or addressed. The bundled bake-codegen.ts change is logically unrelated to the soundness fix and should at minimum be called out to the human reviewer.

Other factors

  • One unresolved 🟡 inline comment from a prior run (FileReader.rs:382) flags the dropped is_atty cache write-back; impact is negligible but the PR's "no user-visible behavioral delta" claim should account for it.
  • CI build #54800 has one failing lane (html-rewriter-leak.test.ts on debian-13-x64-asan) that the author argues is a known flake unrelated to this diff; that determination should be confirmed by a maintainer.
  • The AtomicU64 conversion is type-correct (JSTimeType = u64 in Rust per src/jsc/lib.rs:2299) and the new Clone for File snapshots it via Relaxed load, which is sound.
  • Test coverage: a new 16×8 concurrency stress test exercises the worker-thread ReadFile path under ASAN; existing blob/streams/globals suites pass per the description.

Given the complexity, the iterative correction history, the documented residual aliasing, and the bundled unrelated codegen change, this warrants human sign-off rather than auto-approval.

Comment thread src/jsc/webcore_types.rs Outdated
Comment thread src/runtime/webcore/FileReader.rs Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/jsc/webcore_types.rs`:
- Around line 1238-1253: The unsafe Send impl on StoreRef is relying on a
runtime convention because Data::S3 contains Option<Rc<S3Credentials>> which is
not Send; file a follow-up issue to make this sound by replacing
Rc<S3Credentials> with Arc<S3Credentials>, update all uses (including
S3::init_with_referenced_credentials and any places that clone or drop
S3Credentials) to use Arc cloning semantics, remove the runtime-only caveat from
the unsafe impl once conversion is done, and add tests exercising cross-thread
StoreRef drop paths to ensure no Rc->Arc regressions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: fa7076cb-0b7b-4904-b52a-2a8eb0ab51f3

📥 Commits

Reviewing files that changed from the base of the PR and between 66617cf and 7b3b1d6.

📒 Files selected for processing (4)
  • src/jsc/webcore_types.rs
  • src/runtime/webcore/Blob.rs
  • src/runtime/webcore/FileReader.rs
  • test/js/web/fetch/blob-write.test.ts

Comment thread src/jsc/webcore_types.rs
Comment thread src/codegen/bake-codegen.ts
Comment thread src/runtime/webcore/Blob.rs Outdated
Comment thread src/jsc/webcore_types.rs Outdated

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All prior review feedback has been addressed and I found nothing further on fbf4013, but this is a substantial unsafe-Rust soundness change to core StoreRef/Blob threading semantics (Send/Sync impls, AtomicU64 field conversion, ~22 unsafe call-site justifications across 10 files) that warrants a human sign-off.

Extended reasoning...

Overview

This PR closes #30800 by making StoreRef::data_mut an unsafe fn, removing unsafe impl Sync for StoreRef, converting File::last_modified from JSTimeType to AtomicU64, and adding a compile-time !Sync trip-wire. It touches src/jsc/webcore_types.rs (the canonical Store/StoreRef/File definitions), Blob.rs, Body.rs, FileReader.rs, copy_file.rs, read_file.rs, plus a new concurrency test, two stale-comment fixups in json_lexer.rs/bun-build-api.test.ts, and a bootstrap workaround in bake-codegen.ts.

Security risks

None in the conventional sense (no auth/crypto/input-handling). The risk surface is memory-safety: this PR rewrites the Send/Sync story for an intrusively-refcounted heap type that crosses the JS↔worker-pool thread boundary, converts a field to atomic, and restructures borrow discipline at ~22 data_mut call sites plus 4 blob_store_mut sites. An error here could introduce UB (aliased &mut, torn reads) rather than a logic bug.

Level of scrutiny

High. This is exactly the category the approval guidelines flag: complex, large, touches critical code paths, and centers on unsafe Rust semantics. The PR went through seven rounds of inline review from me (false SAFETY claims, the last_modified worker-thread race, the blob_store_mut sibling hole, the dropped is_atty write-back, the Rc<S3Credentials> Send caveat, stale cross-references), all of which were addressed — but the volume of subtle corrections itself indicates this needs a human maintainer's eyes on the final state, particularly the documented-but-unfixed resolve_file_stat &mut File vs worker-thread &File overlap and the FileReader::on_start behavior change (dropped is_atty cache write-back).

Other factors

  • A sibling PR #30805 was flagged as a potential duplicate also closing #30800; a maintainer should reconcile which lands.
  • The bake-codegen.ts JSON.stringify wrapper is an unrelated bootstrap workaround riding along in this PR.
  • CI on the prior push had one known-flaky failure (html-rewriter-leak.test.ts) unrelated to this diff.
  • The bug hunting system found nothing on the current revision (fbf4013).

@robobun robobun force-pushed the farm/4a1dfa50/storeref-not-sync branch from fbf4013 to 0e8c2f6 Compare May 27, 2026 23:20
Comment thread src/runtime/webcore/Body.rs Outdated
@robobun robobun force-pushed the farm/4a1dfa50/storeref-not-sync branch from ea7da7d to 9243c97 Compare June 2, 2026 05:42

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No remaining issues from the automated pass, but this is a substantial Rust soundness change (unsafe-fn boundary moves, !Sync, AtomicU64 field, ~22 SAFETY justifications, an acknowledged &mut File/&File overlap in resolve_file_stat, and an intentional behavioral divergence in FileReader::on_start) that warrants a human sign-off.

Extended reasoning...

Overview

This PR closes #30800 by making StoreRef::data_mut an unsafe fn, removing unsafe impl Sync for StoreRef, converting File::last_modified to AtomicU64, and adding a compile-time !Sync trip-wire. It touches 10 files: the core webcore_types.rs type definitions, ~22 data_mut call sites across Blob.rs/Body.rs/FileReader.rs/read_file.rs/copy_file.rs (each wrapped in unsafe + SAFETY or refactored to shared borrows), the sibling blob_store_mut/set_blob_content_type helpers in Body.rs (also made unsafe fn), a new concurrency stress test, and a small bake-codegen.ts bootstrap workaround plus two stale-comment cleanups.

Security risks

None in the conventional sense — no auth, crypto, network input parsing, or data-exposure surfaces are touched. The risk class here is memory-safety / UB: the PR rearranges &mut-from-&self boundaries on an intrusively-refcounted heap type that crosses threads, and explicitly leaves one acknowledged compiler-level &mut File vs worker-thread &File overlap in resolve_file_stat (documented as idempotent/benign, with full interior-mutability conversion deferred as follow-up). It also documents-but-does-not-fix the pre-existing Rc<S3Credentials> !Send gap under unsafe impl Send for StoreRef.

Level of scrutiny

High. This is a soundness audit of unsafe Rust touching threading, atomics, and aliasing rules in a core webcore type. The PR went through ~8 rounds of review feedback, each surfacing subtle inaccuracies in SAFETY justifications (handle-uniqueness vs borrow-exclusivity conflation, wrong threading-model claims, stale cross-references). All were addressed and resolved, but the iteration count itself demonstrates the subtlety. There is also one intentional behavioral change: FileReader::on_start now clones File and discards open_file_blob's is_atty write-back (diverging from the Zig original), which the PR documents as a deliberate trade-off.

Other factors

  • All prior inline comments (mine and CodeRabbit's) are resolved; the bug-hunting system found nothing new on this revision.
  • A potential-duplicate bot flagged #30805 as also closing #30800.
  • The PR description itself flags maintainer-level decisions: the remaining resolve_file_stat aliasing, the Blob: Sync follow-up, and the Rc→Arc S3 credentials follow-up.
  • CI history includes one unrelated flake (html-rewriter-leak.test.ts) on a single lane.

Given the scope, the unsafe-code surface, and the explicit design trade-offs, this should not be auto-approved.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my prior feedback has been addressed and I found nothing new this pass — but this is a substantial unsafe-Rust soundness audit (drops Sync, rewrites ~22 data_mut call sites' aliasing contracts, converts File.last_modified to AtomicU64, and leaves a documented-but-unclosed &mut File&File overlap in resolve_file_stat), so it warrants a human sign-off.

Extended reasoning...

Overview

This PR closes #30800 by hardening the StoreRef mutable-access contract: data_mut becomes unsafe fn, unsafe impl Sync for StoreRef is removed (with a compile-time trip-wire), File.last_modified becomes AtomicU64 to close a real cross-thread race on the worker-pool ReadFile path, and ~22 call sites across Blob.rs / Body.rs / FileReader.rs / read_file.rs / copy_file.rs are rewritten to either (a) use shared &store.data borrows where mutation isn't needed, or (b) wrap data_mut() in narrowly-scoped unsafe with per-site SAFETY justifications. Two ancillary changes ride along: a JSON.stringify wrapper on OVERLAY_CSS in bake-codegen.ts (bootstrap workaround) and a stale-comment cleanup in json_lexer.rs.

Security risks

None identified. This is an internal Rust soundness/aliasing audit with no user-facing API surface, no auth/crypto/permissions, and no untrusted-input handling changes. The bake-codegen.ts change is a build-time string-quoting tweak.

Level of scrutiny

High. The change is large (~9 files), touches production-critical webcore Blob/Store/file-I/O paths that run on both the JS thread and worker-pool threads, and consists almost entirely of unsafe-block edits whose correctness depends on subtle threading/aliasing invariants. The PR has gone through ~8 rounds of review feedback (mine and CodeRabbit's) correcting SAFETY-comment factual claims — handle-uniqueness vs borrow-exclusivity conflations, wrong thread-model citations, the Rc<S3Credentials> !Send caveat, the blob_store_mut/set_blob_content_type safe back-doors — all now addressed. The remaining resolve_file_stat &mut File vs worker-thread &File overlap is explicitly documented as out-of-scope follow-up rather than closed.

Other factors

  • All 12 of my prior inline comments are resolved; the most recent commits (9243c979, 26a4daf9, 90ee114c, 90d446ac, 23982cdd) directly address them.
  • A sibling PR #30805 also closes #30800 — a maintainer should decide which lands.
  • CI history shows one flaky-test failure (html-rewriter-leak.test.ts) unrelated to this diff.
  • The PR's own description flags two acknowledged loose ends for follow-up: Blob: Sync tightening and Rc<S3Credentials>Arc.
  • No CODEOWNERS hit that I'm aware of, but the scope (core webcore unsafe Rust + cross-thread invariants) is well outside what I'd approve without a maintainer look.

robobun and others added 8 commits June 7, 2026 13:37
`StoreRef::data_mut(&self) -> &mut Data` produced an aliasing `&mut` from a
shared `&self` through a raw-pointer deref while `unsafe impl Sync for
StoreRef` permitted `&StoreRef` to cross threads. Two threads sharing a
`&StoreRef` could each call `data_mut()` and hand back `&mut Data` to the
same backing allocation — immediate UB through a safe API.

Every in-tree call site only ever calls `data_mut` from the thread that
currently owns the handle (moved into a `ReadFile`/`CopyFile`/`WriteFile`
task or accessed on the JS thread), so the hole is latent, but the API
shape carries no compile-time guarantee.

Fix:

  - `data_mut` is now `unsafe fn` with a precondition of exclusivity
    ("no other `&`/`&mut` to the pointee is live for the returned
    borrow's duration, on this or any other thread"). 22 call sites
    updated with SAFETY notes describing their single-owner context.
  - `unsafe impl Sync for StoreRef` removed. `Send` is retained —
    `ThreadSafeRefCount` already makes move-between-threads sound, and
    `ReadFile`/`CopyFile`/`WriteFile` rely on `Send`, not `Sync`.
  - A compile-time trip-wire (`__store_ref_not_sync`) fails to compile
    if `StoreRef` ever regains `Sync` — same pattern as
    `src/runtime/shell/subproc.rs`.

Closes #30800
…n bootstrap

The `define: { OVERLAY_CSS: '*{...}' }` form relies on the JSON lexer
fallback added in #30679 to tolerate CSS operators (`*`, `(`, `)`) and
auto-quote the value as a string literal. Bootstrap `bun bd` in
environments whose prebuilt release bun predates that fix still trips on
`defines.json:1:1 — Operators are not allowed in JSON`, breaking the
gate container and any contributor on an older nightly.

Wrap `css(...)` in `JSON.stringify` so the value parses as a real JSON
string literal directly. No behavioral change to the bundled runtime
(the auto-quote fallback and explicit quoting produce identical output);
same pattern as the neighbouring `side: JSON.stringify(side)`.
claude[bot] review flagged four real issues on the first iteration:

1. `read_file.rs:670-674` SAFETY claims "sole holder" and "JS thread is
   blocked on the promise" — both false. `do_read_file` (Blob.rs:456)
   clones the `StoreRef` into each `ReadFile` task, so N concurrent
   `file.bytes()` calls race on `file.last_modified`. The race is
   pre-existing but the PR's express purpose is to make each call site
   defensible, so fix it: convert `File::last_modified` to `AtomicU64`
   and have the worker (and all writers) use `store(Relaxed)` through
   a shared `&Data` borrow. Every writer stores the same
   `fstat`-derived mtime, so the race is idempotent; the atomic closes
   the observable data race under Rust's memory model.

2. `copy_file.rs:1376-1378` SAFETY copy-pasted from the POSIX
   `CopyFile` context — this is `CopyFileWindows`, which runs on the
   JS event-loop thread via libuv callbacks (not a worker), and the
   JS thread is not blocked. Conclusion is still correct, but the
   reasoning is wrong: rewrite to cite JS-thread synchronous execution
   with no JS re-entry.

3. `webcore_types.rs:1218` doc claimed `!Sync` "is what makes the
   single-owner contract enforceable", but `Blob: Sync` plus
   `Blob::store(&self) -> Option<&StoreRef>` projects a sharable
   `&StoreRef` from a `&Blob` shared across threads — and
   `StoreRef: Clone + Send` lets every worker-pool task clone its
   own handle. Correct the doc: `!Sync` is a partial guard that
   catches the most direct misuse; the `unsafe fn` precondition is
   the load-bearing compile-time gate, and each caller must discharge
   it in writing.

4. `Blob.rs:4178/5462` and `FileReader.rs:360` SAFETY comments all
   conflated *handle* uniqueness ("`&mut Blob` so no concurrent
   access" / "sole holder" / "sole live handle") with *borrow*
   exclusivity (the real `data_mut` precondition). `StoreRef`s are
   intrusively refcounted; the `Store` routinely has multiple live
   handles. Rewrite each to the actual invariant: synchronous
   JS-thread call with no JS re-entry, so no other `&Data`/`&mut
   Data` borrow is live for the match scope.

Drive-by cleanups from the atomic refactor: `get_last_modified`,
`resolve_size`, `resolved_size`, `write_format`, `Blob__setAsFile`
reset, and the ReadFileUV Windows writer all switch to shared `&Data`
via `StoreRef: Deref` where the operation no longer needs `&mut` —
reducing the remaining `data_mut` call sites to the ones that
genuinely write non-atomic fields. Test comment rewritten to reflect
that all 8 tasks per file share one `Store` (not their own
allocation) and the atomic is what makes the shared-`last_modified`
idempotent-race safe under Rust's memory model.
…er paths

Two more review findings from the second coderabbit pass.

1. `copy_file.rs` `prepare_pathlike` only reads `pathlike` (to call
   `slice()` and `fd()`); it never mutates it. Both call sites in
   `prepare_read_write_loop` were minting `&mut Data` via
   `data_mut()` purely to satisfy the `&mut PathOrFileDescriptor`
   parameter, which reopened the aliasing hazard the PR is trying to
   close. Narrow the parameter to `&PathOrFileDescriptor` and route
   the call sites through `store.data.as_file().pathlike` (shared
   borrow via `StoreRef: Deref<Target = Store>`). The `unsafe`
   block on the Windows copy path is gone.

2. `FileReader::on_start` used `data_mut()` to reach `Data::File`
   so it could pass `&mut File` to `Lazy::open_file_blob`, which
   only writes `file.is_atty`. The `Store` is routinely aliased
   by the originating JS `Blob` (`Lazy::Blob` is populated via
   `store.clone()` in `ReadableStream::from_blob_copy_ref`), so
   the SAFETY comment could not justify `data_mut`'s exclusivity
   precondition without reasoning about task-spawn ordering.
   Refactor: match on `&store.data`, `file.clone()` out a local,
   let `open_file_blob` mutate the clone. The clone is cheap —
   `PathLike` bumps an intrusive refcount, `last_modified` is an
   `AtomicU64` snapshot, everything else is `Copy`. The existing
   fallthrough already `drop(store)`s right after
   `open_file_blob`, so persisting `is_atty` on the shared
   `Store` wouldn't have propagated anywhere useful either way.
claude[bot] flagged a sibling safe fn `Body.rs::blob_store_mut` that
carries the same `&mut`-from-`&self` soundness shape as pre-PR
`data_mut`: it mints `&mut Store` from `&Blob` via
`StoreRef::as_ptr()`, bypassing the `data_mut` precondition entirely.
Pre-existing but squarely within this PR's audit scope.

Promote `blob_store_mut` to `unsafe fn` with the same exclusivity
precondition as `data_mut`. Replace the stale "same invariant
`StoreRef::data_mut` already documents" doc-cross-reference with a
proper `# Safety` block. Wrap all 4 call sites (Body.rs:1246, 1276,
2260, 2281) in `unsafe` with SAFETY comments citing the real
invariant: synchronous JS-thread continuation of a body consumer,
no JS re-entry, no other `&Data`/`&Store` borrow of this `Store`
is live.

Also amend `data_mut`'s doc to stop claiming to be the *sole*
compile-time guard — `blob_store_mut` is a sibling guard carrying
the identical contract via `StoreRef::as_ptr()`.
Previous build failed on unrelated `v8-heap-snapshot.test.ts` SIGKILL
(Ubuntu 25.04 x64 test lane) — my diff only touches
Blob/StoreRef/FileReader/CopyFileWindows/Body, no path through the V8
heap snapshot machinery. One-shot re-roll per CI-failures policy.
robobun added 5 commits June 7, 2026 13:41
claude[bot] flagged an unacknowledged behavioral divergence from the
Zig original at FileReader.rs:382: cloning `File` locally before
handing it to `open_file_blob` means `file.is_atty = Some(true)` no
longer propagates to the shared `Store`. This is intentional — writing
back would require `data_mut()` on a handle that may be aliased by
the originating JS `Blob` and other holders, re-opening the exact
soundness hole this PR closes. Expanded the PORT NOTE to call that out
explicitly, cite the canonical `Bun.std{in,out,err}` path
(`__bun_stdio_blob_store_new` constructs with `is_atty` already
populated, so no regression there), and spell out the narrow
`Bun.write(Bun.file(0), ...)` after `.stream()` case where the
optimized copy strategy falls back to the plain one.
Two comment-only follow-ups from the latest review pass:

* src/jsc/webcore_types.rs: `unsafe impl Send for StoreRef` SAFETY now
  carries a CAVEAT for the `Data::S3` variant. `S3::credentials` is
  `Option<Rc<S3Credentials>>` (non-atomic refcount) and is typically
  shared with the JS-thread `S3Client` via `init_with_referenced_credentials`,
  so a worker-thread drop of an S3-backed StoreRef would race the JS
  thread's `Rc`. In tree today no caller routes a `Data::S3` store
  through the worker pool — the S3 I/O paths (`S3File.rs`,
  `FetchTasklet.rs`) stay on the JS thread, and worker-pool
  `ReadFile`/`CopyFile`/`WriteFile` only touch `Data::File`/`Data::Bytes`.
  If that changes, swap the refcount to `Arc`.

* FileReader.rs / Blob.rs / blob-write.test.ts: three comments
  referenced "the PR" / "PR review" without an identifier, which
  src/CLAUDE.md:26 forbids. Replaced with concrete `#30800` references
  (and dropped the test-file parenthetical entirely).
All three are pure comment-accuracy fixes, no code change:

* src/jsc/webcore_types.rs:1250 — the `Data::S3` Send CAVEAT cited
  `src/runtime/webcore/fetch/s3.rs`, a file that does not exist
  (`src/runtime/webcore/fetch/` only contains `FetchTasklet.rs`).
  The actual S3 I/O code is under `src/runtime/webcore/S3File.rs` and
  `src/runtime/webcore/s3/`; replaced the citation.

* src/runtime/webcore/Blob.rs:6237-6239 — `resolve_file_stat` kept a
  pre-existing lead-in ("the caller holds the only ref across this
  call, so an exclusive borrow is sound") that directly contradicts
  the new SAFETY block appended immediately below it ("worker-thread
  `ReadFile` tasks can concurrently hold a shared `&Data`"). The new
  SAFETY block is accurate; the stale lead-in is the same
  handle-uniqueness ↔ borrow-exclusivity conflation that this PR
  already fixed at three sibling sites in 8003ab0. Deleted it.

* src/codegen/bake-codegen.ts no longer passes raw CSS as a `define:`
  value (it wraps in JSON.stringify to survive pre-#30679 lexing), so
  two in-tree references to the old behaviour are now stale:
  - test/bundler/bun-build-api.test.ts:76 ("src/codegen/bake-codegen.ts
    passes verbatim as OVERLAY_CSS")
  - src/parsers/json_lexer.rs:1292 ("(bake-codegen.ts's OVERLAY_CSS)")
  Both now cite the generic user-reachable `Bun.build({ define: })`
  case without naming bake-codegen as the example consumer. The
  auto-quote path itself and its test coverage are unchanged.
The rebase onto main introduced `set_blob_content_type` (from an
upstream refactor) as a safe `fn` that internally calls
`unsafe { blob_store_mut(blob) }` with SAFETY comments citing a
property of its *call sites* ("synchronous JS-thread body-consumer
continuation") rather than anything the helper itself enforces — the
same safe-back-door shape that 9013607 closed for `blob_store_mut`.

A safe wrapper of an unsafe fn must discharge the precondition
independently of its callers; this helper cannot. Made it `unsafe fn`
with the same `# Safety` precondition as `blob_store_mut`, moved the
caller-context SAFETY justification to the two module-private call
sites (Body.rs:1246, 2238), and reduced the in-body comments to
"precondition (see fn doc)". Matches the treatment already applied to
`blob_store_mut` and the two standalone `MimeType::TEXT` sites.

No behavioral change. `body.test.ts` / `body-clone.test.ts` /
`blob-write.test.ts` all green (382 pass) under `bun bd`.
This PR only needed a one-line comment scrub in bun-build-api.test.ts
(dropping a now-stale `bake-codegen.ts passes verbatim as OVERLAY_CSS`
reference). That file also contains pre-existing RSS/leak threshold
tests ("sourcemap inline leak", "called thousands of times") that are
flaky under a loaded ASAN environment and have nothing to do with this
soundness fix. Touching the file pulled those flaky tests into scope.

Revert the comment edit so the file drops out of this PR's diff. The
load-bearing half of that review — the `json_lexer.rs` comment no longer
naming bake-codegen as the auto-quote consumer — stays. The test-file
comment's bake-codegen mention is a soft illustrative reference and the
lexer's own doc is accurate.
@robobun robobun force-pushed the farm/4a1dfa50/storeref-not-sync branch from 23982cd to f378b91 Compare June 7, 2026 13:44
test/internal/dead-code-escapes.test.ts enforces a per-file budget of
item-level #[allow(dead_code)] escapes and the __store_ref_not_sync
trip-wire bumped src/jsc/webcore_types.rs from 0 to 1, failing the
debian-13-x64-asan lane.

Match the reference pattern in src/runtime/shell/subproc.rs
(__pipe_reader_thread_confined) exactly: underscore-prefix the helper
trait (_NotSyncCheck) so rustc's dead-code lint exempts it, and drop
the allow attribute. The const was already underscore-prefixed; the
lint error was on the trait ("trait NotSyncCheck is never used" —
associated-const projection inside an unused const doesn't count as a
use). Verified with cargo check -p bun_jsc (-D dead-code clean) and
the inventory test (28 pass).
Comment thread src/runtime/webcore/Blob.rs Outdated
Comment thread src/jsc/webcore_types.rs Outdated
The cross-thread aliasing note only enumerated ReadFile workers and
concluded the remaining &mut File overlap was "compiler-level only".
POSIX WriteFile::run_with_fd (write_file.rs) is a second worker-pool
&File borrower: it reads the non-atomic `seekable`/`mode` fields on
fd-backed destinations through a StoreRef::clone of the same Store
(`destination_blob.borrowed_view()`), so the JS-thread `&mut File` can
race those reads at the data level. The race is pre-existing and benign
(small Copy types, idempotent stat-derived values, gated on
`pathlike.is_fd()`), but the note now names it instead of claiming
workers never touch those fields. Amended the matching sentence in the
PR description.

Also dropped the drifted `webcore_types.rs:96` self-referential line
citation from the !Sync comment (the Blob Sync impl moved during
rebases); the prose already names the impl and accessor signature,
which is grep-sufficient and immune to line drift.
Comment thread src/runtime/webcore/FileReader.rs Outdated
robobun added 2 commits June 7, 2026 14:45
The note claimed the Zig `defer { deref; lazy = none }` was "hoisted
after the match below since both arms fall through", but `drop(store)`
runs before the match and the `Err` arm returns early. The comment
described an earlier draft's structure; the line below it already
carries the accurate mapping (deref happens at `drop(store)`,
`lazy = None` at the `replace` above).
The #30800 fix is a type-level property (unsafe fn data_mut, !Sync,
the __store_ref_not_sync trip-wire) with no runtime-observable delta:
the aliasing UB it closes does not corrupt memory on supported targets
and the concurrent last_modified writes are idempotent, so no test/js
runtime test can fail on the unfixed build.

Follow the test/internal/dead-code-escapes.test.ts pattern instead:
assert the invariants on the source text of webcore_types.rs. The three
checks (no `unsafe impl Sync for StoreRef`, `data_mut` spelled
`pub unsafe fn`, trip-wire module present) all fail against the
pre-fix source and pass with the fix, and give a readable suite-level
failure message complementing the trip-wire's compile-time
conflicting-impls error.
Comment thread test/internal/storeref-not-sync.test.ts
Comment thread src/runtime/webcore/FileReader.rs
Comment thread src/runtime/webcore/FileReader.rs Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

StoreRef Sync is unsound: data_mut(&self) enables mutable aliasing via safe API

1 participant