strings: bind StringJoiner borrowed segments to a lifetime#31493
Conversation
StringJoiner stored borrowed slices as lifetime-erased raw fat pointers
(RawSlice + owns_slice flag), so nothing tied a pushed segment to the data
it pointed at. Give the joiner a lifetime parameter instead:
- Node becomes enum { Borrowed(&'a [u8]), Owned(Box<[u8]>) }, which drops
the manual Drop impl, the four unsafe Send/Sync impls, and the RawSlice
round-trips; Watcher.input loses its &'static workaround.
- push()/push_static() now borrow segments for the joiner's lifetime, so
the bundler post-processing, sourcemap join, dev-server sourcemap store,
Blob constructor, and FormData serializer call sites are borrow-checked.
- The deferred bundler path (Chunk.intermediate_output has no lifetime
parameter) re-tags the joiner via a new documented unsafe
StringJoiner::detach_lifetime() right before break_output_into_pieces;
Blob/FormData pushes of GC-kept-alive buffers use bun_ptr::detach_lifetime
with SAFETY comments instead of copying.
- postProcessJSChunk's InternalBakeDev arm pushed a block-local
MutableString's bytes by reference; the deferred joiner could read that
buffer after it was freed. Move the bytes into the joiner instead.
- Add unit tests for the node storage rewrite.
|
Updated 9:57 PM PT - May 27th, 2026
✅ @Jarred-Sumner, your commit f12dabcb4899a5b10cbbf7400a1489ceef267d51 passed in 🧪 To try this PR locally: bunx bun-pr 31493That installs a local version of the PR into your bun-31493 --bun |
|
Found 2 issues this PR may fix:
🤖 Generated with Claude Code |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
WalkthroughThis PR parameterizes StringJoiner by lifetime ('a), replaces RawSlice+ownership bits with Node<'a> (Borrowed/Owned), centralizes push bookkeeping, adds unsafe detach_lifetime() to retag borrows as 'static, and propagates the lifetime changes through bundler, Blob/FormData, and a few external APIs with accompanying SAFETY comments and tests. ChangesStringJoiner lifetime parameterization and bundler integration
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/bun_core/string/StringJoiner.rs`:
- Around line 103-135: Add a direct unit test that constructs a StringJoiner
with at least one borrowed node and a watcher.input (use push or push_static to
create borrowed segments and set watcher.input), then call unsafe {
string_joiner.detach_lifetime() } to produce a StringJoiner<'static> and assert
the detached joiner preserves len, nodes' contents, and watcher fields
(estimated_count and needs_newline) and that borrowed bytes are still readable
(e.g., compare slices/collected output). Make the test exercise Node::Borrowed
paths and watcher.input round-trip through detach_lifetime to validate the
unsafe lifetime-retagging contract for StringJoiner::detach_lifetime.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 79f6478d-acef-4011-906c-61b9c969f700
📒 Files selected for processing (10)
src/bun_core/string/StringJoiner.rssrc/bundler/Chunk.rssrc/bundler/LinkerContext.rssrc/bundler/linker_context/MetafileBuilder.rssrc/bundler/linker_context/postProcessCSSChunk.rssrc/bundler/linker_context/postProcessHTMLChunk.rssrc/bundler/linker_context/postProcessJSChunk.rssrc/runtime/bake/dev_server/source_map_store.rssrc/runtime/webcore/Blob.rssrc/sourcemap/lib.rs
There was a problem hiding this comment.
I didn't find any bugs, but this refactor touches memory-safety invariants across the bundler chunk lifecycle, Blob/FormData GC rooting, and introduces a new unsafe fn detach_lifetime() escape hatch — the per-site SAFETY claims look well-reasoned but deserve a human pass from someone familiar with the chunk/compile-results ownership model.
Extended reasoning...
Overview
This PR rewrites StringJoiner from a lifetime-erased RawSlice<u8> + owns_slice flag (with 4 blanket unsafe impl Send/Sync and a manual Drop) into a lifetime-parameterized enum Node<'a> { Borrowed(&'a [u8]), Owned(Box<[u8]>) }. It threads the new 'a parameter through 10 files: the core type, the bundler's Chunk/LinkerContext/postProcess{JS,CSS,HTML}Chunk/MetafileBuilder, the bake dev-server source-map store, Blob/FormData serialization, and sourcemap::append_source_map_chunk. For the cases where a lifetime can't be expressed (self-referential Chunk.intermediate_output borrowing sibling compile_results_for_chunk; GC-rooted JS buffers), it adds a new unsafe fn StringJoiner::detach_lifetime() and 6 bun_ptr::detach_lifetime slice re-tags, each with a SAFETY comment naming the owner and why it outlives the read. It also fixes a latent UAF in postProcessJSChunk's InternalBakeDev arm (block-local MutableString pushed by reference into a deferred joiner) and adds unit tests for the rewritten node storage.
Security risks
No injection/auth/data-exposure surface. The risk class here is memory safety: every detach_lifetime() call site asserts a non-local invariant about storage lifetime (e.g. "compile_results_for_chunk is filled in place before post-processing and never reassigned afterwards", "the prescan proved no remaining part can run user JS so this buffer stays attached until done()"). These are the same invariants the old code relied on implicitly, now written down — but they're still invariants that a reviewer with bundler-internals knowledge should validate, particularly the cross-thread Chunk.intermediate_output storage and the Blob constructor's parts_can_run_js prescan logic.
Level of scrutiny
High. This is production-critical code on the bundler hot path and in user-facing Blob/FormData construction. The change is a net safety improvement (blanket type-level unsafe impls → narrowly-scoped, documented call-site unsafe), and the bug-hunting system found nothing. But it's a ~300-line refactor of lifetime/ownership reasoning across 10 files, introduces a new unsafe API surface (detach_lifetime), and the correctness of each SAFETY comment depends on non-local invariants about the bundler's worker-pool join ordering and JS GC rooting. The detach_lifetime impl also does into_iter().map().collect() which reallocates the node Vec — likely fine given it runs once per chunk, but worth a glance.
Other factors
The PR description is exceptionally thorough (full unsafe accounting, rationale for why lifetimes alone can't cover the remaining sites, comprehensive test matrix). The UAF fix in the InternalBakeDev arm is a real correctness improvement. No prior reviews on the timeline. Given the scope and the design decision embodied in the detach_lifetime escape hatch, this warrants a human reviewer familiar with the chunk lifecycle rather than bot approval.
…from an array Also tidy a stale mention in a Chunk safety comment.
There was a problem hiding this comment.
Thanks for addressing all the feedback — the Blob ordering fixes and the detach_lifetime test look good. I'm deferring to a human reviewer given the scope: this reworks unsafe lifetime handling across the bundler's deferred-joiner path and Blob/FormData GC-rooted borrows, and ships three Blob-constructor behavior changes alongside the refactor.
Extended reasoning...
Overview
This PR replaces StringJoiner's lifetime-erased RawSlice<u8> + ownership-bit storage with a lifetime-parameterized Node<'a> enum (Borrowed(&'a [u8]) | Owned(Box<[u8]>)), removing four blanket unsafe impl Send/Sync, the manual Drop, and the raw-pointer round-trips. It threads the new lifetime through ~10 call sites across the bundler (Chunk, LinkerContext, four postProcess* files, MetafileBuilder), the bake dev-server sourcemap store, append_source_map_chunk, and Blob/FormData construction. Where the borrow genuinely cannot be expressed (chunk self-reference, GC-rooted JS buffers), it introduces a documented unsafe fn detach_lifetime() plus 10 narrowly-scoped bun_ptr::detach_lifetime re-tags, each with a SAFETY comment naming the owner and lifetime justification.
Along the way it fixes one latent UAF (postProcessJSChunk's InternalBakeDev arm pushed a block-local MutableString's bytes by reference into a joiner stored on the chunk) and — in response to review feedback on this PR — three pre-existing Blob-constructor bugs in the inner array-iteration match (wrong variable + missing continue in the DOMWrapper else branch; LIFO mis-ordering in the _ arm; data loss from break in the nested-array arm). New unit tests cover StringJoiner and a JS test covers the Blob ordering fixes.
Security risks
No new attack surface. The change is a net reduction in unauditable unsafe (blanket Send/Sync impls → per-site documented re-tags), but the remaining detach_lifetime sites in the bundler and Blob/FormData are still soundness-load-bearing: each SAFETY comment encodes a non-local invariant (e.g. "compile_results_for_chunk is filled in place and never reassigned", "prescan proved no remaining part can run user JS", "DOMFormData keeps the entry's Blob alive past joiner.done()"). If any of those invariants drift, the result is a UAF in bundler output assembly or Blob construction. These are exactly the invariants the old code relied on implicitly, so risk is not increased — but verifying each SAFETY justification against the surrounding control flow is the kind of review a human should sign off on.
Level of scrutiny
High. This is unsafe-code restructuring in production-critical paths: bundler chunk assembly (runs on every bun build), Blob/FormData construction (web-facing, GC-interacting), and the HMR dev-server sourcemap path. The Blob fixes also change user-observable behavior (new Blob(["a", {}, "b"]) and nested-array parts now match Node/Chrome ordering), which is correct per spec but is a behavior change shipped alongside the refactor. The PR description's testing section is thorough and the bug-hunting pass found nothing, but the breadth (11 files, 4 subsystems) and the nature (per-site unsafe justifications) put this well outside what I'd auto-approve.
Other factors
All four review threads (one CodeRabbit nitpick, three of mine) were addressed with dedicated commits and tests; nothing is outstanding. CI shows one failure (AsyncLocalStorage-tracking.test.ts segfault on macOS x64) that appears unrelated to this diff but is worth a glance before merge. The detach_lifetime() implementation itself reallocates the nodes Vec via into_iter().collect() rather than transmuting in place — safe and simple, with negligible cost given it runs once per chunk.
StringJoinerstored every borrowed segment as a lifetime-erased raw fat pointer (RawSlice<u8>plus anowns_sliceflag), with blanketunsafe impl Send/Syncon both the joiner and its nodes and a manualDropto free owned slices. Nothing tied a pushed segment to the data it pointed at, so everypush()call site relied on an undocumented "must outlivedone()" convention.What changed
StringJoinerandNodetake a lifetime parameter;Nodeis nowenum { Borrowed(&'a [u8]), Owned(Box<[u8]>) }. This deletes the manualDrop, the fourunsafe impl Send/Sync, theRawSliceround-trips, andWatcher.input's&'staticworkaround.push()/push_static()borrow segments for the joiner's lifetime, so the sourcemap join (append_source_map_chunk), the dev-server sourcemap store, and most bundler/Blob/FormData call sites are now plain borrow-checked code.Chunk.intermediate_output(no lifetime parameter), so each of the fourbreak_output_into_piecescallers (postProcessJSChunk,postProcessCSSChunk,postProcessHTMLChunk,MetafileBuilder::generate) re-tags the joiner with a new documentedunsafe StringJoiner::detach_lifetime()immediately before the call, with a SAFETY comment naming exactly which storage the borrowed nodes point into and why it outlives the stored joiner.bun_ptr::detach_lifetimewith per-site SAFETY comments — including the FormData file-entryData::Bytessite and the top-level typed-array arm, each with its own justification.postProcessJSChunk'sInternalBakeDevarm: it pushed the bytes of a block-localMutableStringby reference into a joiner that is stored inchunk.intermediate_outputand read later. The bytes are now moved into the joiner (take_slice()+push_owned).Net unsafe accounting
Removed: 4 blanket
unsafe impl Send/Sync(which made every push call site unauditable), theunsafefree inNode::drop, theunsafeRawSlice::from_rawinpush_owned, and the lifetime-erasedRawSlicestorage itself.Added: one
unsafe fn detach_lifetime()onStringJoiner, 4detach_lifetime()call sites in the bundler, and 6bun_ptr::detach_lifetimeslice re-tags in Blob/FormData — 10 narrowly-scoped expressions, each with a SAFETY comment naming the owner of the bytes and why it outlives the joiner's last read.The trade is deliberate: instead of two type-level
unsafe impls that silence the compiler for every present and future call site, the remaining unsafety is at the specific points where the borrow genuinely cannot be expressed.Why lifetimes alone can't cover the remaining sites
Chunk.intermediate_outputborrows from the same chunk'scompile_results_for_chunk(and from linker-graph data). A struct field cannot name a lifetime that refers to a sibling field of the same struct, so expressing this with pure lifetimes would mean copying every compiled segment at join time or restructuring chunk storage.DOMFormDataentries) whose liveness over the push→done()span is guaranteed by GC rooting, not by any Rust owner with a nameable lifetime. Borrow-checking those would require cloning each buffer; the prior code already borrowed them, so this keeps the no-copy behavior with the justification written down.Testing
cargo check -p bun_bin,bun run rust:check-all(10/10 targets), debug build smoke test.bundler_edgecase,bundler_npm,bundler_minify,bundler_splitting— all pass.test/bake/dev/esm.test.ts,test/bake/dev/html.test.ts— pass.FormData.test.ts,FormData-multipart-serialization.test.ts,blob.test.ts— pass except two pre-existing debug-build failures (a 5s-timeout GC test and an RSS-threshold test) that fail identically on a main debug build.test/js/bun/sourcemap/,compile-sourcemap-internal.test.ts— pass. Thebun-build-compile*.test.tsstandalone-executable tests time out at 5s when spawning debug-build binaries on this machine; the same tests fail identically on a main debug build (0 new failures).StringJoinerunit tests compile undercargo check;cargo test -p bun_coredoes not link standalone on main either (pre-existing).