Skip to content

fetch: encode FetchTasklet's cross-thread ownership in the type system#31745

Open
alii wants to merge 20 commits into
mainfrom
ali/fetch-lifetime-clarity
Open

fetch: encode FetchTasklet's cross-thread ownership in the type system#31745
alii wants to merge 20 commits into
mainfrom
ali/fetch-lifetime-clarity

Conversation

@alii

@alii alii commented Jun 3, 2026

Copy link
Copy Markdown
Member

What does this PR do?

Follow-up to the review discussion on #31676, where the concern was that "the memory lifetime of all the fetch stuff is very confusing right now." This is a behavior-preserving refactor of FetchTasklet that moves the cross-thread ownership rules out of comments and into the type system, so changes like #31676's buffer handoff can be reviewed by reading one small struct instead of reconstructing the locking discipline from comments scattered across 2,400 lines.

1. The mutex now owns the data it guards. The free-floating mutex: Mutex whose protected set was documented only in comments ("Mutex guards buffer against the HTTP thread") becomes shared: Guarded<HttpHandoff>, reusing the existing bun_threading::Guarded. The cross-thread state — result, metadata, scheduled_response_buffer, body_size — is unreachable without holding the lock. Lock windows are unchanged: every guard drop sits exactly where a manual unlock() used to (the on_progress_update cleanup closure takes the guard by value so unlock-first ordering is preserved on every return path).

2. The struct is partitioned by thread role. JS-thread-only state (promise, sink, response refs, abort plumbing, the waiting flags) moved into a named JsState; HTTP→JS handoff state into HttpHandoff; what remains directly on FetchTasklet is lock-free shared state (atomics) and the leases handed to the HTTP thread (response_buffer, the AsyncHTTP box). All fields are now private. A small Parts<'t> split-borrow struct lets lock-held helpers keep access to JS-side state without aliasing &mut self — it compiles to nothing.

3. The refcount protocol is named. ref_count now starts at init_exact_refs(2) — one baseline ref per thread, matching ThreadSafeStreamBuffer — instead of init() plus a bare ref_() later in queue(). Each release site states which ref it drops: release_js_ref, release_http_ref, release_sink_ref, release_drain_task_ref, and the struct doc carries the full ref table plus the lock invariants (user JS can run while the lock is held — checkServerIdentity, microtask drains, sink re-entry — which is why ignore_data is now an atomic: the GC-finalizer path must stay lock-free). release_http_ref debug-asserts the lock is not held by the releasing thread, since the final release frees the allocation the mutex lives in.

4. The buffer handoff is two named methods. HttpHandoff::merge_result (sticky one-shot can_stream, metadata-accepted-once, certificate-info carry-over) and HttpHandoff::stage_response_bytes (copy + reset for buffer reuse) replace the inline field dance in callback. The stored result no longer carries its body alias — the alias is asserted on the incoming result and then Noned, so the 'static-widened copy holds no live borrow and is_http2 is mirrored into an atomic for the one lock-free reader (skip_chunked_framing).

Two support changes outside the fetch module:

  • GuardedLock is now \!Send (like std::sync::MutexGuard) and also \!Sync (stricter than MutexGuard, which is Sync where T: Sync; nothing shares a guard across threads) — Darwin os_unfair_lock / Windows SRWLOCK must be unlocked on the locking thread.
  • jsc::Weak::create_ptr is a new unsafe fn (with a documented safety contract) for the one caller that cannot form &mut while split borrows are live; the safe create now wraps it.

Behavior

One deliberate change, found by writing the ref table; everything else is preserved by construction and by test.

Fixed: process.exit() during a streaming upload leaked the whole in-flight chain (the unbalanced-sink-ref hole CodeRabbit flagged on the first revision). release_at_shutdown only balanced the two baseline refs; a still-attached request-body sink's ref and any resume_request_data_stream node parked in the JS concurrent queue (whose ManagedTask tag release_queued_tasks_for_shutdown cannot release) kept the count above zero, so the FetchTaskletBox<AsyncHTTP> ⇄ sink ⇄ ThreadSafeStreamBuffer chain was never reclaimed — LSan reports all of it with BUN_DESTRUCT_VM_ON_EXIT=1. Now sink_ref_held / queued_drain_tasks record those refs and release_at_shutdown balances them (same JS-thread-parked argument as the existing has_schedule_callback read), and dealloc_in_flight_for_exit releases the HTTP-side stream-buffer ref the way InternalState::reset does on the normal path.

  • New coverage: test/js/web/fetch/fetch-exit-in-flight.test.tsprocess.exit() with requests still in flight (dealloc_in_flight_for_exitrelease_at_shutdown) had no coverage at all. Three child-process fixtures establish a known in-flight state (mid-download with body pending, pre-metadata, mid streaming upload with the sink attached), then exit with LeakSanitizer enabled (ASAN builds; inert elsewhere): an unbalanced ref fails as exit 23 with the leaked chain in the report, a UAF as a different exit code, a hung exit as the test timeout. The mid-upload test fails without this PR's shutdown fix and passes with it; the other two pin the already-balanced paths.
  • No other behavior changes: the existing suite is the pin; no existing test files are modified.
  • fetch.test.ts, fetch-gzip.test.ts, fetch.stream.test.ts, body.test.ts, body-stream.test.ts, fetch-abort-queued.test.ts, abort-signal-leak.test.ts on the patched debug build: 9,914 pass, 2 fail — and both failures reproduce identically on a main debug build in the same environment (one helper-server ConnectionRefused, one GC-timing 5s timeout), verified by rebuilding main into the same build dir and re-running the identical suite. Main fails a strict superset locally.
  • No new allocations, copies, or lock operations: Guarded<T> stores the same mutex and data that were already adjacent fields, the handoff methods are the same code moved, and the split-borrow struct is borrows only. No benchmarks were run since no hot-path code changed shape.

Review guide

The commits are staged to be read in order — the big one (partition FetchTasklet into JsState and HttpHandoff) is rename-only by construction; the behavior-relevant ones (start FetchTasklet refcount at 2, convert mutex to Guarded, make ignore_data atomic) are each small and isolated.

alii added 7 commits June 2, 2026 16:24
Group the JS-thread-only fields into a JsState struct and the four
mutex-guarded handoff fields (result, metadata, scheduled_response_buffer,
body_size) into HttpHandoff, so each field's thread affinity is visible at
the access site (self.js.* vs self.shared.*). Move the eight JS-side
helpers onto JsState and name the callback's merge/staging steps as
HttpHandoff::merge_result / stage_response_bytes. The mutex and every
lock/unlock site are untouched; all FetchTasklet fields are now private.
No behavior change.
…t borrows

Replace the free-floating mutex + manually paired lock/unlock calls with
bun_threading::Guarded<HttpHandoff>, so the guarded fields can only be
reached through a lock guard and every early return unlocks via RAII at
the exact same points as before.

- Lock-held helpers (start_request_stream, on_body_received,
  check_server_identity, on_reject, to_body_value, to_response,
  on_resolve) now take (&mut Parts, &mut HttpHandoff) instead of
  &mut self, where Parts is a disjoint-field borrow bundle split off
  before taking the lock; bodies are unchanged respellings.
- on_progress_update's cleanup closure consumes the guard by value so
  each of its call sites unconditionally unlocks first, exactly where
  the manual unlock was.
- callback() and the start-streaming drain use explicit drop(guard) at
  each exit, at the same program points as the old unlocks.
- clear_data/release_at_shutdown use Guarded::get_mut (no new lock) for
  their existing exclusive-access teardown writes.
- jsc::Weak gains create_ptr (raw-ctx variant of create) for on_resolve,
  which can no longer form &mut FetchTasklet while field borrows live.

Lock count, critical-section contents, and allocation placement are
unchanged.
@robobun

robobun commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

@coderabbitai

coderabbitai Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Splits FetchTasklet into JS-thread JsState and guarded HttpHandoff, adds a non-Send GuardedLock marker, introduces explicit release_*_ref helpers, rewrites progress/body/abort/shutdown and HTTP callback flows for the new layout, and adds exit-during-in-flight tests.

Changes

FetchTasklet Architectural Refactoring

Layer / File(s) Summary
GuardedLock non-Send marker for thread-correct unlocking
src/threading/guarded.rs
Adds PhantomData<*const ()> field to GuardedLock and updates try_lock/lock constructors to include the marker so the guard remains non-Send/non-Sync.
Weak reference unsafe constructor refactor
src/jsc/Weak.rs
Adds pub unsafe fn create_ptr(...) with safety docs; Weak::create now forwards via NonNull and WeakImpl::init uses ctx.cast::<c_void>().
FetchTasklet state split: JsState and HttpHandoff struct definitions
src/runtime/webcore/fetch/FetchTasklet.rs
Introduces JsState (JS-thread-only) and HttpHandoff (HTTP↔JS shared behind Guarded), updates imports to include Guarded, and refactors FetchTasklet layout and get() initialization.
split(), stream buffer, and ref-pointer helpers
src/runtime/webcore/fetch/FetchTasklet.rs
Adds split() for disjoint borrows, moves request-body stream buffer access into JsState, and centralizes refcount pointer helpers.
Ref-release wrappers and shutdown behavior
src/runtime/webcore/fetch/FetchTasklet.rs, src/runtime/dispatch.rs, src/http/HTTPThread.rs
Adds release_js_ref, release_http_ref, release_sink_ref, release_drain_task_ref; release_at_shutdown clears shared buffers and releases refs; dispatch shutdown drains exit-time streaming refs then calls release_js_ref; HTTPThread dealloc path calls original_request_body.deinit() before drop.
Cleanup and mutation paths
src/runtime/webcore/fetch/FetchTasklet.rs
Refactors clear_sink()/clear_data() to operate on js.* and shared.get_mut() buffers/metadata and updates cleanup ordering for lifetimes and request-body detachment.
on_body_received and buffering completion
src/runtime/webcore/fetch/FetchTasklet.rs
Reworks on_body_received success/reject paths, scheduled-buffer draining, streaming vs buffered transitions, and final BodyValue::resolve wiring using t.js and the guarded handoff.
on_progress_update, certificate, and promise dispatch
src/runtime/webcore/fetch/FetchTasklet.rs
Rewrites on_progress_update to use split() + shared.lock(); updates certificate/checkServerIdentity race handling and promise resolve/reject via JsState and Holder.
Readable stream and response assembly
src/runtime/webcore/fetch/FetchTasklet.rs
Assigns readable-stream refs into JsState, drains shared.scheduled_response_buffer when streaming, and builds Body/Response from shared handoff while managing js wait flags and ignore_data.
Request streaming, abort listener, and write paths
src/runtime/webcore/fetch/FetchTasklet.rs
Stores abort reason in js, cancels sink/readable stream correctly, introduces release_drain_task_ref, and updates request write/abort flows to use js flags and release logic.
HTTP-thread callback and staging
src/runtime/webcore/fetch/FetchTasklet.rs
HTTP callback locks task_ref.shared, merges progress via HttpHandoff::merge_result, stages response bytes, respects ignore_data, clears buffers on shutdown with the lock held, and performs controlled ref releases.
HTTPThread shutdown request-body deinit
src/http/HTTPThread.rs
dealloc_in_flight_for_exit now explicitly calls deinit() on state.original_request_body before dropping the state.
Fetch exit-in-flight tests
test/js/web/fetch/fetch-exit-in-flight.test.ts
Adds child-process-based tests that spawn a VM with BUN_DESTRUCT_VM_ON_EXIT and ASAN/LSAN to validate shutdown safety for chunked responses, stalled responses, and streaming uploads that call process.exit() mid-flight.

Possibly related PRs

  • oven-sh/bun#31325: Related FetchTasklet changes touching on_progress_update and shutdown/progress handling.

Suggested reviewers

  • Jarred-Sumner
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main refactoring goal: encoding FetchTasklet's cross-thread ownership in the type system, which directly corresponds to the changeset's core objective.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description is comprehensive and addresses the template requirements with clear sections explaining what the PR does and how verification was performed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/runtime/webcore/fetch/FetchTasklet.rs (1)

856-899: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Balance the streaming-upload ref during shutdown.

release_at_shutdown() only accounts for the two baseline refs, but the extra sink ref taken on Line 917 is only released from write_end_request() on Lines 2349, 2361, and 2376. If process.exit() interrupts a fetch while the request-body stream is still open, write_end_request() never runs, the refcount never reaches zero, and clear_sink() / clear_data() are skipped for that tasklet.

Please make the shutdown path balance an outstanding sink ref too, or guarantee that the sink ref cannot still exist once release_at_shutdown() is reachable.

Also applies to: 901-922, 2343-2377


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e33071d9-ce72-415f-b06a-0ebc52c3a850

📥 Commits

Reviewing files that changed from the base of the PR and between 3f153b9 and b21b592.

📒 Files selected for processing (4)
  • src/jsc/Weak.rs
  • src/runtime/dispatch.rs
  • src/runtime/webcore/fetch/FetchTasklet.rs
  • src/threading/guarded.rs

@alii

alii commented Jun 3, 2026

Copy link
Copy Markdown
Member Author

@robobun adopt

@robobun

robobun commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

🤖 Adopted this PR.

CI status: @alii merged main (cbd7951); e6abfab adds claude[bot]'s two follow-ups (defensive streaming-refs take in the fourth exit path + GuardedLock comment accuracy) and triggers a fresh run. The prior builds' only failures were bunx.test.ts (code 1 on every lane — reproduced locally with the main release binary, upstream registry breakage unrelated to this diff) and a darwin-14 terminal.test.ts PTY flake. The fetch-exit-in-flight fixtures passed on all lanes in every build, and debian-13-x64-asan-test-bun (the lane that leak-checks them) has been green throughout.

The shutdown-leak fix: process.exit() during a streaming upload leaked the whole FetchTasklet ⇄ Box<AsyncHTTP> ⇄ sink ⇄ ThreadSafeStreamBuffer chain (~4KB/request, LSan-visible). The sink/drain refs release on the JS thread (parked at exit; queued drain nodes are ManagedTask-tagged and unreleasable by the shutdown drain), and the HTTP-side buffer ref was never released at exit. Now balanced via take_streaming_refs_for_exit — an atomic swap-take called from every exit path, so whichever reaches a tasklet first claims the refs and double-release is impossible by construction. Resolves CodeRabbit's original finding; documented in the PR body as the one deliberate behavior change.

Tests (test/js/web/fetch/fetch-exit-in-flight.test.ts): three process.exit()-with-in-flight-fetch fixtures (mid-download, pre-metadata, mid streaming upload) with BUN_DESTRUCT_VM_ON_EXIT=1 + LeakSanitizer on ASAN builds. The mid-upload test fails without the shutdown fix (exit 23, leaked chain in the report) and passes with it.

Earlier adoption work: merge conflict with #31746 resolved; refactor spot-checks; full fetch/body suite validation against a merge-base debug build; all bot review threads addressed and resolved.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review found no issues, but I'm deferring to a human reviewer given this restructures the cross-thread refcount and locking protocol for FetchTasklet — the kind of change where a second pair of eyes on the lock windows and ref-release sites is worth having.

Extended reasoning...

Overview

This PR refactors FetchTasklet (~700 lines of diff in that file alone) to encode its cross-thread ownership in the type system: the free mutex: Mutex becomes shared: Guarded<HttpHandoff>, fields are partitioned into JsState (JS-thread-only) and HttpHandoff (lock-protected), the refcount now starts at init_exact_refs(2) instead of init() + a later ref_() in queue(), and each release site is given a named wrapper (release_js_ref, release_http_ref, etc.). Supporting changes: ignore_data and a new is_http2 mirror become atomics, GuardedLock is made !Send/!Sync, and jsc::Weak gains an unsafe fn create_ptr for the split-borrow caller. dispatch.rs is a one-line rename to use the new release_js_ref.

Security risks

None identified. This is internal memory-ownership and concurrency plumbing for the fetch implementation; no auth, crypto, input parsing, or trust-boundary surfaces are touched. The new unsafe fn Weak::create_ptr widens the API surface slightly but carries a documented safety contract and the safe create now delegates to it.

Level of scrutiny

High. FetchTasklet is the cross-thread coordination point for every fetch() call — it juggles an intrusive thread-safe refcount, a JS↔HTTP handoff mutex, lifetime-erased self-referential borrows, and JSC GC-finalizer re-entry. The PR is described as behavior-preserving and the bulk is mechanical field re-homing, but several changes are semantically load-bearing rather than pure renames: moving the second baseline ref from queue() to init_exact_refs(2) (with the PR noting this relies on get() staying infallible after the Box::new), converting ignore_data to a relaxed atomic, mirroring is_http2 into a separate atomic for the lock-free skip_chunked_framing reader, and threading the lock guard by value through the on_progress_update cleanup closure. Each looks correct on inspection and the PR description argues each carefully, but these are exactly the spots where a subtle ordering or aliasing mistake would manifest as a rare UAF or deadlock under load.

Other factors

The PR is exceptionally well-documented (full ref table, lock-invariant doc, per-commit staging) and the existing fetch test suite passes with the same two pre-existing failures as main. The bug-hunting system found nothing. Still, given the size, the concentration of unsafe, raw pointers, and cross-thread refcounting, and that this is production-critical hot-path code, a human reviewer familiar with the FetchTasklet/HTTP-thread lifecycle should sign off — auto-approval is not appropriate here.

@alii

alii commented Jun 4, 2026

Copy link
Copy Markdown
Member Author

@robobun fix conflict

robobun and others added 4 commits June 4, 2026 00:53
…own)

Three child-process fixtures pin the HTTP-thread shutdown protocol that
dealloc_in_flight_for_exit / FetchTasklet::release_at_shutdown implement:
exit mid-download (metadata accepted, body pending), exit before any
response bytes (nothing queued), and exit mid streaming upload
(ResumableSink attached). A ref-protocol mistake on these paths surfaces
as a wrong exit code (crash/ASAN abort) or a hung exit; none of these
states had coverage before.
process.exit() during a streaming request-body upload leaked the whole
in-flight chain (FetchTasklet, Box<AsyncHTTP>, ResumableSink,
ThreadSafeStreamBuffer — LSan-visible with BUN_DESTRUCT_VM_ON_EXIT=1):
the sink ref and any queued drain-task ref are normally dropped on the
JS thread, which is parked at exit, and the queued drain nodes are
ManagedTask-tagged so release_queued_tasks_for_shutdown cannot release
them. Track both (sink_ref_held, queued_drain_tasks) and balance them
in release_at_shutdown; release the HTTP-side ThreadSafeStreamBuffer
ref in dealloc_in_flight_for_exit the way InternalState::reset does.

The fetch-exit-in-flight tests now run their fixtures with
BUN_DESTRUCT_VM_ON_EXIT=1 and LeakSanitizer enabled (inert on non-ASAN
builds), so an unbalanced ref on any of the three exit paths fails as
exit 23 with the leaked chain in the report.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/js/web/fetch/fetch-exit-in-flight.test.ts`:
- Around line 43-48: Remove the stale ASAN stderr filtering: delete the
filteredStderr construction (the .split/.filter/.join block that removes lines
starting with "WARNING: ASAN interferes") and update any uses of filteredStderr
to use stderr directly (so assertions like expect(filteredStderr).toBe("")
become expect(stderr).toBe("")). This ensures the test relies on bunEnv-provided
stable stderr and will surface real diagnostics.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9606bbf5-10b4-4b2c-a6e4-177431f5e148

📥 Commits

Reviewing files that changed from the base of the PR and between 92ec278 and eab71d3.

📒 Files selected for processing (3)
  • src/http/HTTPThread.rs
  • src/runtime/webcore/fetch/FetchTasklet.rs
  • test/js/web/fetch/fetch-exit-in-flight.test.ts

Comment thread test/js/web/fetch/fetch-exit-in-flight.test.ts Outdated
Bun subprocesses no longer emit the ASAN signal-handler startup warning,
so assert on raw stderr; any stderr output now fails the test loudly.
Comment thread src/http/HTTPThread.rs
A streaming upload's final callback can land between is_shutting_down
flipping true and SHUTDOWN_REQUESTED being observed (global_exit runs
worker termination in between); it then leaves in_flight, so
release_at_shutdown never runs for it and the sink/drain refs kept the
chain above zero. Mirror release_at_shutdown's sink_ref_held /
queued_drain_tasks balancing before the two baseline releases. The
JS thread is not parked yet in this window but never ticks the event
loop again, so the JS-side release sites cannot race the loads.
Comment thread src/runtime/dispatch.rs Outdated
A final callback that loses the has_schedule_callback CAS in the exit
window returns before the shutdown branch and leaves in_flight, so
neither release_at_shutdown nor callback's shutdown branch balanced the
sink/drain refs for it — only the queued progress node survives, into
release_queued_tasks_for_shutdown. Centralize the balancing in
take_streaming_refs_for_exit (atomic swap take of sink_ref_held and
queued_drain_tasks) and call it from all three exit paths; whichever
reaches the tasklet first claims the refs, the rest see zero, so paths
that can both run for one tasklet cannot double-release.
Comment thread test/js/web/fetch/fetch-exit-in-flight.test.ts Outdated
robobun and others added 3 commits June 4, 2026 03:22
The failing mode includes the child's LeakSanitizer symbolizing a large
leak report under the ASAN debug build, which exceeds the default
per-test timeout on loaded CI; name the constant and justify the
exception to the no-timeouts convention.
Comment thread src/runtime/dispatch.rs
Comment thread src/threading/guarded.rs Outdated
… too

The fourth is_shutting_down exit path (a final progress node dispatched
after the flag flips) released only the JS baseline ref; the entry had
already left in_flight and a dequeued node never reaches the shutdown
queue drain, so the sink/drain refs would strand there. The JS loop
does not tick in that window today, so this is defensive symmetry — the
take is an idempotent no-op when another path already claimed the refs.

Also correct the GuardedLock comment: std::sync::MutexGuard is Sync
where T: Sync; the raw-pointer PhantomData here is stricter (!Send and
!Sync), which nothing depends on either way.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All prior review feedback has been addressed and no new issues found, but given the ~1300-line restructuring of FetchTasklet's cross-thread locking/refcount discipline plus the new unsafe Weak::create_ptr and GuardedLock auto-trait change, this should get a human pass before merging.

Extended reasoning...

Overview

This PR restructures FetchTasklet (src/runtime/webcore/fetch/FetchTasklet.rs, +829/−498) to encode its cross-thread ownership in the type system: the free Mutex becomes Guarded<HttpHandoff>, JS-thread state is split into JsState, the refcount protocol is named (release_js_ref/release_http_ref/release_sink_ref/release_drain_task_ref, init_exact_refs(2)), and the HTTP→JS buffer handoff is factored into HttpHandoff::{merge_result, stage_response_bytes}. Supporting changes: GuardedLock gains a PhantomData<*const ()> to become !Send/!Sync, jsc::Weak gains an unsafe fn create_ptr for split-borrow callers, HTTPThread::dealloc_in_flight_for_exit now deinits the streaming request body, and dispatch.rs's shutdown FetchTasklet arm now drains streaming refs via the new take_streaming_refs_for_exit. One deliberate behavior change (a streaming-upload exit-time leak fix) plus three new LSan-backed exit fixtures.

Security risks

No direct attack surface change — this is internal lifetime/ownership plumbing for fetch(). The risk class is memory safety (UAF/double-free across the JS↔HTTP thread boundary) rather than injection/auth. The new unsafe fn Weak::create_ptr shifts a liveness proof from the type system to caller documentation; the !Send marker on GuardedLock is a soundness improvement. No user-controlled data handling changed.

Level of scrutiny

High. This is concurrency-critical, production-hot-path code (every fetch() flows through it) with hand-managed atomic refcounts, lock-window invariants that the PR claims are preserved "by construction", and four exit-time release paths whose double-release safety depends on an idempotent atomic swap. The PR description is excellent and the commits are staged for review, but verifying that every guard drop lands where the old manual unlock() did, and that the init_exact_refs(2) change is balanced on every path, requires a maintainer who knows the prior FetchTasklet locking discipline. This is well outside the "simple/mechanical" bar for auto-approval.

Other factors

All five of my earlier inline findings (the three exit-path ref-balance gaps, the MutexGuard Sync doc inaccuracy, and the test-timeout convention) were addressed in 0d2ddc2 / 2bb34e8 / e6abfab / a09b295 and the threads are resolved. The bug-hunting system found nothing on this revision. CI's only failure (bunx.test.ts) is an unrelated upstream registry issue per robobun, and the new fetch-exit-in-flight fixtures are green on every lane including ASAN. The remaining reason to defer is purely scope and subject matter, not any open concern.

alii added 2 commits June 4, 2026 15:35
The take protocol assumed the JS-side release sites cannot run once
is_shutting_down flips, but the JS thread is still doing global_exit
cleanup in that window: server socket teardown fires user abort
listeners (RequestContext::on_abort has no shutdown gate) which can
reach sink.cancel -> write_end_request -> release_sink_ref
synchronously, and on_abort's microtask drain can resume a parked sink
continuation to the same effect. If a final callback's exit-branch take
won that race, both threads released the same sink ref and the JS-side
plain deref freed the tasklet inline while the HTTP side still used it.

Make the releases claim-checked: release_sink_ref derefs only if it
swapped the marker from true, release_drain_task_ref only if the
counter was nonzero (also removes the u32 underflow if a queued drain
ever raced a take). Safe in the stolen case because the taker's
deref_from_thread parks the box during shutdown instead of freeing it,
so the marker RMW is always on live memory. Correct the SAFETY comments
that stated the race could not happen.
WTF::ParkingLot lazily allocates a never-freed per-thread ThreadData
the first time a thread parks, which LeakSanitizer reports when
process.exit() interrupts an HTTP thread mid-park. This only shows on
native macOS ASAN builds (darwin CI cross-compiles without ASAN), where
it made all three fetch-exit-in-flight fixtures fail on a default
Apple Silicon debug build. Verified the suppression masks nothing: the
real FetchTasklet/ThreadSafeStreamBuffer/ResumableSink leak chain is
still reported on pre-fix code with it in place.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants