test(sql): bound spawned fixture lifetime so a test timeout cannot leave orphans #32207
Updated 8:28 PM PT - Jun 12th, 2026
❌ @alii, your commit 34cc345 has 3 failures in
🧪 To try this PR locally: `bunx bun-pr 32207`. That installs a local version of the PR into your `bun-32207` executable, which you can run as `bun-32207 --bun`.
Caution: Review failed. The pull request is closed.
Walkthrough
Eight MySQL/SQL regression and integration tests add explicit
Changes
SQL Test Fixture Spawn Timeouts
    stdout: "ignore",
    stderr: "ignore",
    stdin: "ignore",
    timeout: 60_000,
🔴 Unlike the other 8 spawns in this PR (short-lived bun fixture clients), mysqld_safe is the long-running MariaDB server daemon — it is supposed to stay alive for the duration of the test, and .unref() does not cancel the spawn timeout timer. With waitForSocket budgeted up to 30s for cold start, plus provisionTcpUser and a debug/ASAN fixture run, the 60s timer can fire mid-test and SIGTERM the database out from under the fixture, causing a spurious failure. The PR's rationale (fixture clients hanging on pool drain) doesn't apply to the database daemon itself; suggest dropping this hunk or using a much larger bound (e.g. 300_000+).
Extended reasoning...
What the bug is
This PR adds timeout: 60_000 to nine Bun.spawn calls. Eight of them spawn short-lived bun fixture subprocesses that connect to a database, run a query, and exit — exactly the case the PR description targets. But the ninth, at test/js/sql/sql-mysql-column-name-digits.test.ts:91, spawns mysqld_safe — the MariaDB server daemon itself. That process is intended to outlive the test body: it is .unref()'d, its stdio is ignored, and ensureServerStarted() early-returns if the socket already exists from a prior run. Putting a 60s hard kill on it means the database server can be terminated while the test is still using it.
Why .unref() doesn't help
I verified in src/runtime/api/bun/subprocess.zig that jsUnref() only calls process.disableKeepingEventLoopAlive() and unrefs stdin/stdout/stderr — it does not touch event_loop_timer. Meanwhile computeHasPendingActivity() returns true while !process.hasExited(), which keeps the JS wrapper Strong-referenced, so GC will not collect the Subprocess and cancel the timer via finalize() either. Therefore, 60 seconds after ensureServerStarted() runs, timeoutCallback() unconditionally calls tryKill(killSignal) on the still-running mysqld_safe process — regardless of .unref().
Step-by-step trigger
- Test runs in the non-Docker branch (sandboxed dev/CI-gate container with native MariaDB, `isDockerEnabled() === false`).
- `beforeAll` → `ensureServerStarted()` spawns `mysqld_safe` at T=0 with `timeout: 60_000` and calls `.unref()`.
- `waitForSocket(30_000)` polls every 250ms. On a cold container start, MariaDB initialization can take a substantial fraction of that 30s budget, say T≈25s when the socket appears.
- `provisionTcpUser()` opens a root socket connection and runs 6 sequential SQL statements, which takes a few more seconds.
- The test body calls `runFixture(url)`, which spawns a debug/ASAN bun subprocess (the file's other hunk), connects over TCP, and runs the query. Debug+ASAN bun startup alone is routinely 10–20s.
- At T=60s the timeout timer fires and SIGTERMs `mysqld_safe`. MariaDB shuts down.
- The fixture's connection drops mid-query; it never prints `CONNECTED`; the test throws `could not connect to mysql://bun_sql_test@127.0.0.1:3306/...`, a spurious failure unrelated to the column-name-digits regression under test.
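The timeline above can be checked with a little arithmetic. The sketch below is illustrative only: `Phase` and `firstPhaseCutOff` are hypothetical names, and the durations for provisioning and the query are assumed values standing in for "a few more seconds", not measurements from the PR.

```typescript
// Hypothetical model of the trigger timeline; none of these names exist in the PR.
interface Phase {
  name: string;
  seconds: number;
}

// Returns the name of the phase during which the spawn timeout fires,
// or null if every phase finishes before the deadline.
function firstPhaseCutOff(phases: Phase[], deadlineSeconds: number): string | null {
  let elapsed = 0;
  for (const phase of phases) {
    elapsed += phase.seconds;
    if (elapsed > deadlineSeconds) return phase.name;
  }
  return null;
}

const coldStart: Phase[] = [
  { name: "waitForSocket", seconds: 25 },    // "say T≈25s" in the review
  { name: "provisionTcpUser", seconds: 5 },  // assumed value for "a few more seconds"
  { name: "fixture startup", seconds: 20 },  // upper end of the 10–20s debug/ASAN range
  { name: "fixture query", seconds: 15 },    // assumed value
];

console.log(firstPhaseCutOff(coldStart, 60));  // 60s bound: timer fires mid-query
console.log(firstPhaseCutOff(coldStart, 300)); // 300s bound: no phase is cut off
```

With these assumed numbers the phases sum to 65s, so a 60s deadline lands inside the query while a 300s deadline leaves ample headroom.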
Why the PR rationale doesn't apply here
The PR description says: "the sql fixture subprocesses connect to a database and await the pool draining; under a regression that hangs the pool, the fixture never exits". mysqld_safe is not a bun SQL fixture, has no JS pool to drain, and is not subject to a #32145-style hang. The timeout also doesn't achieve the orphan-prevention goal for this spawn: because the process is .unref()'d and never awaited, when the test-runner process exits the timer dies with it and mysqld_safe is orphaned regardless. So this hunk adds a flake risk without providing the intended safety.
Fix
Either drop this one hunk (the mysqld_safe spawn), or use a bound comfortably larger than the worst-case test duration — e.g. timeout: 300_000 or more — so the daemon is reaped only when the whole test/js/sql/ run is clearly stuck, not while a single slow cold-start test is still in flight.
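One way to express the two options in the fix is to pick the timeout by the role of the spawned process. This is a minimal sketch, not code from the PR: `SpawnRole`, `timeoutOptionFor`, and the constant names are hypothetical, and the values are the 60_000 and 300_000 figures mentioned in the review.

```typescript
// Hypothetical helper; the names below are illustrative and not part of the PR.
type SpawnRole = "fixture" | "daemon";

const FIXTURE_TIMEOUT_MS = 60_000;  // short-lived bun fixture clients
const DAEMON_TIMEOUT_MS = 300_000;  // larger bound suggested for mysqld_safe

// Returns a fragment to spread into the spawn options.
// For the daemon, boundDaemon = false corresponds to dropping the hunk entirely.
function timeoutOptionFor(role: SpawnRole, boundDaemon = true): { timeout?: number } {
  if (role === "fixture") return { timeout: FIXTURE_TIMEOUT_MS };
  return boundDaemon ? { timeout: DAEMON_TIMEOUT_MS } : {};
}

// Example: options for the long-running database server.
const daemonOptions = {
  stdout: "ignore",
  stderr: "ignore",
  stdin: "ignore",
  ...timeoutOptionFor("daemon"),
};
console.log(daemonOptions.timeout);
```

Keeping the decision in one place makes it harder to apply a fixture-sized deadline to a process that is meant to outlive the test body.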
When a test that does `await using proc = Bun.spawn(...)` times out before the using-scope exits, the dispose never runs and the subprocess survives indefinitely. The sql fixture subprocesses connect to a (mock or real) database and await the pool draining; under a regression that hangs the pool (e.g. while iterating on #32145), the fixture never exits, the test times out, and the fixture is orphaned.

Locally observed 16 such orphans pinning the machine at load ~100 for hours after a `bun bd test test/js/sql/` run that exercised a buggy intermediate commit.

Adding `timeout` to the `Bun.spawn` options gives the child its own hard deadline independent of the test runner, so an abandoned fixture self-kills within a minute even when the parent test has already moved on. `sql.test.ts` already had `timeout` on all three of its spawns; this brings the other 8 files in line.
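As a rough illustration of the pattern described above, the sketch below adds a default deadline to a fixture's spawn options. `FixtureSpawnOptions` and `withFixtureTimeout` are hypothetical names, not helpers from the Bun test suite, and the interface covers only the option fields discussed here.

```typescript
// Hypothetical helper, not part of the PR: gives every fixture spawn a hard deadline.
type Stdio = "pipe" | "inherit" | "ignore";

interface FixtureSpawnOptions {
  cmd: string[];
  env?: Record<string, string | undefined>;
  stdout?: Stdio;
  stderr?: Stdio;
  stdin?: Stdio;
  timeout?: number; // milliseconds; the child is killed once this elapses
}

const DEFAULT_FIXTURE_TIMEOUT_MS = 60_000;

// Fills in the default deadline, but leaves an explicit caller-provided timeout untouched.
function withFixtureTimeout(
  options: FixtureSpawnOptions,
  timeoutMs: number = DEFAULT_FIXTURE_TIMEOUT_MS,
): FixtureSpawnOptions {
  return { ...options, timeout: options.timeout ?? timeoutMs };
}

// Intended use inside a test (assumes the Bun runtime; the fixture path is illustrative):
//   await using proc = Bun.spawn(withFixtureTimeout({
//     cmd: ["bun", "fixture.ts"],
//     stdout: "pipe",
//     stderr: "pipe",
//   }));
const options = withFixtureTimeout({ cmd: ["bun", "fixture.ts"], stdout: "pipe" });
console.log(options.timeout);
```

Because the deadline lives in the spawn options rather than in the using-scope, it still applies when the test runner abandons the test before dispose runs.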