sql: fix sql.close() hanging after closing a reserved connection#32101
sql: fix sql.close() hanging after closing a reserved connection#32101robobun wants to merge 8 commits into
Conversation
|
Updated 5:17 PM PT - Jun 12th, 2026
❌ @robobun, your commit b61892b has 3 failures in
🧪 To try this PR locally: bunx bun-pr 32101That installs a local version of the PR into your bun-32101 --bun |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughCentralizes idempotent reserved-connection shutdown, ensures pool.release() on pooled-connection close paths, guards per-connection queryCount from underflow, and adds Postgres/MySQL regression tests exercising reserved.close() races and timeouts. ChangesReserved Connection Closure and Pool Slot Release
Possibly Related Issues
Suggested Reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/js/sql/sql-reserved-close.test.ts`:
- Line 255: Tests currently create connections with "const sql = new SQL(...)"
which can leak if an assertion fails; change these allocations (the SQL
constructor usage that binds to the local variable "sql") to register
deterministic cleanup before assertions by either using "await using sql = new
SQL(...)" or by wrapping the test body in try/finally and closing "sql" in
finally; update every occurrence that declares "sql" in these tests (the
variable named sql and instances of new SQL(...)) so the resource is disposed
even on failures.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 58b42e8b-5b06-4a4b-839f-448fbf30ad0a
📒 Files selected for processing (4)
src/js/bun/sql.tssrc/js/internal/sql/mysql.tssrc/js/internal/sql/postgres.tstest/js/sql/sql-reserved-close.test.ts
|
CI status for the current head (b61892b, build 62140, complete): 284 jobs passed including every sql lane and the sql-reserved-close suite (10 tests). Two red jobs, both macOS infra/flake with no overlap with this diff: macOS 14 x64 timed out in test/js/bun/terminal/terminal.test.ts (Bun.spawn PTY tests, 90s timeouts with retries), and macOS 14 aarch64's tart VM hook received a cancellation signal mid-run (no test failures in its log). Every earlier run failed on a different, equally unrelated lane. Diff is ready for review. |
Jarred-Sumner
left a comment
There was a problem hiding this comment.
This is a memory leak. We cannot reference functions from the parent scope.
|
Reworked in 82424e8. The close path no longer creates any per-reservation closures: the connection close handler is onReservedConnectionClosed.bind(state, pooledConnection) (same shape as the previous onTransactionDisconnected.bind(state)), and reserved.close() now calls module-level named functions, with setTimeout(fn, ms, ...args) and a bound callback for the drain race, so nothing captures the reservation's lexical scope. Tests still pass (7/7 mock suite plus real postgres smoke for the repro, release, both timeout arms, and server-kill cases). |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/js/bun/sql.ts`:
- Around line 429-439: The current chain uses
Promise.all(Promise.all(pending_queries), Promise.all(pending_transactions))
which fails fast and can trigger reservedCloseDrained prematurely; change the
wait to a non-fail-fast settle (e.g. use Promise.allSettled on the combined
arrays) so reservedCloseDrained(timer, state, pooledConnection, resolve) only
runs after every tracked promise from state.queries and reservedTransaction has
settled; update the Promise.all(...) call to
Promise.allSettled([...pending_queries, ...pending_transactions]) and keep the
drained handler and timer logic (reservedCloseDrained,
reservedCloseTimeoutFired, timer) unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3d3c11e7-7ef3-492a-9e9d-b7de5d4353c0
📒 Files selected for processing (1)
src/js/bun/sql.ts
82424e8 to
742e15f
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/js/sql/sql-reserved-close.test.ts`:
- Around line 125-126: The await only listens for "listening" on server
(server.listen(0, "127.0.0.1"); await once(server, "listening");) so startup
errors or premature close hang tests; replace with a promise that resolves on
"listening" and rejects on "error" and "close" (and remove listeners after
settle) so listen failures surface immediately; apply the same change to the
second occurrence around lines 264-265.
- Around line 429-462: Add a MySQL test that exercises the
dead-connection/timeout path (the same variant covered for Postgres) so the
MySQL-specific Adapter.release() guard in src/js/internal/sql/mysql.ts is
exercised; specifically, in test/js/sql/sql-reserved-close.test.ts add a test
that reserves a connection via sql.reserve(), runs a query, then force-closes or
kills the underlying reserved connection (simulate a dead/terminated connection)
before calling reserved.release() or reserved.close(), then verify the pool can
still run queries and that sql.close() resolves; use the existing helpers
startMysqlServer(), SQL, sql.reserve(), reserved.close()/reserved.release() and
mirror the Postgres dead-connection test flow to cover the branch.
- Around line 396-400: The test currently asserts expect(unhandled).toEqual([])
in the same macrotask as the cancellation path; insert a macrotask barrier
(await Bun.sleep(0)) before the assertion so any enqueued "unhandledRejection"
can run while the listener (onUnhandled) is still registered; specifically, in
the test around the reserved.close()/cancellation code ensure you await
Bun.sleep(0) after the cancellation/pending resolution and before calling
expect(unhandled).toEqual([]), and only then call
process.off("unhandledRejection", onUnhandled).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: fb46eaa0-8350-47aa-8c54-8fdc621ca347
📒 Files selected for processing (4)
src/js/bun/sql.tssrc/js/internal/sql/mysql.tssrc/js/internal/sql/postgres.tstest/js/sql/sql-reserved-close.test.ts
…oses
reserved.close() closed the underlying connection without going through
the pool's release path, so the pool's totalQueries counter never
dropped and a later graceful sql.close() waited forever. The same slot
leaked when the server dropped a reserved connection.
Release the reservation from the reserve wrapper's connection close
handler, so both explicit close() and unexpected disconnects return the
slot. Guard the queryCount decrement in release() against counts
already zeroed by #finishClose, since a negative count leaves the
reserved flag stuck after the slot reconnects.
Also make reserved.close({ timeout }) actually close the connection
when pending queries finish before the timeout (that branch previously
resolved without closing or releasing anything), and stop its internal
query-watching promise chain from surfacing cancelled queries as
unhandled rejections.
Fixes #32099
… closures The reserved connection's close handler is stored on the long-lived pooled connection, so a closure there would retain the whole per-reservation lexical environment. Use module-level and constructor-level named functions with explicitly bound state, matching how the other handlers in this file are wired.
…ut }) Promise.all fails fast, so one rejected query closed the connection and cancelled the rest of the in-flight work before the grace period was over. Use Promise.allSettled so the drained path only runs once all tracked queries and transactions have settled, matching the pool-level close behavior.
… unhandled-rejection window
50a9f7f to
6a809f4
Compare
…n state Throwing ERR_INVALID_ARG_VALUE after clearing acceptQueries left the reservation stranded: a retry close() no-oped on the state guard, release() rejected on the same guard, and the pool slot was never returned, so a graceful sql.close() hung.
There was a problem hiding this comment.
Thanks for moving the timeout validation ahead of the state mutation in b61892b (and the matching test) — that closes out my last flag. I have nothing further, but the release() queryCount guard in shared.ts and the new close-path pool.release() touch pool accounting for every adapter, so I'd like a human to give the final sign-off.
Extended reasoning...
Overview
This PR fixes #32099: sql.close() hung after reserved.close() because the reservation's pool slot (queryCount/totalQueries) was never returned. The fix spans:
src/js/bun/sql.ts(~70 lines changed): addsonReservedConnectionClosed(callspool.release()when the underlying connection closes), refactorsreserved_sql.close()so all three arms (no pending work, timeout fired, drained before timeout) converge on an idempotentcloseReservedConnection(), switches the drain wait toPromise.allSettled, and moves timeout validation ahead of theacceptQueriesmutation.src/js/internal/sql/shared.ts(~9 lines): guardsconnection.queryCount--in the consolidatedBaseSQLAdapter.release()so a late release after#finishClosezeroed the counter cannot drive it negative and corrupt the reserved-flag handling.test/js/sql/sql-reserved-close.test.ts(~500 lines new): nine tests against in-process mock postgres/mysql servers covering the headline hang, server-dropped reserved connections, bothclose({ timeout })arms, fail-fast vs allSettled drain, invalid-timeout, and pool reuse after close.
Security risks
None. No auth, crypto, permissions, or untrusted-input parsing is touched. The change is internal pool bookkeeping; the mock servers in tests are loopback-only and disposed via await using.
Level of scrutiny
Medium-high. This is production-critical connection-pool accounting shared by the postgres and mysql adapters. The queryCount > 0 guard in release() runs on every connection return, and the new pool.release() on the reserved close path interacts with the existing #finishClose → release(conn, true) flow. The reasoning in the PR description and inline comments is sound and the test matrix is thorough (including the new invalid-timeout regression test for the validate-before-mutate fix in b61892b), but a regression here would manifest as pool slot leaks or double-releases under load — worth a maintainer's eyes.
Other factors
This PR has been through several review rounds (CodeRabbit and three of my own inline flags), all of which the author addressed promptly: await using cleanup, Promise.allSettled for the drain wait, dropping the dead rejection handler, the mysql server-drop test, the Bun.sleep(0) macrotask barrier, and most recently the validate-before-mutate ordering. The sibling transaction_sql.close({ timeout }) issue I flagged was deliberately deferred to #32148. No bugs were found in the current revision. CI build #62140 is running on b61892b. CodeRabbit suggested cirospaciari/alii as reviewers, which matches the area ownership.
Fixes #32099
Repro
Reproduced on 1.4.0 and main against postgres and mysql.
reserved.release()instead ofreserved.close()works.Cause
A reservation checks out one pool slot (
connection.queryCount++/totalQueries++at reserve grant).reserved.release()returns it viapool.release(), butreserved.close()calledpooledConnection.close()directly. The socket close path (#finishClose) callsrelease(connection, /*connectingEvent*/ true), which skips the counter decrement, sototalQueriesstayed above zero and a gracefulsql.close()parked ononAllQueriesFinishedforever. The same slot leaked when the server dropped a reserved connection, sincerelease()rejects once the wrapper is marked closed.Two more problems in the same path:
reserved.close({ timeout })with pending queries that finished before the timeout resolved without closing or releasing anything, stranding the connection (still physically open, still counted as checked out).Promise.all(pending_queries)chain was dropped, so cancelled in-flight queries surfaced as unhandled rejections (process exits nonzero even when the user catches their own query promise).Fix
All in the shared JS pool code, covering the postgres and mysql adapters (sqlite rejects
reserve()):src/js/bun/sql.ts: the reserve wrapper's connection close handler now returns the reservation viapool.release(), so explicitclose()and unexpected disconnects both release the slot; the handler is detached onrelease()as before, so the slot is returned exactly once.reserved.close()'s three arms (no pending work, timeout fired, queries drained before timeout) now converge on one idempotentcloseReservedConnection(), and the grace period waits onPromise.allSettled, so one failing query neither cuts the wait short for the rest nor surfaces as an unhandled rejection. All handlers are module-level or constructor-level named functions with bound state; the close path creates no per-reservation closures (the stored close handler previously retained the whole reservation's lexical scope).src/js/internal/sql/shared.ts: the consolidatedrelease()(shared by the postgres and mysql adapters since sql: consolidate JS pool/connection plumbing into BasePooledConnection in shared.ts #32145) only decrementsconnection.queryCountwhen it is still positive.#finishClosezeroes it when a connection dies, and a late release driving it negative left thereservedflag permanently stuck after the slot reconnected (the connection could never be reserved again).totalQueriesstill always drops, since every release pairs with exactly one checkout.Tests
test/js/sql/sql-reserved-close.test.ts, using minimal in-process mock postgres/mysql servers (no docker), all failing by hang/timeout on the unfixed build and passing with the fix:sql.close()resolves afterreserved.close()sql.close()resolves after the server drops a reserved connectionreserved.close({ timeout })closes the connection once pending queries finishreserved.close({ timeout })keeps waiting for remaining queries when one failsreserved.close()(slot reconnects, can be reserved/released/reserved again)Also verified against real servers (postgres 5432, mariadb 3306): the issue's repro, the
release()control case, transactions (commit/rollback, on reserved connections),sql.close()called beforereserved.close(),pg_terminate_backendon a reserved connection, and 20 queued pool queries racing a reservation close.Rebased onto main after #32145 consolidated the adapter pool code into
shared.ts: the two identical per-adapterrelease()guards from earlier revisions of this PR became a single guard in the sharedrelease(), and the adapter files are no longer touched. Re-verified on the new base: the full suite fails without the fix (9/9 by hang) and passes with it, plus the neighboring non-docker sql suites stay green.