sql: don't let a throwing onconnect/onclose callback corrupt the connection pool#32041
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughMySQL and PostgreSQL pooled connection lifecycle handlers now run user onconnect/onclose inside try/finally (Postgres adds deferred close via process.nextTick) so internal bookkeeping, query notifications, and adapter release always execute; a new test suite validates thrown-hook behaviors across connect, refuse, and setup-failure scenarios. ChangesConnection Lifecycle Callback Exception Safety
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
Updated 5:02 PM PT - Jun 10th, 2026
❌ @robobun, your commit 7139c12 has some failures in 🧪 To try this PR locally: bunx bun-pr 32041That installs a local version of the PR into your bun-32041 --bun |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/js/sql/sql-onconnect-onclose-throw.test.ts`:
- Around line 167-246: Each test is missing an explicit assertion that the
subprocess stderr is empty; update each test that calls runFixture (including
tests using connectAndEnd("postgres"/"mysql"),
connectAndEnd("postgres","onclose"), failToConnect("postgres"/"mysql"), and the
inline fixture) to assert that the returned stderr is "" (e.g.
expect(stderr).toBe("")) and place that assertion before checking exitCode to
follow the guideline that output assertions come before exitCode checks; locate
uses of runFixture and add the stderr assertion accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 51db9e2b-9c5e-4238-99b9-a2a43cf5f3f5
📒 Files selected for processing (3)
src/js/internal/sql/mysql.tssrc/js/internal/sql/postgres.tstest/js/sql/sql-onconnect-onclose-throw.test.ts
|
The remaining CI red is unrelated to this diff. Every failed lane in builds 61587 and 61603 fails on exactly one test, $ bun x --bun @angular/cli@latest --help
Node.js version v24.3.0 detected.
The Angular CLI requires a minimum Node.js version of v22.22.3 or v24.15.0 or v26.0.0.
$ echo $?
3The failure set is identical across both builds (different commits of this branch) and across all platforms, and main's last green build (61548, same base commit) predates the Angular release, so this will fail on any branch until the bunx test or the version handling is updated. The failing shards do not include this PR's test file; the new sql tests pass under |
|
Update after the main merge (thanks for the bunx skip): build 61612 is green on all 189 completed linux/windows/musl jobs. The one red job is darwin 26 aarch64, which timed out for reasons unrelated to this diff:
The sql tests from this PR ran in that same shard and passed 7 for 7. No further pushes from me to avoid spamming CI; happy for a maintainer to re-run the darwin lane or merge over it. |
…ection pool The user-provided onconnect/onclose callbacks ran before the pooled connection handlers updated their own bookkeeping. If the callback threw, the handler aborted early: state stayed pending, storedError was never recorded, pending queries were never notified, onFinish never ran and the connection was never released, so queries, connect() and end() hung forever. Run the callback in a try/finally so the bookkeeping always completes; the exception still propagates and is reported as an uncaughtException, as before. Also defer the synchronous onClose call in postgres createConnection's catch with process.nextTick, matching mysql. It could fire while the adapter was still filling this.connections, so pool methods that scan that array (flush, isConnected, close) threw a TypeError on the holes when called from inside the onclose callback. Fixes #32037
c845b16 to
1b86bd4
Compare
|
actually no mock server @robobun |
…s/mysql Replace the mock protocol servers with the docker-compose postgres_plain and mysql_plain services via describeWithContainer, matching the other sql tests. The connection-refused tests keep using a real closed port and the synchronous-failure test never dials, so those still run without docker.
|
Done in 7139c12: the established-connection tests (throwing onconnect, throwing onclose on sql.end()) now run against the docker-compose postgres_plain and mysql_plain services via describeWithContainer, same as the other sql tests, and assert the query result comes back despite the throwing hook. Verified the fixtures against real postgres and mariadb locally: they hang without the fix and complete with it. The connection-refused tests keep using a real closed port (as in sql-connect-error-reporting.test.ts) and the synchronous password-failure test never dials, so no mock servers remain. |
|
CI note on build 61829: the docker-backed sql tests from this PR ran against the real postgres_plain and mysql_plain compose services and passed 7/7 (debian x64 shard). The only failed job so far is debian x64-baseline on test/js/web/streams/streams-leak.test.ts ("native ReadableStream reuses the pull buffer across small reads", expected >20 chunks, got 9), a wire-coalescing timing assertion unrelated to this diff. Not pushing a retrigger to avoid churning CI. |
The re-integration commit hosts the #32041 try/finally and the #32097 forced-close settlement in one BasePooledConnection close handler, so pin their interaction per driver: onclose throws while the pool is force-closed mid-handshake, and close() must still resolve and the pending query must still reject.
…n in shared.ts (#32145) ## What this does Final slice from #31994 (closed as too big), on top of #32128, #32135, #32141. Consolidates the JS adapters' duplicated pool/connection/query plumbing from `src/js/internal/sql/{postgres,mysql,sqlite}.ts` into `src/js/internal/sql/shared.ts` as a `BasePooledConnection` class hierarchy. Net −995 lines across 4 files. Two commits: - **Cherry-pick from #31994** (1aa3dbb) — checks out `src/js/internal/sql/` from `origin/claude/split/sql` as-is. This intentionally clobbers two pool-lifecycle fixes that landed on `main` since #31994's merge-base: #32041 (throwing onconnect/onclose corrupting the pool) and #32097 (forced `close()` resolving mid-handshake). - **Re-integrate #32041 + #32097 into the new structure** (6e6fd4f) — re-hosts both fixes in `BasePooledConnection`: - #32041: try/finally around the user `onconnect`/`onclose` calls in `handleConnected` and `#finishClose` so a throwing callback never skips pool bookkeeping; the `createPooledConnectionHandle` catch always defers via `process.nextTick` (drops the per-driver `deferSyncCloseError` flag — both drivers now match). - #32097: `startConnection()` is `abstract Promise<void>` and both subclasses await + assign `this.connection` at creation; `#beginConnecting` awaits it and closes the handle if `onFinish` was set (pool force-closed) in the microtask window before it materialized. ## Behavioral equivalence The original `BasePooledConnection` extraction in #31994 carried a per-file behavioral-equivalence audit against its merge-base (placeholders, escaping, helper commands, error message text, pool/transaction semantics all preserved) and was green on full CI build 61383. This PR re-applies that diff and adds the two missing fixes; the load-bearing verification is below. ## Verification - `bun bd`: builds clean. - The three lifecycle test files (`sql-onconnect-onclose-throw.test.ts`, `sql-close-pending-connection.test.ts`, `sql-connect-error-reporting.test.ts`): 26 pass / 0 fail. - **New tests in this PR**: `sql-onconnect-onclose-throw.test.ts` gains a forced-close variant per driver (a throwing `onclose` while the pool is force-closed mid-handshake, against a fake `net` server). The two re-hosted fixes meet in `BasePooledConnection`'s close handler, and that interaction had no coverage. Note these pass against `origin/main`'s `src/js/internal/sql` too (verified): both underlying fixes already live on main per-driver, so versus main this PR is equivalence-preserving by design and no test can fail on one side only. The HEAD~1 stash test below is the proof that the re-integration commit is load-bearing. - **Load-bearing stash test**: built and ran the same three files at HEAD~1 (the cherry-pick before re-integration) — 6 fail (`mysql: forced close() resolves when called before the native handle is stored` timeout; `postgres: pool calls from onclose are safe when connecting fails synchronously` → `reentry threw: TypeError`; both drivers' `throwing onclose still rejects pending queries on connect refused` timeouts). With the re-integration: 26 pass. The diff is required for both fixes. - Full `test/js/sql/` against live `postgres_plain`/`mysql_plain`/`*_tls` containers: 1526 pass / 7 fail / 6 errors locally. Every failure reproduces on `origin/main` in the same environment — see Notes below. ## Notes on local-only test failures (none introduced by this PR) - `sql-mysql-query-string-leak.test.ts` fails locally on a debug+ASAN build for both `origin/main` (314.6 MiB) and this branch (317.2 MiB), within noise of each other; both exceed the 256 MiB ASAN threshold. A `heapStats()` probe of the same workload shows `MySQLQuery 2→1` after GC — wrappers are finalized correctly. The test passes on `main`'s CI (release build); the threshold is tuned for release RSS overhead, not local debug. - `sql.test.ts > query string memory leak test` (postgres): same category. - `sqlite-sql.test.ts > Query Normalization Fuzzing Tests > handles exotic but valid SQL patterns`: 5.5s timeout on debug builds — pre-existing per #31994's original verification notes. - `describeWithContainer` cold-start races: `docker compose up -d --wait` returns non-zero on an already-healthy container (`test/docker/index.ts:157`), causing beforeEach hook timeouts in `sql-mysql.test.ts` (TLS), `sql-mysql-bind-oob`, `sql-mysql.helpers`, `sql-mysql.auth`. Pre-existing harness flake; clears on warm rerun. The Rust-side changes from #32097 (`src/sql_jsc/`) were already on `main` before this PR — only the JS hooks needed re-integrating.
Fixes #32037.
Problem
In the postgres and mysql pooled-connection handlers (
#onConnected/#onCloseinsrc/js/internal/sql/postgres.tsandmysql.ts), the user-providedonconnect/onclosecallbacks ran before the pool updated its own bookkeeping. A throwing callback aborted the handler mid-way:statestayedpending,storedErrorwas never recorded, pending queries were never notified,onFinishnever ran, andrelease()never ran. Anything awaiting the pool hung forever:Same for
onclose: pending queries were never failed on a connect failure, andsql.end()never resolved.Fix
Run the callback in a
try/finallyso the bookkeeping always completes, at all four sites (postgres/mysql x onconnect/onclose). The exception still propagates out of the handler afterwards and is reported as an uncaughtException through the same channel as before; only the skipped bookkeeping changes. (sqlite.ts already guards its hooks.)Secondary fix in the same re-entrancy family, also called out in the issue: postgres's
createConnectioncatch invokedonClosesynchronously, which could run the user'sonclosewhile the adapter was still fillingthis.connectionsduring pool startup (reachable via apasswordfunction that throws). Pool methods that scan that array without the hole guardhasConnectionsAvailable()has (flush(),isConnected(), the privateclose()) threwTypeError: undefined is not an objectwhen called from inside the callback. The catch now defers viaprocess.nextTick, exactly like the identical catch in mysql.ts already did, so the callback never observes a half-filled pool.Verification
New
test/js/sql/sql-onconnect-onclose-throw.test.ts. The established-connection scenarios (throwingonconnect, throwingoncloseonsql.end()) run against the real docker-composepostgres_plainandmysql_plainservices viadescribeWithContainer, like the other sql tests. The connection-refused scenarios use a real closed port and the synchronous-failure scenario never dials, so those run without docker. On the unfixed build every scenario fails (the docker ones verified manually against real postgres and mariadb: the fixture hangs without the fix and completes with it):before the fix
With the fix all 7 pass, and the issue's original repro scripts against real postgres and mariadb servers now settle (
query result: [{"x":1}],sql.end()resolves) while the callback error still surfaces as an uncaughtException.sql-connect-error-reporting.test.ts,sql-mysql-clean-reentry.test.tsandsql-mysql-cached-error.test.tsstill pass.Note: this predates #31994 (which dedupes these two files into a shared base class); if that lands first, the same change applies once to its
shared.ts.Rebase note (after #32028 landed)
#32028 restructured
#onClose: connect failures now retry with backoff and theonclosecallback moved into a new#finishClose, which the retry timer also calls. Resolved by putting thetry/finallyaround the hook in#finishClose(covering both callers) and keeping#onClose's retry branch untouched (no user code runs there).#onConnectedkeeps main'sconnectStartedAt = 0reset inside thefinally. #32028 also split refused connections intoERR_*_CONNECTION_REFUSED(fail fast, not retried), so the two refused-port tests assert that code now.