Skip to content

sql: guard pool connection scans against unassigned slots during pool start#32201

Open
robobun wants to merge 2 commits into
mainfrom
farm/825180f5/sql-pool-hole-scans
Open

sql: guard pool connection scans against unassigned slots during pool start#32201
robobun wants to merge 2 commits into
mainfrom
farm/825180f5/sql-pool-hole-scans

Conversation

@robobun

@robobun robobun commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

First item of #32198. The second item there (the close({ timeout }) validation mismatch) is already addressed by the open #32100, which bounds the timeout at setTimeout's real millisecond limit and rewords the message, so it is deliberately not touched here.

Repro

No database server needed:

const sql = new Bun.SQL({
  adapter: "postgres",
  hostname: "127.0.0.1",
  port: 5432,
  username: "u",
  database: "d",
  max: 2,
  password: () => {
    sql.flush(); // TypeError: undefined is not an object (evaluating 'this.closed')
    return "";
  },
});
sql.connect().catch(() => {});

Cause

BaseSQLAdapter.connections is allocated as new Array(max) and filled one slot at a time in connect()'s pool-start loop. createPooledConnection can synchronously run user code: a function-valued password option is invoked synchronously by createPooledConnectionHandle. If that user code re-enters a pool method that scans this.connections, the scan hits slots that are still unassigned holes.

hasConnectionsAvailable() already guards for this (if (connection && ...)) and documents the scenario, but the equivalent loops in isConnected(), flush(), #close(), and the retry scan in connect() read connection.state unguarded and throw a raw TypeError: undefined is not an object (evaluating 'connection.state'). (In the observed errors JSC quotes a nearby expression, this.closed / this.readyConnections, due to line skew between the embedded and on-disk bundle source; the stack points at the connection.state reads.)

Fix

Optional-chain the state reads in those four loops (connection?.state), matching the guard hasConnectionsAvailable() already has, and document the hole invariant where the array is created. In the retry scan a hole counts as a connection still being created (all_closed = false), so the caller is queued and served when the pool finishes connecting. #close() does not appear reachable with holes today (pool start always queues a waiter first, so close defers), but it gets the same guard for consistency, which also covers the connections[i] = null slots it writes.

Verification

New test in test/js/sql/sql-close-pending-connection.test.ts re-enters sql.flush() and sql.connect() from a function-valued password during pool start, against a local TCP server that never answers (no database needed). On the unfixed build it fails with the TypeErrors above captured from both re-entry points; with the fix the whole file passes (bun bd test test/js/sql/sql-close-pending-connection.test.ts, 5 pass).

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4f21a48d-16e4-46e2-b808-40f6951d94da

📥 Commits

Reviewing files that changed from the base of the PR and between ac312f0 and e32d099.

📒 Files selected for processing (2)
  • src/js/internal/sql/shared.ts
  • test/js/sql/sql-close-pending-connection.test.ts

Walkthrough

This PR hardens BaseSQLAdapter's pool-initialization scanning against crashes when user code re-enters via synchronous callbacks (such as password) while the pool's connection array is still being allocated. The fix uses optional chaining in state checks to safely handle unassigned slots, validated by a new regression test.

Changes

Pool Startup Re-entrancy Safety

Layer / File(s) Summary
Pool startup re-entrancy defensive scanning
src/js/internal/sql/shared.ts, test/js/sql/sql-close-pending-connection.test.ts
Documents that createPooledConnection may re-enter adapter methods during pool startup when the connections array is only partially assigned. Hardens isConnected, flush, #close, and connect with optional chaining (connection?.state) to safely skip undefined slots. Validates with a regression test that re-enters flush() and connect() from a synchronous password() callback during pool start, verifying no crashes or type errors occur.

Possibly Related Issues

  • oven-sh/bun#32198: Directly fixes the unguarded re-entrancy failures by adding optional chaining guards in isConnected, flush, #close, and connect state scans.
  • oven-sh/bun#32037: Addresses the same pool startup re-entrancy hazard with undefined "hole" slots that the optional chaining hardening prevents.

Possibly Related PRs

  • oven-sh/bun#32145: This PR's hardening of BaseSQLAdapter pool startup logic builds directly on the pooled-connection refactoring introduced in that PR.
  • oven-sh/bun#32041: Both PRs address pool-start re-entrancy during connection slot allocation, with this PR preventing crashes when scanning partially-filled arrays while the other prevents callback-triggered corruption of pool bookkeeping.

Suggested Reviewers

  • alii
  • cirospaciari
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely describes the main change: guarding pool connection scans against unassigned slots during pool start, which is the primary fix throughout the changeset.
Description check ✅ Passed The description provides detailed context including repro steps, root cause analysis, the fix approach, and verification details, matching the template's two required sections.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@robobun

robobun commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator Author
Updated 10:00 PM PT - Jun 12th, 2026

@robobun, your commit e32d099 has 3 failures in Build #62195 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 32201

That installs a local version of the PR into your bun-32201 executable, so you can run:

bun-32201 --bun

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — minimal optional-chain guards matching the existing hasConnectionsAvailable() pattern, with a solid regression test.

Extended reasoning...

Overview

This PR adds optional-chain guards (connection?.state) to four loops in BaseSQLAdapter (isConnected(), flush(), #close(), and the retry scan in connect()) that iterate over this.connections, plus a documenting comment at the array's allocation site. It also adds a regression test that re-enters sql.flush() and sql.connect() from a function-valued password callback during pool start.

The root cause is well-explained: connections is allocated as new Array(max) (a holey array) and filled slot-by-slot, and createPooledConnection can synchronously invoke user code (the password() function) before the slot is assigned. hasConnectionsAvailable() already had this exact guard with an explanatory comment; this PR simply brings the four sibling loops to parity.

Security risks

None. The change adds defensive null checks; it does not alter authentication, credential handling, or any data path. The password() callback was already user-controlled before this change.

Level of scrutiny

Low. The diff is four .?. substitutions plus comments. Each guarded read falls through to the correct semantic when the slot is a hole: isConnected()/flush() skip it, #close() has nothing to close, and the connect() retry scan treats it as pending (all_closed = false) so the caller is queued — which the PR description explicitly reasons about. The fix mirrors an established pattern in the same file rather than introducing new control flow.

Other factors

  • No CODEOWNERS apply to the touched paths.
  • The new test is self-contained (uses a local never-answering TCP server, no real database) and asserts both that password() is invoked synchronously per slot and that the re-entrant calls no longer throw.
  • The PR description correctly notes that #close() is not reachable with holes today but is guarded for consistency (and incidentally for the connections[i] = null slots it writes itself).
  • The bug-hunting system found no issues.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — straightforward defensive null-guards matching the existing pattern in hasConnectionsAvailable(), with a regression test.

Extended reasoning...

Overview

This PR changes four connection.state reads to connection?.state in BaseSQLAdapter (isConnected(), flush(), #close(), and the retry scan in connect()), adds a comment documenting the hole invariant where this.connections = new Array(max) is allocated, and adds a regression test that re-enters pool methods from a function-valued password callback during pool start.

Security risks

None. The change is purely defensive optional-chaining on an internal array scan. No auth, crypto, input parsing, or external I/O is touched.

Level of scrutiny

Low. The fix is mechanical and follows an established pattern: hasConnectionsAvailable() already guards the same array with if (connection && ...) and documents the scenario. Each of the four new ?. reads has correct fall-through semantics for an undefined slot — isConnected()/flush() skip it, #close() hits no switch case and just nulls the slot, and the retry scan in connect() falls into the all_closed = false branch so the caller is queued (as the inline comment explains). When connection is defined, behavior is byte-identical to before.

Other factors

The new test exercises both sql.flush() and sql.connect() re-entry from the synchronous password() hook against a never-answering local TCP server (no real DB needed) and verifies no TypeErrors are thrown. The single CI failure (fetch-abort-slow-connect.test.ts on macOS aarch64) is unrelated to this SQL change. No CODEOWNERS apply to the touched paths, and there are no outstanding human review comments.

@robobun

robobun commented Jun 13, 2026

Copy link
Copy Markdown
Collaborator Author

CI is green on this diff. The one red lane is `darwin 14 aarch64 - test-bun`, which hit the 45 minute lane timeout at file 2053 of 2183, well past anything this PR touches. The test added here (`test/js/sql/sql-close-pending-connection.test.ts`) runs at file 3 and passes (5 pass, 0 fail). The same darwin aarch64 lane timed out at the tail of the suite on the prior run too, and the darwin test timeout was just bumped 40 to 45 minutes in #32194 (the commit this branch is based on), so this is darwin agent capacity rather than the change. All 285 other jobs passed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant