Skip to content

Session params#12890

Open
chase06harmon wants to merge 17 commits into
neondatabase:mainfrom
chase06harmon:session-params
Open

Session params#12890
chase06harmon wants to merge 17 commits into
neondatabase:mainfrom
chase06harmon:session-params

Conversation

@chase06harmon

Copy link
Copy Markdown

Problem

Summary of changes

chase06harmon and others added 14 commits April 12, 2026 23:38
`TerminateFilter` returns a 0-byte Ready when it intercepts the Postgres
Terminate message, which `CopyBuffer` reads as EOF. The state machine in
`transfer_one_direction` then transitions Running → ShuttingDown and calls
`poll_shutdown` on the writer. In the client→compute direction that
half-closes the compute socket, Postgres reacts to the FIN by tearing
down the backend, and the next pooled reuse hits ENOTCONN.

Add `transfer_one_direction_no_shutdown` (Running → Done directly, no
poll_shutdown) and use it for the client→compute direction inside
`copy_bidirectional_client_compute_pooled`. Compute→client keeps the
shutdown variant — if the server side really closes, we still want to
shut down the client.

Verified with 5x sequential `pg_backend_pid()` returning the same pid
and `pg_stat_activity` showing a single pooled backend.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `--compute-endpoint-map` testing-only flag for the postgres mock
control plane: `ep-A=postgresql://...,ep-B=postgresql://...`. Thread the
endpoint id from `ComputeUserInfo` into `do_wake_compute` and consult
the map before falling through to `--compute-endpoint` (single-tenant
override) or `--auth-endpoint` (legacy fallback). The existing
single-tenant test path is unchanged.

Verified end-to-end against two local docker postgres containers
(compute-A on 5433, compute-B on 5434, each with a marker table):
- ep-A queries always land on compute-A and reuse one pooled connection
- ep-B queries always land on compute-B and reuse one pooled connection
- pids differ across endpoints; the (endpoint, db, role) pool key keeps
  them isolated even under interleaved A/B traffic
- unknown endpoint ids fall through to auth-endpoint as before

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three configurations (direct compute, proxy with pool, proxy without
pool) × three workloads (tpcb-steady, readonly-steady, readonly-short
via `pgbench -S -C`) × six concurrencies × two reps. Captures TPS
mean/std and p50/p95/p99 latency, plus auth-Postgres pg_stat_activity
samples concurrent with each `-C` run and proxy RSS at peak load.

Headline (full write-up in benchmarks/FINDINGS.md):

  - steady-state proxy overhead is small (5-25%) and proxy_pool ≈
    proxy_nopool, exactly as expected for persistent sessions
  - on the short-session `-C` workload the embedded pool delivers a
    1.3×-2.4× throughput improvement over the same proxy with pooling
    disabled
  - the remaining gap to direct on short sessions is auth-Postgres
    lookup cost, not a pool deficiency: every fresh client connection
    queries pg_authid for the SCRAM verifier. Visible as 19 transient
    EADDRNOTAVAIL errors in the proxy log at C=100 -C and motivates
    auth caching / transaction pooling as the next step.
  - at peak C=100 load, proxy_pool RSS ≈ 81 MB, proxy_nopool ≈ 78 MB

Reproduce: `cd benchmarks && ./run_bench.sh && ./summarize.py`.
README documents prereqs and override knobs (DURATION, REPS,
CONCURRENCIES_OVERRIDE, CONFIGS_OVERRIDE, APPEND_RESULTS).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `--tcp-pool-mode=session|transaction` (default session). In
transaction mode the proxy releases the compute connection back to
the pool at every ReadyForQuery `'I'` boundary and re-acquires for
the next transaction, allowing one compute conn to serve many client
transactions over its lifetime — the PgBouncer transaction-mode
semantic.

Implementation:

  - `ReadyForQueryWatcher<R>` (proxy/src/pglb/copy_bidirectional.rs)
    parses Postgres backend protocol message headers as bytes flow
    compute → client and tracks the last-seen `'Z'` transaction status
    byte. Mirror of `TerminateFilter` on the read side; transparent
    pass-through, no message dropping.

  - `copy_bidirectional_until_boundary` pumps both directions through
    `TerminateFilter` (client → compute) and `ReadyForQueryWatcher`
    (compute → client) and returns when either reaches a boundary.
    Never calls poll_shutdown on compute.

  - `TcpPoolManager::try_acquire_idle` lets the multiplex loop
    re-acquire from the pool without a connect closure; if the pool
    is empty we cleanly close the client.

  - `proxy_pass_transaction_mode` in pglb/passthrough.rs is the outer
    loop. Each iteration: pump until boundary, then either release+
    re-acquire ('Z' I), keep held ('Z' T/E), or release-and-exit
    (Terminate while idle). Mid-transaction termination discards the
    compute conn (open BEGIN block can't be safely shared).

Verified:

  - 5 sequential psql sessions through the proxy in transaction mode
    return the same `pg_backend_pid` and the proxy log shows only
    one new compute connection opened.
  - 10 concurrent psql sessions × 3 queries each are served by 5
    distinct backend pids, with multiple sessions sharing the same
    pid — empirical multiplex.
  - Steady-state and short-session workloads continue to function
    (no regression in correctness; throughput discussed in the
    benchmarks commit).

Standard PgBouncer transaction-mode caveats apply: `SET` (without
`LOCAL`), `LISTEN`, simple `PREPARE`, temp tables, and advisory
locks do not survive across transactions, and `pg_cancel_backend(pid)`
does not work as expected because the pid the client received is
synthesized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Run the same DURATION=10 / REPS=2 sweep against a fourth configuration:
the proxy with `--tcp-pool-mode=transaction`. Update FINDINGS.md with
section 5 covering the transaction-mode results and what they mean.

Headline: transaction mode multiplexes correctly (10 concurrent psql
sessions × 3 queries served by 5 distinct backend pids), but on
pgbench it does not surface a TPS gain over session mode. Two reasons,
both inherent to pgbench:

  - `pgbench -C` (the short-session workload) opens one transaction
    per client session, so each session only goes through one
    transaction-boundary release/re-acquire cycle. The reuse benefit
    is identical to what session mode already gets.
  - pgbench without `-C` keeps every client persistently busy with no
    idle gap; transaction mode's release window becomes pure overhead.

The benefit of transaction mode shows up on workloads with idle think
time between transactions (typical web request handlers). pgbench
does not produce that profile; characterising it on a more realistic
workload is left as future work.

Side effect: a small per-boundary release/re-acquire overhead is
visible at low concurrency on persistent-session workloads (`tpcb_steady`
at C=1: 920 → 765 TPS, ~17 %). Converges to ~equal at C=100.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six markdown files in docs/review/ covering each logical change on
mel/pool-fixes-and-benchmarks. Designed for offline reading on a
3-hour flight with the codebase open alongside.

  00-overview.md          branch summary, reading order, commit map
  01-session-pool-fix.md  the poll_shutdown half-close bug + fix
  02-multi-endpoint.md    --compute-endpoint-map design and lookup
  03-benchmarks.md        methodology and what the numbers mean
  04-transaction-mode.md  ReadyForQueryWatcher + multiplex loop
  05-open-questions.md    weak points, things I'd challenge in review

Each per-commit doc has the same structure: TL;DR, the problem, the
fix, files changed, suggested reading order in the code, and a list
of things to push back on. The open-questions doc is the place to
read last — it changes how the rest of the docs land.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@chase06harmon chase06harmon requested a review from a team as a code owner May 2, 2026 03:02
@chase06harmon chase06harmon requested a review from awarus May 2, 2026 03:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant