Skip to content

Fix pool session accounting drift#2076

Merged
julianknutsen merged 5 commits into
mainfrom
fix/reconciler-accounting-pool-sessions
May 15, 2026
Merged

Fix pool session accounting drift#2076
julianknutsen merged 5 commits into
mainfrom
fix/reconciler-accounting-pool-sessions

Conversation

@julianknutsen
Copy link
Copy Markdown
Collaborator

Summary

  • keep pool session alias repair anchored to concrete agent identity
  • reject cross-template resume requests before they can retag a session bead
  • refuse to reassign duplicate concrete pool slots to a different alias

This PR includes the prerequisite alias-consistency patch plus the three deployed reconciler accounting fixes from the integrated branch.

Verification

  • go test ./cmd/gc -run 'TestSelectOrCreatePoolSessionBead_PrefersConcreteAgentSlotOverStalePoolMetadata|TestSelectOrCreatePoolSessionBead_DoesNotRetagDuplicateConcreteSlot|TestComputePoolDesiredStates_DoesNotResumeSessionAcrossExplicitRouteMismatch|TestExistingPoolSlotWithConfig_PrefersConcreteAgentIdentityOverStaleSlot|TestSyncSessionBeads_ClearsStaleWrongPoolAliasWhenRepairFails|TestSyncSessionBeads_.*PoolAlias|TestSelectOrCreatePoolSessionBead|TestComputePoolDesiredStates_.*Resume|TestComputePoolDesiredStates_.*InFlight' -count=1

Runtime check

  • deployed integrated binary has mismatched_aliases=0 in maintainer-city
  • rollback log noise is dominated by pending-create rollback budget deferrals from existing stale beads

@julianknutsen julianknutsen added the status/needs-review-auto PR review requested with auto approval label May 13, 2026
@github-actions github-actions Bot added the status/needs-triage Inbox — we haven't looked at it yet label May 13, 2026
@randy-release-manager randy-release-manager Bot added kind/bug Broken behavior priority/p2 Medium — real problem, workaround exists and removed status/needs-triage Inbox — we haven't looked at it yet labels May 13, 2026
@quad341 quad341 added status/reviewing and removed status/needs-review-auto PR review requested with auto approval status/reviewing labels May 14, 2026
@julianknutsen julianknutsen added status/needs-review-auto PR review requested with auto approval status/reviewing and removed status/needs-review-auto PR review requested with auto approval labels May 15, 2026
@quad341 quad341 removed the status/needs-review-auto PR review requested with auto approval label May 15, 2026
Copy link
Copy Markdown
Collaborator

@quad341 quad341 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @julianknutsen — nothing has changed on your end. The first rebase attempt hit a transient infrastructure hiccup on our side (the fix bot dropped its connection mid-run), not anything to do with the patch. The plan is identical: I'll rebase onto fresh main, preserve all three behavior fixes and the five regression tests exactly as you wrote them, re-run your targeted selector plus the full cmd/gc package and go vet, and ship it. No action needed from you — the accounting fixes and the test coverage are dialed in.

Review coverage:

  • completed: qwen codex
  • unavailable: claude

@julianknutsen julianknutsen force-pushed the fix/reconciler-accounting-pool-sessions branch from e7783af to 66c7c34 Compare May 15, 2026 08:55
@julianknutsen
Copy link
Copy Markdown
Collaborator Author

Maintainer Adoption Review

Thanks for the contribution, @julianknutsen! This PR fixes pool session accounting drift in the reconciler, which matters because stale slot, alias, and template metadata should not cause Gas City to resume the wrong pool session or keep assigned work stuck behind inconsistent identity state.

This PR was reviewed and adopted with maintainer fixes pushed directly to the PR branch.

Original PR Review

Decision: request_changes resolved to approve.

Specific gaps fixed:

  • The review found that an out-of-bounds agent_name-derived slot could override an in-bounds pool_slot; the final patch now accepts concrete identity slots only through the guarded pool identity path and covers the regression with TestBuildDesiredState_PrefersInBoundsPoolSlotOverOutOfBoundsAgentName.
  • The review found that legacy or partial session beads could resume across an explicit route mismatch; the final patch records and compares normalized session templates and validates the mismatch path with focused ComputePoolDesiredStates regressions.
  • The review found that duplicate concrete-slot retagging, stale alias drift, and pool-create identity handling needed tighter guards; the final patch rejects duplicate concrete slots, clears stale wrong pool aliases while keeping conflict metadata, and keeps the pool-create path on the concrete identity behavior.

Review findings addressed:

  • Behavioral Correctness (major, high/codex): the final patch prevents stale concrete identity metadata from selecting an out-of-range bounded pool slot.
  • Behavioral Correctness (major, high/codex): the final patch prevents legacy or partial session beads from bypassing explicit route mismatch checks.
  • Debuggability & Operability (minor, claude+codex): the final patch leaves durable duplicate-slot conflict diagnostics as a non-gating follow-up, while allocator correctness is now guarded by duplicate-slot rejection.

Remaining non-gating notes:

  • Durable queryable diagnostics for duplicate concrete-slot conflicts and one direct reuse-path skip regression remain follow-up candidates; no remaining gating issues were found.

Maintainer Changes

Maintainer fixes were pushed directly to the PR branch as the final reviewed stack:

  • fix: keep pool session aliases consistent
  • fix: reuse pool sessions by concrete identity
  • fix: ignore cross-template pool resume requests
  • fix: reject duplicate pool slot retagging
  • fix: tighten pool session resume guards

Short diff stat: 7 files changed, 318 insertions(+), 19 deletions(-).

Final Review Status

Ready for the merge queue: final head 66c7c34 has passing required GitHub checks, and the merge-ready metadata is in place. I am marking this PR status/merge-ready so the merge queue can pick it up.

CI: https://github.com/gastownhall/gascity/actions/runs/25909252933

Review Iterations

2 review passes performed: the first pass requested changes for pool slot bounds and explicit route mismatch behavior, and the second pass validated those fixes plus the concrete identity, duplicate-slot, and alias drift guards.


Adopted via /adopt-pr workflow. Original contributor commits preserved.

@julianknutsen julianknutsen added status/merge-ready status/merge-queued Queued for deterministic PR-review merge and removed status/merge-ready labels May 15, 2026
@julianknutsen julianknutsen merged commit d9c2475 into main May 15, 2026
97 checks passed
@julianknutsen julianknutsen deleted the fix/reconciler-accounting-pool-sessions branch May 15, 2026 09:04
@julianknutsen julianknutsen removed the status/merge-queued Queued for deterministic PR-review merge label May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Broken behavior priority/p2 Medium — real problem, workaround exists

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants