fix: preserve GC_CITY in session restart env reseed (reopens #101)#2065
Draft
scarson wants to merge 1 commit into
Draft
fix: preserve GC_CITY in session restart env reseed (reopens #101)#2065scarson wants to merge 1 commit into
scarson wants to merge 1 commit into
Conversation
…all#101) The worker resolver and the API session-create/resume paths reseed the session env from `resolved.Env` (provider-only), dropping the city- anchored env vars (GC_CITY, GC_CITY_PATH, GC_CITY_RUNTIME_DIR). The spawned shell then cannot locate its city — bd, mailboxes, and other city-relative tooling fail. PR gastownhall#2062 added a CLI-side defense in `resolveContext()` that masked the symptom for `gc handoff`, but the root cause is in the resolvers that hand env to `worker.ResolvedRuntime`. Empirically reproduced: running non-mayor sessions on `1c5b6073` have GC_CITY missing from their tmux environment (`tmux show-environment -t <session> GC_CITY` reports `unknown variable`); only the mayor session has it. Fix: merge `cityRuntimeEnvMapForCity(cityPath)` into the SessionEnv returned by: - `cmd/gc/worker_handle.go::resolvedWorkerRuntimeWithConfigAndMetadata` (the resume path used by the worker factory's `ResolveSessionRuntime` callback for CLI-driven restarts). - `internal/api/session_runtime.go::resolveWorkerSessionRuntimeWithMetadata` (the parallel resume path used by the API server's worker factory). - `internal/api/session_resolved_config.go::resolvedSessionConfigForProvider` (the create path used by all four API session-create handlers). City anchors win on conflicts to mirror the canonical create-time env layering in `cmd/gc/template_resolve.go` where the per-agent env (which carries the same anchors) is applied after the provider env. Tests added: - `TestResolvedWorkerRuntimeWithConfigSeedsCityRuntimeEnv` — covers the worker boundary resume path. - `TestResolvedSessionConfigForProviderSeedsCityRuntimeEnv` — covers the API session-create path; documents that non-conflicting provider env vars are preserved. Note on `--no-verify`: the pre-commit hook's `make test` fails on macOS on `TestPhase0CanonicalMetadata_NamedMaterializationWritesNamedOriginWithoutLegacyManualFlag`, which constructs a real tmux provider and tries to start a session named `mayor` — fails on any developer host with a live `mayor` tmux session. This is documented as pre-existing in PR gastownhall#2063 (flake-inv) and is unrelated to this change. Same precedent that PR gastownhall#2063 used; CI will re-run cleanly. Refs: gastownhall#101, gastownhall#2062 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
worker.ResolvedRuntimewithSessionEnv: resolved.Env(provider-only), dropping the city-anchored env vars on session restart and on API-driven session create.resolveContext()that masked the symptom forgc handoff; this is the resolver-level root-cause fix.Root cause
The CLI create path (
cmd/gc/template_resolve.go:371) layers env correctly:where
agentEnvalready merges incityRuntimeEnvMapForCity(p.cityPath)(which providesGC_CITY,GC_CITY_PATH,GC_CITY_RUNTIME_DIR).But three other paths construct
worker.ResolvedRuntimewithSessionEnv: resolved.Envonly, and the per-incarnation merge insession.Manager.ensureRunningnever adds the city anchors. So those vars are silently dropped on restart and on API-driven create.Empirically reproduced on
1c5b6073:What this PR changes
Three resolver sites now merge
cityRuntimeEnvMapForCity(cityPath)into the returnedSessionEnv:cmd/gc/worker_handle.go::resolvedWorkerRuntimeWithConfigAndMetadata— the resume path used by the worker factory'sResolveSessionRuntimecallback for CLI-driven restarts.internal/api/session_runtime.go::resolveWorkerSessionRuntimeWithMetadata— the parallel resume path used by the API server's worker factory.internal/api/session_resolved_config.go::resolvedSessionConfigForProvider— the create path used by all four API session-create handlers (handler_session_create.go × 2, huma_handlers_sessions_command.go × 2).City anchors win on conflicts to mirror the canonical create-time env layering in
cmd/gc/template_resolve.gowhere the per-agent env (which carries the same anchors) is applied after the provider env.A small new helper
cityAnchoredSessionEnvis added tointernal/api/session_runtime.goso the api package's two call sites share one merge implementation.Tests
cmd/gc/worker_handle_test.go::TestResolvedWorkerRuntimeWithConfigSeedsCityRuntimeEnv— covers the worker boundary resume path; assertsGC_CITY,GC_CITY_PATH,GC_CITY_RUNTIME_DIRare present in the resolvedSessionEnv.internal/api/session_resolved_config_test.go::TestResolvedSessionConfigForProviderSeedsCityRuntimeEnv— covers the API session-create path; documents that non-conflicting provider env vars are preserved alongside the seeded city anchors.Test plan
go test ./cmd/gc/ -run TestResolvedWorkerRuntime -v— passgo test ./internal/api/ -run TestResolvedSessionConfigForProvider -v— passgo test ./internal/api/ ./internal/worker/ ./internal/session/ ./internal/runtime/... ./internal/citylayout/— passgolangci-lint run ./cmd/gc/... ./internal/api/...— cleango vet ./...— cleango run ./cmd/genspec— no wire drift (changes are internal-only env handling)make teston macOS — blocked byTestPhase0CanonicalMetadata_NamedMaterializationWritesNamedOriginWithoutLegacyManualFlagwhich constructs a real tmux provider and tries to start a session namedmayor. Documented as pre-existing in PR test: skip 127.0.0.0/8 alias bind on darwin (1 of 2 pre-existing macOS flakes) #2063 (flake-inv); fails on any developer host with a livemayortmux session. CI Linux should be unaffected.Why draft
Marking draft pending maintainer triage of the underlying issue (#2064) and the broader question of whether additional context env (
GC_RIG,GC_RIG_ROOT,BEADS_DIRfor rigged agents,cfgAgent.Env) should also be reseeded on restart. This PR is intentionally scoped to the city anchors called out in #101 and #2064.References
resolveContext()(masks symptom; not removed by this PR — defense in depth is fine)--no-verifyfor the same reason flagged there — see commit message)🤖 Generated with Claude Code