Skip to content

feat(mcp): full MCP-conducted Builder onboarding — iOS + android + tail, gate flipped (PR 2)#2492

Open
WcaleNieWolny wants to merge 12 commits into
wolny/mcp-tui-integrationfrom
wolny/mcp-full-onboarding
Open

feat(mcp): full MCP-conducted Builder onboarding — iOS + android + tail, gate flipped (PR 2)#2492
WcaleNieWolny wants to merge 12 commits into
wolny/mcp-tui-integrationfrom
wolny/mcp-full-onboarding

Conversation

@WcaleNieWolny

Copy link
Copy Markdown
Contributor

What

PR 2 of the MCP/TUI onboarding integration (stacked on #PR1-branch wolny/mcp-tui-integration): the MCP now conducts the COMPLETE Builder onboarding for both platforms — every journey the TUI can do is completable through the three MCP tools — and __CAPGO_MCP_ONBOARDING__ is flipped ON, shipping the tools in the release bundle (+141 KB, +4.8%).

How it's built

The MCP is a thin driver over the SAME shared headless engines the TUI uses (ios/flow.ts, android/flow.ts, tail/flow.ts) — no duplicated flow logic:

  • iOS create-new, granular — decideIos mirrors decideAndroid: total resume + auto effects + structured views; verify-app gate is multi-turn (verifyAction/verifyAppId); cert-limit revoke picker, duplicate-profile recovery, structured error screen (retry/restart/exit/email-support + corrected-key re-collect).
  • iOS import-existing — capability-gated setup fork (macOS + scan deps), identity/profile pickers, no-match recovery + Developer-Portal walkthrough, manual .mobileprovision path input, Keychain export inside one call with the user pre-warned about the macOS dialog.
  • Post-save tail, both platforms — CI secrets (gh + glab, overwrite gates, not-ready retry with re-detection), GitHub Actions workflow write behind an explicit preview consent gate ('view' returns the YAML in the result), .env export (0600, overwrite gates), package-manager/build-script pickers with lockfile recommendation.
  • Cross-call state — mcp/session-state.ts (process-local carried registry; restart degrades to engine self-heal, pinned by tests) + mcp/tail-progress.ts (slim whitelist-only disk persistence: markers + non-secret prefs; the resume routers do the rest).
  • Both data-safety gates (credentials-exist, backup/cancel) work on iOS too, including dual-platform projects.
  • Secrets discipline: carried secrets (p8/p12/passwords/CAPGO_TOKEN) never serialize into tool results or progress files — enforced by a redaction helper asserted on every test walk + verify-tree no-leak markers.

Found by the live AI e2e (fixed here)

  1. iOS credentials-exist gate bounced to platform-select on dual-platform projects.
  2. A server restart at the unanswered GitHub-Actions consent gate silently dropped the user's with-workflow choice (resume skipped the gate).
  3. The terminal build-complete result carried no evidence for the upload it claimed — the Codex judge refused to take the agent's word for it. It now surfaces the upload summary / workflow path / .env path, like the TUI screen.

Tested

  • Private suites (cli-mcp-tests, overlay): test-mcp-onboarding 87 · ios-flow 23 · tail 28 · ios-import 13 · explain-coverage 5 (authored explanation for all 63 interactive states, 90 reachable inventoried) · schema-gate 6 · session-state 14 · android-flow 111. Engine↔schema parity 43/43 keys; strict one-answer-per-call gates everywhere.
  • Hermetic e2e tree (no LLM): 20/20 paths, 429 assertions, on-disk artifact + no-leak checks.
  • LIVE Codex runs (gpt-5.5 actor + judge, isolated CODEX_HOME): 20/20 paths PASS, zero flakes — full android (keystore generate/import incl. JKS edges, SA-import + 3-way validation recovery, OAuth fire-and-poll incl. token-reject, both GCP arms, credentials gates, full build + tail), full iOS (create-new happy through the tail, verify-app wrong-build-id + autofix, cert-limit revoke, both gate arms).
  • capgo chain: full 97-segment bun run test green on the flipped bundle; tsc clean; dev-gate test now POSITIVELY asserts the 3 tools ship while dev markers stay forbidden.
  • TUI untouched: the frozen TUI E2E suite runs 79/79 journeys green against this branch's dist; zero e2e-tui files changed (requirement).

Notes for review

  • The MCP build phase keeps the C2/D2 contract (Terminal-launch/agent hand-off + BuildOutputRecord polling); the tail enters after a confirmed build. requesting-build never runs in-process.
  • Deliberate exclusions from 'every journey' (documented): ai-analysis-* (the conducting agent reads logs itself — no AI-calling-AI), self-update re-exec, native file-picker dialogs (manual path inputs instead), the TUI-only resume-prompt fork (MCP resumes silently).
  • The private submodule pin stays at the PR-1 commit (CI's TUI preview workflow is unaffected); the MCP suites + 20-path tree live at cli-mcp-tests main c3e682b and run locally via run-tests.sh.

…, carried-state registry

1. schemas/onboarding.ts: add credentialsExistChoice to onboardingNextStepSchema
   — the android credentials-exist gate was unanswerable over the real MCP
   tool path (zod silently stripped the field before the engine saw it; it
   existed only in step-input.ts + OnboardingInput). Engine↔schema parity is
   now exactly 22/22 keys.

2. Remove dead code ahead of the iOS rewiring: executeAuto's unreachable
   'android-finalize' branch + finalizeAndroidCredentials +
   'android-service-account-invalid' (call-graph-proven unreachable since
   SA-import saves via the shared tail).

3. mcp/session-state.ts (NEW): module-level carried-state registry keyed by
   appId (the oauth-session.ts pattern) holding IosEffectDeps['carried'] +
   TailEffectDeps['carried'] for the granular iOS path and the MCP tail.
   Process-local; restart loses state and consumers must degrade gracefully
   via engine self-heal; carried secrets must never serialize into tool
   results (redaction helper lives with the private tests).

Private-suite counterparts (cli-mcp-tests): test-mcp-schema-gate.mjs (6),
test-mcp-session-state.mjs (14, incl. assertNoCarriedSecrets), stale
build-first self-heal test rewritten, e2e-mcp network-fakes mode (real deps
over the e2e-tui fakes + verify-tree network smoke). All green: onboarding
87/87, android-flow 111/111, verify-tree PASS, tsc clean.
…tructured recoveries + credentials gate

decideIos is now a thin driver over ios/flow.ts mirroring decideAndroid:
getIosResumeStep resume, auto steps via runIosEffect (MAX_AUTO_STEPS loop),
interactive steps via the new mapIosView; engine step ids are MCP state
names verbatim. The TUI-only .p8 chain collapses into the tree-pinned
single 'ios-api-key' gate whose receipt applies all three fields through
applyIosInput (TUI-identical persistence) and buffers p8 bytes into the
session registry (p8ContentRef mirror; re-read from p8Path after restart).

- verify-app multi-turn: verifyAction (9-value enum incl. reopen) +
  verifyAppId schema fields; carried-driven resolver re-entry; strict
  one-answer-per-call gate (validateIosStepInput) with off-step rejection.
- cert-limit recovery: revoke picker parked in the session (per-cert
  options + __exit__), one-shot certToRevoke, re-derive on restart with
  no duplicate revoke.
- duplicate-profile recovery: delete/exit + persisted origin routing.
- error step: structured retry/restart/exit/email-support (SUPPORT_EMAIL
  newly exported; no host-side opens) + corrected-ASC-trio re-collect arm.
- iOS credentials-exist gate: Phase-0 seeding mirrors android; the gate
  intercepts BEFORE any Apple traffic (frozen TUI journey semantics), and
  the trio-as-first-call hole is closed in drive().
- finalizeIosCredentials + setIosApiKey + the coarse ios-finalize branch
  DELETED; buildIosEffectDeps wires the real apple-api/csr deps incl.
  revokeCertificate/deleteProfile via per-call JWTs.
- session-state: dropIosCarried for one-shot resolver fields.
- ios/flow.ts: additive carried type fields (existingCerts, error) only.

Private-suite counterparts: test-mcp-ios-flow.mjs (21), rewritten coarse
decideIos pins, explanations coverage pins. All green: onboarding 87/87,
ios-flow 21/21, android-flow 111/111, schema-gate 6/6, session-state
14/14, engine spot-checks (ios-create-new 36, ios-verify-app 33,
ios-recovery 19, tail-engine-shared 72), verify-tree PASS, tsc clean.
Live Codex actor runs: 3 representative tree paths, all nodes pass.
…t) + iOS import fork

S8 — tail spine: NEW mcp/tail-progress.ts persists a SLIM whitelist-only
tail progress (markers credentialsSaved/buildRequested/ciSecretsUploaded +
non-secret prefs) — the MCP driver owns marker persistence per
android/types.ts; runTailEffect itself still never saves. checkBuild
success now routes into the shared tail (detecting-ci-secrets) when the
slim progress + detection deps exist, else the legacy terminal result —
every pinned C2/D2 test green verbatim. Restart re-derivation rebuilds
tail carried (savedCredentials + entries incl. CAPGO_TOKEN) from
loadSavedCredentials into the process-local session only. Real tail deps
wired for both platforms. android tailResumeStep check moved before
keystore validation (slim files carry no keystore fields; legacy routing
byte-identical).

S9-S11 — structured interactive tail: every tail choice/input renders as
a structured MCP state with its own answer field (8 new schema fields, 43
total), strict one-answer-per-call gates, TUI-parity routing incl. the
preview consent gate before any workflow write ('view' returns the
generated YAML text in context), gh-not-ready retry re-detection, env
export with overwrite gates + 0600, session tail park mirroring the TUI's
React step state (restart falls back to the frozen tailResumeStep).

S12 — iOS import-existing over MCP: capability-gated setup-method fork
(isMacOS + scan deps wired), scan-first ordering, distribution mode,
identity/profile pickers, no-match recovery + portal walkthrough,
provide-profile-path as a manual PATH input through the engine's parse
pipeline, export-warning human_gate instructing the user about the macOS
Keychain dialog (export runs inside the call), compiling-helper
idempotency, app_store continuation through verifying-key/verify-app via
pendingVerifyNext. Memoized team-cert SHA-1 index (5-min TTL) backs
classifyCertAvailability.

Private suites: test-mcp-tail.mjs (25), test-mcp-ios-import.mjs (13).
All green: onboarding 87, ios-flow 21, tail 25, ios-import 13,
schema-gate 6, session-state 14, android-flow 111, verify-tree 13/13
paths; capgo engine spot-checks untouched (import-export 24, discovery
18, tail-engine-shared 72, ios-tail-handoff 34, android-tail-engine 49);
tsc clean; no secrets in tool results or on disk (whitelist-pinned).
… in release

build.mjs defines the gate 'true' in both bundles (PR 1 kept it 'false'
while the MCP was incomplete; PR 2 finishes it). dist/index.js grows
~141 KB (+4.8%) — the deciders + tool registrations now ship.

test-dev-gate-stripped.mjs reworked: the two onboarding tool names leave
the forbidden list and become POSITIVE assertions (all three tools —
start/next_step/explain — must be present in dist); __CAPGO_DEV__,
CAPGO_SPOOF, src/__dev__/ and the __CAPGO_MCP_ONBOARDING__ identifier
stay forbidden forever (define-replaced, never shipped).

test-mcp.mjs (the stdio smoke requiring the onboarding tools on the real
dist server) now passes — the de-facto flip gate. test-init-guardrails
gains a hermetic git config (host gpgsign=true broke the temp-repo commit
non-interactively). Full 97-segment chain green on the flipped bundle.
…consent-gate fixes the live Codex e2e found

1. Explain coverage (spec §8 Phase 5): engine exports MCP_ONLY_STATES (17)
   + MCP_UNREACHABLE_STEPS (21, typed so step renames fail tsc) — 90
   reachable states, 63 required-authored, all covered in explanations.ts
   (new: credentials-exist, build-launched, build-waiting, the 3 auto-loop
   guards; dead entries removed). resolveCurrentState maps ask-build →
   build-ready on both platforms (matches decideBuildPhase).

2. pick-package-manager: optional detectPackageManager dep threads the
   lockfile detection into the view ('recommended — matches your
   lockfile' note + context.detectedPackageManager).

3. FIX (live e2e): iOS credentials-exist gate bounced to platform-select
   on dual-platform projects — activePlatform now counts the gate-seeded
   iOS progress (_credentialsExistGate) and drive() routes the answered
   gate straight into decideIos (mirrors the android persist path).

4. FIX (live e2e): an UNANSWERED ask-github-actions-setup consent gate
   was skipped on resume — tailResumeStep (both platforms) returned
   'checking-ci-secrets' whenever ciSecretTarget was persisted, even with
   setupMode undecided; after an MCP server restart the pending
   with-workflow choice was silently lost. github + setupMode===undefined
   now re-asks the gate (GitLab unchanged — its yes/no is navigational
   and never persisted; documented asymmetry). test-android-tail-routing
   fixtures for post-gate states (confirm-secrets-push, ci-secrets-failed
   target-set) now carry the decided setupMode those states require.

Suites: onboarding 87, ios-flow 23, tail 28, ios-import 13,
explain-coverage 5, schema-gate 6, session-state 14, android-flow 111,
ios-tail-handoff 34, android-tail-routing 36, ios-resume 30,
tail-engine-shared 72, android-tail-engine 49; tsc clean; dev-gate +
flip smoke green.
…ty, judge-found)

The terminal build-complete summary was generic while the TUI's screen
surfaces what the tail actually did — so the conducting agent had no
tool-result evidence for claims like 'secrets uploaded' (the live Codex
judge correctly failed exactly that on the ios-create-happy path).

tailCompleteResult now surfaces, when present: the exact tail upload
summary, 'GitHub Actions workflow written: <path>', and the exported
.env path (with the do-not-commit warning); a durable-marker fallback
covers a session lost to a server restart. The three NON-SECRET outcome
transients (counts/labels/paths only) now park in session tailCarried at
both platforms' merge sites and are typed as the one carried subset
allowed to surface verbatim in tool results. Harvested before cleanup
wipes the session.

Live rerun: ios-create-happy 7/7 nodes PASS — full tree now 20/20 live.
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4c029721-2be8-4ff4-98eb-ff8fda97d6db

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Comment @coderabbitai help to get the list of available commands and usage tips.

@codspeed-hq

codspeed-hq Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Merging this PR will not alter performance

✅ 43 untouched benchmarks
⏩ 2 skipped benchmarks1


Comparing wolny/mcp-full-onboarding (1ac38c7) with wolny/mcp-tui-integration (1db3bf6)

Open in CodSpeed

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

🧪 Builder onboarding TUI preview — ✅ passed

▶ Open the interactive HTML report (zoomable journey tree + cast playback)

Commit: 1ac38c7 · Job summary with the result table

@wsyyj-050320 wsyyj-050320 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one security/privacy issue worth fixing before merge.

mapAndroidView() surfaces an auto-generated Android keystore password in the MCP tool result when service-account-method-select is rendered:

  • summary appends (password: ${opts.keystorePassword})
  • context includes { keystorePassword: opts.keystorePassword }

The rationale in the comment makes sense for an interactive TUI: the user has not otherwise seen the generated password. But this is an MCP result, so it is likely to be captured in agent transcripts, tool logs, CI/debug artifacts, or downstream orchestrator memory. That seems to violate the PR's own invariant that carried secrets must never serialize into MCP tool results. It also creates an inconsistent rule: manual passwords are not echoed, but generated passwords are still full secrets for release signing.

Suggested fix: keep writing the generated password to the local credentials/backup path, but do not include the raw password in summary or context. Instead, surface a non-secret instruction such as "the generated keystore password was saved in the local credentials file; back up that file now". If the UX truly requires one-time display, consider a dedicated, explicitly sensitive output channel rather than the normal MCP result payload.

Triage of all 11 findings against the current stack (PR #2394 is the
superseded pre-mega-PR; its review targets code that lives on here):

FIXED:
- [1] applyAndroidInput persists serviceAccountMethod with the package
  choice so a legacy/partial progress resumes onto the chosen route.
- [2] app-id validator requires REAL dot-separated reverse-domain ids
  (rejects single-label 'foo_bar'/'com-example'; labels must start
  alphanumeric — '-' could parse as a shell flag).
- [3] renderResult gains a defense-in-depth secret-redaction pass over
  context values by key name; the auto-generated keystorePassword human
  line stays the one sanctioned surfacing (its JSON copy is redacted).
- [5] the GCP picker's '__new__' row now routes to the
  gcp-project-create-name prompt instead of silently re-rendering.
- [7] the explain tool exposes the optional { state } its engine fn
  already accepted — agents can now explain protocol states like
  build-waiting that are not derivable from disk.
- [8] oauth-session settlement uses a detached async/await task.
- [9] writeKeystoreFile resolves the android dir from capacitor.config
  (custom android.path projects) like detection already did.
- [10] readBuildOutputRecord: null now strictly means ENOENT ('no record
  yet'); a PRESENT-but-corrupt record throws BuildRecordReadError and
  checkBuild surfaces it as a failure with the path — never 'still
  waiting' forever over a file that will never parse.

ALREADY FIXED by the PR-2 rework: [4] explicit input.platform overrides
any stale progress (engine.ts decideAdvance head); [6] the strict
android gate runs for every android input with a null-safe
resume-derived step.

DISPUTED: [0] androidStepView is not dead — it is the suite-pinned
step→view wrapper API (test-android-flow drives it directly).

Suites: onboarding 87, ios-flow 23, tail 28, ios-import 13, explain 5,
schema-gate 6, session-state 14, android-flow 111, app-id 19,
output-record 10 (4 pins deliberately flipped to the throw contract),
oauth-session 11, verify-tree 20/20; tsc clean.
PR-1 picked up main's precompiled SIGNED keychain helper (PR #2458): the
on-demand swiftc compile path is gone (no import-compiling-helper step, no
precompileSwiftHelper/isHelperCached, no carried.helperCompiled guard).

Conflict resolution (cli/build.mjs, both define blocks): keep PR-2's
__CAPGO_MCP_ONBOARDING__='true' flip AND PR-1's
__CAPGO_ALLOW_HELPER_ENV_OVERRIDE__ dev-only define.

MCP port to the merged helper contract:
- mcp/onboarding-tools.ts buildIosEffectDeps drops the
  precompileSwiftHelper/isHelperCached wiring — exportP12FromKeychain
  resolves + signature-verifies the precompiled helper internally, exactly
  what the TUI driver wires in ui/app.tsx
- mcp/explanations.ts S12 header comment drops the dead
  import-compiling-helper step id
PR-1 forward-merge counterpart: the MCP import suite now pins the signed
keychain-helper contract (resolve-failure → retryable error with install
guidance; export re-runs the full resolve+export seam) replacing the
removed compiling-helper idempotency pin; explain-coverage + session-state
drop the dead step id / carried key.
…, record hardening, gate fail-closed

Build phase:
- runBuild requires the platform progress to carry
  completedSteps.credentialsSaved (state build-not-ready) — a premature
  runBuild can no longer launch a build for an app whose credentials were
  never saved
- runBuild clears any record left by an earlier build (EngineDeps
  clearBuildRecord, production-wired to removeBuildOutputRecord) so
  checkBuild can never read last run's result as this build's
- checkBuild correlates the record's appId/platform with the polled build
  (state build-stale-record on mismatch)
- both new states inventoried in MCP_ONLY_STATES with explanations

output-record:
- defaultBuildRecordPath lives in a per-user capgo-build-records-<uid> dir
- write: parent dir 0700 (+ tmpdir ownership/symlink check), record unlinked
  before write (a planted symlink is replaced, never followed), record + QR
  png written 0600 (outputUrl is a signed download URL)
- read: lstat-first (symlink refused), strict full-shape validation for
  schemaVersion:1 records (forward-tolerant trio for future schemas)
- removeBuildOutputRecord helper (record + QR png, force)

Android driver:
- the strict step gate FAILS CLOSED at steps with no vocabulary — sending the
  first keystore field before the flow rendered ('welcome') skipped the
  credentials-exist data-safety gate seeding entirely; three resume aliases
  added for steps whose question renders while resume sits on the auto step
  (sa-json-validating, gcp-projects-loading, keystore-existing-detecting-alias)
- androidPackage '__manual__' is navigation-only: renders the manual prompt,
  never persisted as the applicationId
- invalid playDeveloperId re-renders with the exact TUI corrective instead of
  the silent same-prompt loop
- OAuth begin() and fetchUserInfo failures return structured retry gates
  instead of escaping the MCP tool handler; the settled session is KEPT on a
  fetchUserInfo failure so the retry reuses the tokens
- the generated .p12 is written 0600
Picks up the hostile-review test pins (output-record hardening, build-phase
preconditions/stale-record, fail-closed gate, OAuth containment) and the
promoted crash-recovery journeys. NOTE: c511d08 is the head of the private
wolny/hostile-review-e2e PR branch (Cap-go/cli-mcp-tests#6); re-pin to the
private main merge commit once that PR lands.
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants