feat: Phase 4 + Recorder v2 + Self-Healing suite + BMAD docs (103 commits)#34

Closed
raffelino wants to merge 288 commits into main from feat/recorder-and-bmad

Conversation

@raffelino
Collaborator

@raffelino raffelino commented Apr 20, 2026

Summary

Seven days of work, 103 commits since main. Three big arcs:

  1. Phase-4 Epics 1-5 — Enterprise Identity (OIDC + SAML), SSO User Access, Teams & Role Resolution, First-Login UX, Operational Resilience. Closed end-to-end.
  2. Recorder v2 — Web (Playwright) + Desktop Windows (skeleton, native hooks blocked on host) + Shared Selector datamodel + sidecar .rbs.json. macOS variant explicitly NO-GO per DM.1 spike.
  3. Self-Healing & Resilience suite — runtime selector heal with multi-layer fallback (sidecar / transposition / DOM-walk fingerprint), heal-audit + suspect classification, one-click patch apply, flaky-test quarantine, heal-rate KPI, AI-generated patch suggestions, deploy-aware recorder transport picker.

Backend: ~250 new pytest cases. Frontend: 0 new TypeScript errors vs HEAD (31 pre-existing unchanged).

Highlights

Phase-4 Epics (Identity / SSO / Teams / Welcome / Resilience)

Stories 1-1 → 5-11 across 5 epics. OIDC Authorization Code + PKCE, SAML 2.0, IdP-group → Team mapping with effective-role resolution, login-time group sync, emergency-bypass with auto-expire, retention scheduler, audit-log enum + structured emission, retrospective committed.

Recorder v2

  • Web Recorder MVP (Stories W.1 → W.9): controlled Chromium session via Playwright, SSE command stream, primitive capture (click/type/select/etc.), hover overlay, keyword-context-menu, result-in-editor, audit + 30-min retention, real-Chromium e2e fixture test, Explorer-toolbar deep-link launcher.
  • Shared Selectors (Stories S.1 → S.5): 6-strategy library (testid/aria/text/css/xpath/pw_locator), uniqueness verification with :nth-match, inline SelectorPicker.vue, i18n in 4 locales.
  • Desktop Windows (D.1 → D.4): pure-Python translator + thread skeleton + RPA.Windows .robot emit; native pywinauto hooks tracked as D-5 (hardware-blocked).
  • Recorder Stability (R-1): browser lifecycle event-based shutdown.
  • Legacy v1 Recorder UI removed from Explorer (V1.1); backend /api/v1/recordings/* preserved for the Chrome Extension.

Self-Healing Suite (the big new differentiator)

  • SH-1 Selector diagnosis on failed runs — sidecar lookup + ranked alternatives in run detail.
  • SH-2 Runtime self-healing — RoboScopeHeal Robot library, three-tier fallback (sidecar → transposition → fingerprint), per-test budget, confidence thresholds, suspect-heal classification cross-referenced with output.xml. Heavy emphasis on rollback safety: opt-in per keyword, no-heal tag, never-mutate-on-disk.
  • SH-3 DOM-walk fingerprint scorer (Healenium-class) — element fingerprint (tag + id + testid + classes + role + text + ancestors), weighted scorer, live-DOM walker via Evaluate JavaScript. Ships dormant on legacy flows; recorder-side capture wiring tracked as SH-3.1.
  • SH-4 One-click apply patch — server-side atomic rewrite of confirmed heals, ambiguity guard, path-traversal guard, idempotent.
  • SH-5 Long-tail keywords — Upload File, Check/Uncheck Checkbox, Select Options By, Get Text, Get Element Count, two-selector Drag And Drop with source/target probing.
  • SH-6 Heal-rate KPI — Stats overview card + 30-day sparkline as leading indicator of test drift.
  • E2E-SH Real-Chromium integration test for the candidate finder + drift fixture verifying the fingerprint walker.
  • FLAKY-1 Test quarantine model + CRUD + Stats-table integration with audit events.
  • FLAKY-2 Runner-side skip via Robot Framework listener (BuiltIn().skip() → SKIP, not FAIL).
  • AI-2 AI-generated patch suggestions extracted from failure-analysis markdown, copy-to-clipboard diff blocks.
  • DEPLOY-1 Remote-aware recorder transport picker — disables Web Playwright when backend has no display, points users at the Chrome Extension.

Playwright Docker fix-chain (production incident, multi-layer root-cause)

  • cbb7a67 Derive base image tag from installed Playwright (replace hardcoded v1.52.0).
  • de7733a Force-pin Python playwright inside container after user-package install.
  • 6767b77 PyPI constraint-extraction guardrail for future drift.
  • 63a0fde Try python-slim for batteries (still wrong but closer).
  • f7c021a Final fix: always python-slim + rfbrowser init && npx playwright install-deps chromium. Verified by real docker build + Browser-library test reaching PASS.

Bug fixes shipped along the way

  • 0a6f1e7 DockerRunner.execute() was missing the listeners kwarg added by FLAKY-2 — TypeError on every Docker run. Fixed + new signature-parity test for all AbstractRunner subclasses.
  • 6f076b3 Executions auto-poll was setting the global loading flag every 5s, mounting the spinner + hiding the table → browser scroll-anchor reset → list jumped to top. Silent-refresh path added.
  • c60102a /api/v1/openapi.json returned 500 because of response_class=None on the legacy recorder endpoint.
  • a3a5b52 Login-page redirect loop (717 navigations / 3s) caused by useBypassStatus polling /settings/sso-emergency-bypass without auth gate.
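
A signature-parity check like the one mentioned for 0a6f1e7 could look roughly like the sketch below. The class bodies and the `signature_mismatches` helper are illustrative stand-ins, not the project's actual runner code; only the names `AbstractRunner`, `DockerRunner`, `execute` and `listeners` come from the description above.

```python
from __future__ import annotations

import inspect
from abc import ABC, abstractmethod


class AbstractRunner(ABC):
    """Stand-in base class (assumed shape, not the real backend class)."""

    @abstractmethod
    def execute(self, suite_path: str, *, listeners: list[str] | None = None) -> int:
        ...


class SubprocessRunner(AbstractRunner):
    def execute(self, suite_path: str, *, listeners: list[str] | None = None) -> int:
        return 0


class DockerRunner(AbstractRunner):
    # Deliberately stale: mirrors a subclass that never received `listeners`.
    def execute(self, suite_path: str) -> int:
        return 0


def signature_mismatches(base: type, method: str = "execute") -> list[str]:
    """Names of direct subclasses whose parameter names differ from the base method."""
    expected = list(inspect.signature(getattr(base, method)).parameters)
    return [
        sub.__name__
        for sub in base.__subclasses__()
        if list(inspect.signature(getattr(sub, method)).parameters) != expected
    ]
```

In a pytest suite the helper would be asserted to return an empty list, so a kwarg added to the base class but forgotten in one runner fails at test time instead of on the first Docker run.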

BMAD planning artifacts produced

  • Recorder v2 PRD + Epics + Architecture + macOS-feasibility-spike + Non-Goals-v1-lock.
  • Phase-4 + Recorder-v2 retrospective.
  • Story files for every commit shipped as quick-stories (SH-1..SH-6, FLAKY-1..2, AI-2, DEPLOY-1, V1.1, EXEC-1, W.9, OPS-1 deferred).

Documentation

  • README.md: feature bullets for Self-Healing, Selector Diagnosis, Flaky Quarantine, AI Patch Suggestions, Heal-Rate KPI.
  • In-app docs (EN/DE/FR/ES): new "Self-Healing & Resilience" section with 7 subsections covering the full opt-in contract, safety envelope, report semantics, KPI, quarantine, AI patches.
  • CLAUDE.md: critical-patterns notes on the singleton-composable redirect-loop, SSE connect-state, backend-launched-Playwright remote unfriendliness, abstract-runner signature parity, SH-2 opt-in invariants for any future "auto-fix test code" feature.

Test plan

  • Backend pytest sweep across recording, execution, environments, stats, ai modules — green
  • Frontend vue-tsc --noEmit — 0 new errors vs HEAD (31 pre-existing)
  • Real Playwright integration tests against drifted fixture HTML — Chromium launches, heal walker selects right element
  • Docker image build + Browser-library .robot test reaches PASS (manual smoke)
  • Reviewer: spot-check generated Dockerfile shape on an unusual repo (multiple browser packages, custom base_image override)
  • Reviewer: confirm .rbs.json sidecar is in .gitignore for typical user repos (or noted in docs)
  • Reviewer: walk through self-healing docs in EN + DE for usability
  • Reviewer: confirm Phase-4 RBAC paths still work end-to-end (smoke a login → admin/teams round-trip)

Out of scope (tracked in _bmad-output/implementation-artifacts/deferred-work.md)

  • D-5 pywinauto native hook for desktop Windows — needs Windows host or Win-CI runner.
  • macOS desktop recorder — DM.1 NO-GO per feasibility spike, four reconsider-triggers documented.
  • SH-3.1 Recorder-side fingerprint emission so new recordings actually populate the schema field SH-3 reads.
  • FLAKY-3 Docker runner listener-mount so quarantine skip works on docker-backed runs (subprocess works today).

🤖 Generated with Claude Code

raffelino and others added 30 commits April 22, 2026 18:28
EmergencyBypassView.vue + /admin/emergency-bypass route — ADMIN-only
page for the Story 5-1 backend. Shows current active/inactive state,
remaining time countdown, duration dropdown for activate, and a
deactivate button. Re-polls /status every 30s so a separate auto-
expire (from Story 5-1's hourly cleanup) reflects without a refresh.

emergencyBypass.api.ts wraps the three endpoints (get/post/delete).
i18n keys added to en/de/fr/es under `bypass.*`.

The persistent app-wide header banner ("Emergency bypass active —
expires in {time}") from the spec is a follow-up: needs a polling
composable at App-shell level and a slot in AppHeader. Left as
in-progress in sprint-status with a note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
.github/workflows/phase4-gates.yml runs on every PR against main and
must pass before merge. Five enforceable gates matching the story ACs:

  1. prod-frontend-build — npm run build passes (vue-i18n escape guard
     for reserved `@ | { }` characters in locale files).
  2. offline-boot-invariant — pytest tests/test_boot_invariant.py
     proves the backend starts with zero outbound calls.
  3. mock-oidc-integration — the full Phase-4 auth/teams/audit suite
     runs against the respx-based mock OIDC fixture.
  4. axe-playwright — the Story 4-8 phase4-accessibility.spec.ts runs
     under Chromium; critical/serious WCAG 2.1 A/AA violations fail
     the build.
  5. full-regression — entire backend + frontend-unit suites (>1200
     tests) must stay green.

Windows offline-ZIP gate from the AC is already covered by the
existing build.yml `build-offline` matrix (windows platform) and is
not duplicated here to avoid runtime doubling.

Epic 4 stays in-progress because Story 4-7 (link-consent dialog) is
a deliberate follow-up. Epic 5 stays in-progress for 5-2 header
banner + 5-4 admin UI — both small UI follow-ups on top of shipped
backends.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the Story 5-2 follow-up. AppHeader now renders an amber
status-role banner ("Emergency bypass active — expires in {time}")
whenever the SSO emergency bypass is toggled on. Admins see a
"Manage" link straight into /admin/emergency-bypass.

useBypassStatus.ts is a singleton polling composable: one GET to
/settings/sso-emergency-bypass per minute, shared across any component
that calls it. Auto-starts on first subscriber, tears down when all
subscribers release. Silently handles 401/403 so the banner stays
hidden for non-admin users without error spam.

Epic 5 only has one in-progress item left (5-4 admin UI).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the admin UI portion of Story 5-4 on top of the already-shipped
POST /webhooks/tokens/{id}/reassign backend endpoint.

SettingsView.vue tokens tab gets a per-row "Reassign" button opening a
BaseModal that takes the new owner's user id and calls reassignToken().
On success the token row is replaced with the server-returned shape
(role may have been capped downward per the Story 3-15 invariant) and
a toast confirms.

Backend 400 ("inactive new owner") + 404 ("unknown user / token") are
surfaced via `response.data.detail` — the dialog stays open so the
admin can correct and retry.

i18n: `settings.tokens.reassign*` in all four locales.

Epic 5 now flips to done. 5-4 was the last Epic-5 in-progress item.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y 4-7)

Closes the last Epic 4 story. When an OIDC callback detects an email
that matches an existing local-account user (hashed_password != ''),
we NO LONGER auto-link — the user is redirected to a consent page:

  OIDC callback
    └─ _upsert_user finds local user with password → raises
       SsoCallbackError("user.link_consent_required", {user_id, ...})
    └─ handle_sso_callback catches, mints a 5-min JWT consent token
       containing (user_id, idp_id, sub, email, groups, return_to),
       re-raises with that token
    └─ sso_router catches "user.link_consent_required" as a NON-failure
       path: no audit event, no rate-limit penalty, 302 to
       /sso-link-consent?token=<signed-jwt>

Frontend (SsoLinkConsentView.vue) decodes the JWT payload client-side
for display only ("An account for {email} already exists — link?"),
then POSTs to /api/v1/auth/sso/link-consent with {consent_token,
approve: bool}:
  - approve=true: backend verifies signature + exp + email match,
    detaches the local password (hashed_password=''), runs team sync
    from the token's groups, emits `user.account_linked`, returns
    access + refresh tokens. Frontend persists + navigates to return_to.
  - approve=false: emits `user.account_link_cancelled`, returns
    status=cancelled. Frontend routes to /login with a toast.

Security properties:
  - Token is a HS256 JWT signed with SECRET_KEY; invalid/expired/tampered
    tokens → 400.
  - Email in token must still match the user's current email (defense in
    depth — covers admin-side email change between mint and redeem).
  - No rate-limit penalty on the consent-required redirect — a legitimate
    SSO-first user must not be penalised for having a local account.
  - Reuse of `hashed_password=''` as the "SSO-linked" sentinel avoids a
    schema migration; existing SSO-only users (callback path post-Story
    1-1) already satisfy this invariant.

7 backend tests cover the interception, approve, cancel, invalid /
expired / email-mismatch token, plus a regression that SSO-only users
bypass consent.

Epic 4 now flips to done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rity hardening)

Closes the XFF-awareness item from deferred-work.md. Before this commit,
`request.client.host` was used verbatim everywhere, which meant that
behind nginx / ALB every SSO user shared the proxy's IP, so one attacker
hitting the Story 2-8 failure threshold locked out the entire tenant.

New `get_client_ip(request)` helper lives in `src/auth/client_ip.py`:

  - Honors a configurable `ROBOSCOPE_TRUSTED_PROXIES` env var (comma-
    separated CIDRs; default empty). No trusted proxy configured →
    fall back to `request.client.host` (pre-Phase-4 behavior, no
    regression on direct deployments).
  - When peer is in the trusted set, walks `X-Forwarded-For`
    right-to-left past trusted-proxy entries; the first non-trusted
    entry (or the leftmost, if all trusted) is the real client.
  - Defense in depth against a hostile direct client sending its own
    XFF header — we ignore XFF unless the immediate peer is trusted.
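
The resolution rules in the list above can be sketched as a pure function over the peer address and the header value. This is an illustration under assumptions: `resolve_client_ip` and `parse_trusted_proxies` are hypothetical names that take plain strings instead of a request object, and the real `get_client_ip(request)` may differ in detail.

```python
from __future__ import annotations

import ipaddress


def parse_trusted_proxies(raw: str) -> list:
    """Parse a comma-separated CIDR list (the env var's format) into networks."""
    return [
        ipaddress.ip_network(part.strip(), strict=False)
        for part in raw.split(",")
        if part.strip()
    ]


def _is_trusted(ip: str, trusted: list) -> bool:
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Unparsable entries are never treated as trusted proxies.
        return False
    return any(addr in net for net in trusted)


def resolve_client_ip(peer_ip: str, xff_header: str | None, trusted: list) -> str:
    """Pick the client address from the socket peer and X-Forwarded-For."""
    # No trust configured, or the peer is not a trusted proxy: ignore XFF entirely.
    if not trusted or not _is_trusted(peer_ip, trusted):
        return peer_ip
    hops = [hop.strip() for hop in (xff_header or "").split(",") if hop.strip()]
    if not hops:
        return peer_ip
    # Walk right-to-left past trusted proxies; the first other entry is the client.
    for hop in reversed(hops):
        if not _is_trusted(hop, trusted):
            return hop
    # Every hop was a trusted proxy: fall back to the leftmost entry.
    return hops[0]
```

For example, with `10.0.0.0/8` trusted, a peer of `10.0.0.5` and a header of `203.0.113.9, 10.0.0.7` resolve to `203.0.113.9`, while the same header arriving from an untrusted peer is ignored.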

Wired into sso_router's three callsites (login initiate, callback,
link-consent). Remaining call sites (audit middleware, teams, repos,
settings, webhooks) are lower blast-radius; they can migrate
incrementally.

61/61 SSO + rate-limit tests pass; 4 new tests cover the default-peer,
XFF-ignored-when-no-trust, leftmost-trusted, and no-XFF-when-peer-trusted
paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Input document for the upcoming Recorder Enhancement planning cycle.
Captures the user's long-term vision verbatim on 2026-04-22:

  - Multi-module Recorder (Web + Desktop; Mobile possibly later).
  - Web: controlled-browser capture of nav / click / type / scroll /
    drag-drop; hover overlay like Chrome DevTools inspect; right-click
    context menu grouped by Browser-library keyword family; result
    opens in standard Visual-Flow + Text editor; multiple selector
    candidates per command swappable via inline picker; selector
    strategies include test-id, ARIA, stable text, CSS, XPath,
    Playwright locator.
  - Desktop: Windows first (macOS tentative); architectural open
    questions deferred to PRD.
  - Cross-cutting: shared SelectorCandidate + RecordedCommand model;
    persistence via existing repo FileExplorer save path; Chrome-
    extension stays the existing transport for that flow (Story R-1).
  - Explicit non-goals logged for the PRD round.

Next step: bmad-agent-pm (John) → bmad-create-prd using this file as
the raw input, then bmad-agent-architect → bmad-create-architecture,
then bmad-create-epics-and-stories to break down into sprint-ready
stories (likely three epics: Web Recorder MVP, Desktop Recorder,
Shared Selector datamodel + UI).

Captured now so nothing is lost if the planning session happens later.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Builds on the raw vision captured in recorder-vision-2026-04-22.md.
Structured as three epics so they can ship independently:

  - WEB (MVP): controlled-browser session via Playwright+CDP; captures
    nav/click/type/scroll/drag-drop; DevTools-style hover overlay;
    right-click context menu grouped by Browser-library keyword family;
    result opens in the standard Visual-Flow + Text editor.
  - SEL (MVP companion): SelectorCandidate datamodel with test-id /
    ARIA / text / CSS / XPath-variants / Playwright-locator strategies,
    uniqueness-checked at capture, swappable in the editor inline.
  - DESKTOP (vision): Windows first via UI Automation; macOS tentative.
    Reuses the SEL datamodel so desktop commands share the editor UX.

PRD covers: classification, success criteria with measurable targets,
two user journeys (record flow + selector-heal), domain model sketch,
14 functional + 6 non-functional requirements, explicit non-goals
(no cross-browser, no mobile, no cloud replay, no AI healing).

Chrome extension stays as a parallel transport — v2 is additive, no
deprecation of the existing recorder.

Next step: bmad-create-epics-and-stories decomposes this into sprint-
ready stories across the three epics. Architecture pick (Playwright+CDP
details, Windows UI Automation library choice) happens in the
bmad-create-architecture pass between PRD and epic decomposition.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pics)

Decomposes the Recorder v2 PRD into sprint-ready stories. Four epics,
rolled out in dependency order:

  W — Recorder Web MVP (8 stories, ships first)
      session lifecycle, SSE command stream, primitive capture,
      hover overlay, keyword-family context menu, result-in-editor,
      launcher + route, audit + retention.

  S — Shared Selector datamodel + editor UI (5 stories, parallel to W)
      selector + command types, 6-strategy synthesis library,
      uniqueness verification, inline picker component, 4-locale i18n.

  D — Desktop Recorder Windows (4 stories, ships after W + S)
      UI Automation adapter, primitive capture, selector strategies,
      Robot library mapping.

  DM — Desktop Recorder macOS (2 stories, tentative)
       AX feasibility spike + session adapter.

Architecture-dependent decisions flagged [needs-arch] (browser pool
vs. per-session, RPA.Windows vs. pywinauto, keyword-family exact
list) for the bmad-create-architecture pass before Sprint N kicks
off. Six open questions captured for the architect.

6-sprint rollout plan: each sprint ends with a tagged release-gate
CI run matching the Phase 4 `phase4-gates.yml` pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ten numbered architectural decisions (AR-1 through AR-10) pre-empt
the questions the epic breakdown flagged. Summary:

  - AR-1 Browser: per-session, not pooled. State-leak risk outweighs
    pool efficiency for a minutes-long session.
  - AR-2 Hosting: in-process with backend. No sidecar. Reuses the
    Story R-1 dedicated-event-loop-thread pattern.
  - AR-3 Command stream: SSE, not WebSocket. Unidirectional server→
    browser; native EventSource on frontend; single subscriber.
  - AR-4 Hover overlay survives SPA navigation via page.addInitScript
    + History-API proxy inside the script.
  - AR-5 Right-click context menu is an in-page overlay DOM, not
    a browser-chrome extension — zero-install UX.
  - AR-6 15-keyword catalog frozen for MVP from the Browser library's
    reference, grouped Assert/Read · Wait · Interact · State.
  - AR-7 Selector scoring rubric (quality_score 0-100) with explicit
    penalties for auto-generated attr values, long text, nth-child,
    generic XPath anchors. Sort by (verified_unique, score).
  - AR-8 Windows desktop: pywinauto for raw capture (has event hooks),
    RPA.Windows for the emitted Robot keywords. Mapping table included.
  - AR-9 Chrome extension stays as a parallel transport. No deprecation.
    Transport enum gets 3 new values.
  - AR-10 Rate limit: 1 active recording session per user. Second
    POST aborts the first.

Minimum API contract captured as normative for the 19 stories. Test
surface estimate: 83 new tests; phase4-gates.yml gets a new
record-a-fixture-app-and-run-the-.robot gate.

One open item remains: Playwright-Chromium shipping (fold into main
image vs. split roboscope-recorder variant) — pending sprint planning.

Ready for sprint N: W.1 + S.1 + S.2 unblocked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Kicks off Epic S with the foundational types every other Recorder v2
story depends on. Round-trip-stable Pydantic ↔ TypeScript shape:

  - SelectorCandidate { strategy, value, quality_score (0..100),
    verified_unique } — frozen on the Python side; strategy enum
    includes the 6 web variants (testid, aria, text, css, xpath,
    pw_locator) AND the 3 Windows-UIA desktop variants so the Desktop
    epic plugs into the same picker UI without a type split.
  - RecordedCommand { index, keyword, args, selector_candidates,
    active_candidate_index } with a runtime guard that
    active_candidate_index stays inside the candidates list. Exposes
    `active_selector` / `activeSelector()` helpers.
  - RecordedFlow { schema_version, transport, session_id, name,
    commands } — schema_version=1 frozen for MVP, serialises to the
    .robot-on-disk source of truth (no DB row).

`validate_schema_version()` / `validateSchemaVersion()` guard the
version boundary at the first point of contact with untrusted JSON;
fails loud on missing / non-int / too-new / < 1. Architecture doc
AR-0 locks this in as the v2 contract.
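
As a rough picture of the shapes described above, here is a stdlib-dataclass sketch. The commit uses Pydantic, so this is a simplified stand-in: field names follow the commit text, while defaults, error messages and the `SUPPORTED_SCHEMA_VERSION` constant are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field

SUPPORTED_SCHEMA_VERSION = 1  # assumed constant name; the commit freezes version 1


@dataclass(frozen=True)
class SelectorCandidate:
    strategy: str
    value: str
    quality_score: int
    verified_unique: bool = False

    def __post_init__(self) -> None:
        if not 0 <= self.quality_score <= 100:
            raise ValueError("quality_score must be within 0..100")


@dataclass
class RecordedCommand:
    index: int
    keyword: str
    args: list[str] = field(default_factory=list)
    selector_candidates: list[SelectorCandidate] = field(default_factory=list)
    active_candidate_index: int = 0

    def __post_init__(self) -> None:
        # Runtime guard: the active index must point inside the candidates list.
        if self.selector_candidates and not (
            0 <= self.active_candidate_index < len(self.selector_candidates)
        ):
            raise ValueError("active_candidate_index out of range")

    @property
    def active_selector(self) -> SelectorCandidate | None:
        if not self.selector_candidates:
            return None
        return self.selector_candidates[self.active_candidate_index]


def validate_schema_version(payload: dict) -> int:
    """Fail loudly on a missing, non-int, too-new or < 1 schema_version."""
    version = payload.get("schema_version")
    if isinstance(version, bool) or not isinstance(version, int):
        raise ValueError("schema_version missing or not an integer")
    if version < 1:
        raise ValueError("schema_version must be >= 1")
    if version > SUPPORTED_SCHEMA_VERSION:
        raise ValueError("schema_version is newer than this build supports")
    return version
```

Checking the version before building any model objects keeps the failure at the boundary where untrusted JSON first enters, which is the intent stated above.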

16 backend tests + 9 frontend tests lock in the roundtrip, bounds,
and guard behaviour.

Next stories (W.1, S.2, S.3) can now import these types and build on
them in parallel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y S.2)

Implements the 6 web strategies (testid, aria, text, css, xpath,
pw_locator) plus the quality-score rubric locked in architecture
doc AR-7. Each strategy is a pure function over an ElementSnapshot
dataclass:

  testid (95):
    - reads the 4 configured attrs (data-test-id, data-testid, data-qa,
      data-test); emits one candidate per present attr
    - penalty -25 if the value looks autogenerated (20+ chars,
      mixed case+digits → likely hash)

  aria (80):
    - requires role AND name; penalty -15 for time-like / counter names

  text (70):
    - candidate only when text is non-empty; -20 above 40 chars;
      -30 for pure numeric

  css (50 structural, 60 for stable id):
    - #id if the id looks stable
    - tag + stable-looking classes; -5 per extra class

  xpath:
    - text-anchored 65 (inherits text penalties)
    - relative-anchored 55 (nearest ancestor with stable id or testid;
      -10 if the anchor is a generic div/span)
    - absolute 25 — always fragile, last-resort

  pw_locator (75):
    - getByRole(role, {name}); -10 for generic roles
    - getByText fallback 70 for role-less elements

`synthesise_selectors()` returns candidates sorted by
(verified_unique DESC, quality_score DESC) — matches the editor's
picker order. Quality floor 20 drops hopeless candidates before they
mislead the user.

Uniqueness verification (S.3) will come later — this module is pure
synthesis, no DOM.

20 tests lock the rubric in as the source of truth. The editor's
colour-dot legend in S.4 will reference these numbers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SelectorPicker.vue renders the active RecordedCommand candidate inline
with the Story S.2 quality-score colour dot (green ≥80, amber 50–79,
red <50 — matches the AR-7 rubric). Clicking the chevron opens a
sorted menu of every candidate; selecting one emits
`update:activeIndex` so the parent view can persist the swap.

Key behaviours:
  - Solo-candidate commands hide the chevron — no choice to make.
  - Zero-candidate commands render nothing (no picker for `Go To <url>`).
  - Menu closes on outside click via a robust document-level listener
    (safe against the event.target === document case).
  - verified_unique candidates show a ✓ indicator so the user can
    tell which locators were DOM-verified at capture time.
  - aria-expanded + aria-haspopup + role listbox/option for screen
    readers.
  - Compact variant drops the strategy label for gutter-annotation
    mode in the text editor.

Story S.5 i18n shipped inline: `recorder.selector.strategy.*` for all
9 strategies (6 web + 3 Windows-UIA desktop) in en/de/fr/es, plus a
`swapAriaLabel` for the chevron. Keeps Recorder v2 on the NFR-R4
"i18n-complete" invariant from day 1.

10 Vitest specs cover active-value rendering, quality-band dot colour
for all three bands, menu toggle + outside-close, emit-on-pick,
zero-and-one-candidate edge cases, verified-unique indicator,
strategy-label i18n wiring. Full frontend suite: 210/210.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
….8 partial)

Four new AuditEventType members for the Recorder v2 lifecycle —
recording.session.started / .completed / .aborted / .flow_saved.
Emission sites land with W.1 (session start), W.6 (finalize + save)
and the existing router /stop endpoints.

abort_idle_recording_sessions() piggy-backs on the Story 5-5 hourly
retention cleanup: any RecordingSession still `recording` with
started_at older than 30 min is flipped to `cancelled` with
finished_at=now, and audits `recording.session.aborted` with
reason=idle_timeout. Matches the PRD NFR-R3 invariant that one
Chromium process per user tears down on stop OR 30-min idle.
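
The idle-abort rule can be sketched against an in-memory list instead of database rows. The `Session` dataclass, the `audit` callback signature and the function name `abort_idle_sessions` are hypothetical; only the status values, the 30-minute threshold and the audit event name and reason come from the text above.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable

IDLE_LIMIT = timedelta(minutes=30)


@dataclass
class Session:
    id: str
    status: str
    started_at: datetime
    finished_at: datetime | None = None


def abort_idle_sessions(
    sessions: list[Session],
    now: datetime,
    audit: Callable[[str, dict], None],
) -> int:
    """Cancel every still-recording session older than IDLE_LIMIT; return the count."""
    aborted = 0
    for session in sessions:
        if session.status == "recording" and now - session.started_at > IDLE_LIMIT:
            session.status = "cancelled"
            session.finished_at = now
            audit(
                "recording.session.aborted",
                {"session_id": session.id, "reason": "idle_timeout"},
            )
            aborted += 1
    return aborted
```

Taking `now` as a parameter rather than reading the clock inside the function is a testing convenience in this sketch; it lets the over-threshold and fresh cases be pinned without sleeping.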

Five tests lock in: aborts over-threshold row + audits it, leaves
fresh sessions alone, ignores non-`recording` statuses, empty case
returns 0, multiple idle rows all aborted in one pass while fresh
rows are preserved.

Story W.8's full scope (emission at start + completed + flow_saved)
lands when W.1/W.6 wire the session endpoints through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Scaffolds the Recorder v2 session endpoint family the PRD calls for.
This commit is intentionally a stub — it creates the RecordingSession
row, audits, enforces the per-user AR-10 cap, and provides owner/
admin DELETE, but does NOT yet launch Chromium. The browser-driver
wiring lands in a follow-up so this stub can be committed + reviewed
in isolation.

Routes:
  POST   /api/v1/recordings/sessions
         body: {transport, repo_id}
         returns: {session_id, transport, status}

  DELETE /api/v1/recordings/sessions/{id}
         owner or admin only; idempotent on terminal sessions

Security:
  - EDITOR+ effective role required on target repo (Story 3-7 pattern).
    Inline check instead of require_effective_role dependency because
    repo_id lives in the body, not the path. API-token auth is capped
    at scoped role (Story 3-15 invariant preserved).
  - AR-10 per-user cap: creating a new session aborts any prior
    active session for the same user; the superseded row audits
    `recording.session.aborted` with reason=superseded.

Audit:
  - RECORDING_SESSION_STARTED on create.
  - RECORDING_SESSION_ABORTED on DELETE (reason=user_abort) or
    superseded-by-new-session.
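
The supersede-on-create behaviour and its audit trail can be pictured with a small in-memory sketch. The `active` dict, the `audit` callback and `start_session` are hypothetical stand-ins for the database-backed endpoint; the event names and the `superseded` reason follow the text above.

```python
from __future__ import annotations

from typing import Callable


def start_session(
    active: dict[str, str],
    user_id: str,
    new_session_id: str,
    audit: Callable[[str, dict], None],
) -> None:
    """Enforce one active session per user: a new start aborts the previous one."""
    previous = active.get(user_id)
    if previous is not None:
        audit(
            "recording.session.aborted",
            {"session_id": previous, "reason": "superseded"},
        )
    active[user_id] = new_session_id
    audit("recording.session.started", {"session_id": new_session_id})
```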

8 tests cover: editor success + audit, viewer 403, team-editor elevation,
supersede-on-second-create, owner abort, non-owner 403, admin-abort-any,
404 on unknown.

Follow-up stories:
  - W.1 full: Playwright browser launch + per-session threading.
  - W.2: SSE command stream on /sessions/{id}/commands.
  - W.6: /finalize + /save endpoints.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New `/recordings/new` route (EDITOR+ only, enforced by router meta +
backend inline effective-role check) renders RecordingLauncherView.vue
with:

  - Transport picker: Web (enabled), Desktop-Windows + Desktop-macOS
    are visible but disabled with a "coming soon" label matching the
    PRD's phased rollout.
  - Repo selector populated from the existing useReposStore.
  - Big red [Start recording] CTA that POSTs /api/v1/recordings/sessions
    (the Story W.1 stub) and routes to /recordings/live/{id} on success
    — the live/capture/finalize view ships with W.2 + W.6.

Error handling surfaces the backend's `detail` message so a 403 from
the effective-role check reads cleanly in the UI. 4-locale i18n lands
with the view (NFR-R4 invariant).

210/210 frontend tests remain green.

Remaining Epic W: W.2 (SSE stream), W.3 (capture primitives),
W.4 (hover overlay), W.5 (context menu), W.6 (result view). W.1 still
needs its Chromium-launch backfill.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds four new epics + 19 stories from the Recorder v2 planning artifacts
into the sprint tracker so progress is visible alongside Phase 4:

  epic-recorder-v2-web — 8 stories
    W.1 (in-progress: stub shipped, Chromium launch pending)
    W.2..W.6 (backlog, blocked on W.1 full)
    W.7 (done — launcher view committed)
    W.8 (in-progress: retention + audit types shipped)

  epic-recorder-v2-selectors — 5 stories
    S.1, S.2, S.4, S.5 (done)
    S.3 (backlog, needs live browser from W.1)

  epic-recorder-v2-desktop-windows — 4 stories (backlog)
  epic-recorder-v2-desktop-macos — 2 stories (backlog, tentative)

Honest status: 5/19 stories done, 2/19 in-progress, 12/19 backlog.
Next unblocker: W.1 full Chromium integration (Playwright + thread
pattern from Story R-1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the "save a recorded flow to the repo as a .robot file" half of
Story W.6. The live-view half (Visual-Flow editor integration) lands
when W.2 (SSE stream) + a result route mount.

`robot_emit.emit_robot(flow) -> str`:
  Pure function, no I/O. Renders the AR-6 keyword families into valid
  Robot Framework syntax. Selector emission is strategy-aware — xpath
  values get the `xpath=` prefix, testid/css/aria/pw_locator keep the
  verbatim value that Browser library already accepts.

  Targeted keywords (Click, Type Text, Wait For Elements State, ...)
  use the active selector as first arg; global keywords (Go To, Wait
  Until Network Is Idle) don't need one. A targeted keyword without a
  selector emits a visible `# WARNING: no selector captured` so the
  upstream capture bug is immediately obvious rather than producing
  silently-broken .robot syntax.
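
A single-step version of the emit rules might look like the sketch below. It is not `emit_robot` itself: `emit_step`, `format_selector`, the three-keyword `TARGETED` set and the choice to emit the warning comment in place of the step are assumptions, and the handling of the `text` strategy is not stated above, so it is passed through verbatim here.

```python
from __future__ import annotations

# Subset of targeted keywords named in the commit message; the full set is not listed.
TARGETED = {"Click", "Type Text", "Wait For Elements State"}
CELL_SEP = "    "  # Robot Framework separates cells with two or more spaces


def format_selector(strategy: str, value: str) -> str:
    """xpath values get an explicit prefix; other strategies are emitted verbatim."""
    return f"xpath={value}" if strategy == "xpath" else value


def emit_step(keyword: str, args: list[str], selector: tuple[str, str] | None) -> str:
    """Render one indented test-body line for a recorded command."""
    cells = [keyword]
    if keyword in TARGETED:
        if selector is None:
            # Make the upstream capture bug visible instead of emitting a broken step.
            return f"{CELL_SEP}# WARNING: no selector captured"
        cells.append(format_selector(*selector))
    cells.extend(args)
    return CELL_SEP + CELL_SEP.join(cells)
```

For instance, `emit_step("Type Text", ["hello"], ("css", "#user"))` yields a line with the selector as the first argument and the typed text after it.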

`POST /api/v1/recordings/save`:
  Body {flow, repo_id, path}. EDITOR+ effective-role required (inline
  check — repo_id is in body). Validates the flow's schema_version
  first; rejects traversal (`..`), absolute paths, and paths resolving
  outside `repo.local_path`. Auto-suffixes `.robot` if missing.
  Writes the file + parent dirs, emits `recording.flow.saved` audit.
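
The path checks described for the save endpoint can be sketched with `pathlib`. `resolve_save_path` is a hypothetical helper that only computes the target and performs no I/O; the suffix behaviour for names that already carry a different extension is an assumption.

```python
from __future__ import annotations

from pathlib import Path, PurePosixPath


def resolve_save_path(repo_root: Path, requested: str) -> Path:
    """Validate a user-supplied relative path and return the absolute target."""
    if not requested or not requested.strip():
        raise ValueError("empty path")
    rel = PurePosixPath(requested.strip().replace("\\", "/"))
    if rel.is_absolute() or Path(requested).is_absolute():
        raise ValueError("absolute paths are not allowed")
    if ".." in rel.parts:
        raise ValueError("path traversal is not allowed")
    if rel.suffix != ".robot":
        # Auto-suffix: "flows/login" becomes "flows/login.robot".
        rel = rel.with_name(rel.name + ".robot")
    root = repo_root.resolve()
    target = root.joinpath(*rel.parts).resolve()
    # Final containment check also catches escapes through symlinks.
    if not target.is_relative_to(root):
        raise ValueError("path resolves outside the repository")
    return target
```

The explicit `..` rejection and the resolved-path containment check overlap on purpose: the first gives a clear error for the obvious case, the second covers anything that only shows up after resolution. `Path.is_relative_to` requires Python 3.9 or newer.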

Tests:
  - 12 unit tests pin the emit syntax for settings/test block, empty
    flow, name fallback, all targeted keyword variants, global keyword,
    arg ordering, active-selector honouring, a full login flow.
  - 9 endpoint tests cover happy path, auto-`.robot` suffix, viewer 403,
    absolute-path/traversal/empty-path rejection, unknown + missing
    schema_version, unknown repo 404.

Recorder v2 progress: S-Epic now fully unblocked end-to-end on the
save path. Remaining W stories (W.1 full, W.2, W.3, W.4, W.5) still
need Chromium integration before they can demo a real flow, but the
emitter + save endpoint can already round-trip hand-crafted flows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2_command_queue module: thread-safe per-session queue.SimpleQueue
registry. Capture producers (future W.1 full Chromium launch, W.3 JS
capture) call enqueue_command(session_id, RecordedCommand); the SSE
endpoint's generator drains via iterate_commands() until the end-
sentinel is seen.

`GET /api/v1/recordings/sessions/{id}/commands`:
  - text/event-stream response; each event is `event: command / data:
    <RecordedCommand JSON>`.
  - Session-owner or admin only (403 otherwise). Single-subscriber per
    AR-3.
  - On clean finalize: emits `event: end / data: {}` before closing so
    the frontend EventSource handler can distinguish done from blip.
  - Sets `X-Accel-Buffering: no` so nginx doesn't hold the stream.
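
For illustration, the wire format described in this list could be produced by a generator like the one below. `format_sse` and `sse_stream` are hypothetical helpers, not the router's actual functions, and commands are assumed to be JSON-serialisable dicts.

```python
import json
from typing import Any, Iterable, Iterator


def format_sse(event: str, data: Any) -> str:
    """Frame one server-sent event; a blank line terminates the event."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def sse_stream(commands: Iterable[dict]) -> Iterator[str]:
    """Emit one `command` event per item, then a final `end` event."""
    for command in commands:
        yield format_sse("command", command)
    # Explicit end marker lets the client tell "done" from a dropped connection.
    yield format_sse("end", {})
```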

Session lifecycle integration:
  - POST /sessions now registers the queue as part of session creation.
  - DELETE /sessions/{id} finalizes + tears down the queue so the
    subscriber wakes up and returns cleanly.

7 tests:
  - enqueue before register fails safely
  - register + enqueue + drain + sentinel ends iteration in under 2 s
  - tear_down removes the registry entry
  - iterate on unknown session yields nothing
  - non-owner GET gets 403
  - full integration: producer thread + SSE handler, verifies two
    commands + end-event land in the response body
  - 404 on unknown session

Completes the send-path half of the Recorder v2 live experience.
Missing: W.3 (the capture script that actually produces the commands),
W.4 (hover overlay), W.5 (context menu). Those all plug into
enqueue_command() once they're implemented.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ships the JS text that Playwright's `page.add_init_script()` injects
on every new document. Covers the AC-FR2 primitives:

  - click + dblclick
  - text input (change event on input/textarea/contenteditable)
  - Enter keypress (explicit submit signal)
  - scroll (debounced 200 ms, page vs container aware)
  - drag-and-drop (paired dragstart/drop → single drag_drop event)
  - navigation (load + history.pushState + replaceState + popstate)

The script is AR-4 compliant — it self-installs on every document and
wraps history API calls inside the same IIFE so SPA navigations are
detected too. All event handlers are capture-phase listeners so the
page's own handlers do not shadow us.

`send()` is wrapped in try/catch so any unexpected error never leaks
back into the page — an injected recorder must never crash the SUT.

Element snapshots are structured exactly like Story S.2's
ElementSnapshot dataclass so `synthesise_selectors()` can consume them
directly once the binding is wired. Ancestor chain is capped at 8 deep
to bound payload size.

EMITTED_KINDS = ('click', 'dblclick', 'type', 'press', 'scroll',
                  'drag_drop', 'navigate') — the regression-guard tuple
the tests pin against.

Remaining wiring: Playwright's exposeBinding call hands the session
id + enqueue helper to the browser context. Lands with W.1 full.

13 tests lock the surface: IIFE wrap + idempotency guard + swallowed
send errors + every AC-FR2 primitive has its listener + helper fn
stability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Injected alongside the capture script (W.3) on every document. Shows
a semi-transparent 2-px Brand-blue box around the hovered element
plus a fixed-position label with tag + leading class summary +
pixel dimensions.

AC-FR3 compliance points:
  - pointer-events: none — never swallows clicks.
  - aria-hidden="true" on both the box and the label — screen readers
    never see it.
  - prefers-reduced-motion: reduce disables the 30 ms positioning
    transition.
  - Ctrl+Shift+X toggle hides/shows the overlay without re-recording.
  - mouseout + window blur hide; scroll repositions to keep the
    overlay aligned with the target as the page moves.
  - IIFE wrapped with __roboscopeOverlayInstalled guard so a second
    injection is a no-op.
  - z-index at (max int − 1) for the box and max int for the label so
    they ride above every stacking context the app might introduce.

Browser-side behaviour tests belong in e2e/phase4-recorder.spec.ts
once Playwright injection is wired (W.1 full). 11 Python tests pin
the static surface — IIFE wrap, idempotency, pointer-events,
aria-hidden, prefers-reduced-motion branch, all four event handlers,
toggle hotkey, label contents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… W.5)

Ships the AR-6 right-click menu IIFE that injects alongside the capture
script. Right-click inside the controlled browser → native menu
suppressed → in-page menu with the frozen MVP catalog:

  Assert / Read (5): Get Element Value, Get Text, Get Attribute,
                     Should Be Equal, Should Contain
  Wait (3):          Wait For Elements State, Wait Until Network Is Idle,
                     Wait For Condition
  Interact (4):      Double Click, Hover, Focus, Press Keys
  State (3):         Scroll To Element, Take Screenshot, Highlight Elements

Items with required args (Get Attribute, Should Be Equal, Wait For
Elements State, Press Keys, ...) prompt inline via window.prompt
before emitting — user cancel drops the emission cleanly.

UI properties:
  - amber (#D4883E) left-accent so the menu is unmistakably RoboScope.
  - role="menu" + aria-label on the root, role="menuitem" on each row.
  - outside-click + Escape close the menu.
  - viewport-aware positioning — menu never clips off-screen.
  - z-index = max int (2147483647) to ride above every stacking
    context the target app might introduce.

Same binding contract as the capture script (W.3):
  window.__roboscopeCapture({kind: "custom_action", keyword, args, element})

17 tests: catalog invariants (4 families, 15 keywords, Title-Case
enforcement, arg-prompt presence), IIFE + idempotency, prevent-default,
outside-click + Escape close, brand amber accent, shared binding name,
accessibility roles, helper fns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… (story W.1 full)

Closes the last blocking story for Epic W. Three new pieces that plug
into the existing scaffold:

1. `v2_payload_translator.translate_payload(payload, index)` — pure
   function mapping the loose JS payloads from the W.3 capture script
   and the W.5 context menu into RecordedCommand. Handles every
   emitted kind: click, dblclick, type, press, scroll (page vs
   container), drag_drop (paired endpoints → Drag And Drop with source
   selector + target value), navigate, custom_action. Unknown kinds
   return None. 17 unit tests pin the mapping.

2. `v2_recorder_task.run_v2_recorder_session(session_id, target_url)` —
   blocking entry point dispatched on a dedicated event-loop thread
   (Story R-1 pattern). Launches Chromium via async_playwright, injects
   all three IIFE scripts via context.add_init_script() — AR-4
   guarantees the scripts survive SPA + full-page navigations.
   Registers the `__roboscopeCapture` binding (AR-5). Binding handler
   runs on the Playwright loop and only touches the thread-safe
   command queue + index counter. browser.on("disconnected") flips
   stop_event — matches Story R-1's event-based teardown. Fresh DB
   session for status flips per CLAUDE.md. Exceptions in the
   Playwright loop flip status to FAILED with the message captured.
   Playwright import is deferred to the entry point — tests can
   import the module without Playwright installed.

3. `POST /api/v1/recordings/sessions/{id}/start-browser` — explicit
   opt-in endpoint so unit tests don't accidentally open Chromium.
   Owner-or-admin only. 202 Accepted with {session_id, task_id}.
   Env-var kill switch `ROBOSCOPE_RECORDER_DISABLED=1` short-circuits
   with task_id=null (lets the Windows offline ZIP ship without
   Chromium). Rejects sessions not in the RECORDING status.
   DELETE /sessions/{id} now also signals the v2 task to tear down.

Sprint-status effect: W.1 / W.3 / W.4 / W.5 / W.8 all flip to done.
Only W.6 (frontend live editor mount) remains in-progress in Epic W.
Epic S still has S.3 (uniqueness verification — needs the live browser
that W.1 now provides, so it's unblocked for the next sprint).

23 new tests lock the translator + endpoint behaviour. Integration
coverage of the real Playwright session belongs in a Playwright-based
e2e spec (the phase4-gates.yml `record-a-fixture-app-and-run` gate
the architecture doc calls for).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes Epic S. `verify_candidates(candidates, locator_factory)` takes
the Story S.2 synthesis output and populates `verified_unique` against
the live DOM:

  - 0 matches  → drop (stale selector, DOM mutated mid-capture)
  - 1 match    → verified_unique=True
  - >1 matches → disambiguate:
      * css        → append :nth-match(1)
      * xpath      → wrap in (…)[1]
      * pw_locator → append .first
      * text/aria/testid → keep, verified_unique=False (picker flags amber)
    All disambiguated selectors take a -15 quality penalty.

Broken locator_factory call (invalid xpath, CSS parse error, etc.)
swallowed → candidate dropped so the UI never shows a poisoned entry.

Final sort: (verified_unique DESC, quality_score DESC) — unchanged
from the S.2 contract; the editor picker orders itself by this.
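
A simplified, synchronous sketch of the triage above (the real function is exercised by async tests). Assumptions are flagged in the comments: the `SelectorCandidate` fields shown here, Playwright's two-argument `:nth-match(selector, n)` form for the CSS rewrite, marking rewritten selectors as verified, and applying the -15 penalty only to rewritten selectors.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class SelectorCandidate:
    # Assumed minimal shape; the real dataclass from Story S.1 may differ.
    strategy: str
    value: str
    quality_score: int
    verified_unique: bool = False


# Rewrites for strategies that can be disambiguated (syntax is an assumption).
_REWRITES = {
    "css": lambda v: f":nth-match({v}, 1)",
    "xpath": lambda v: f"({v})[1]",
    "pw_locator": lambda v: f"{v}.first",
}


def verify_candidates(
    candidates: List[SelectorCandidate],
    count_matches: Callable[[SelectorCandidate], int],
) -> List[SelectorCandidate]:
    kept = []
    for cand in candidates:
        try:
            matches = count_matches(cand)
        except Exception:
            continue  # broken selector: drop rather than show a poisoned entry
        if matches == 0:
            continue  # stale selector
        if matches == 1:
            kept.append(replace(cand, verified_unique=True))
        elif cand.strategy in _REWRITES:
            kept.append(replace(
                cand,
                value=_REWRITES[cand.strategy](cand.value),
                quality_score=cand.quality_score - 15,
                verified_unique=True,  # assumption: unique after the rewrite
            ))
        else:
            kept.append(replace(cand, verified_unique=False))
    # verified_unique DESC, quality_score DESC
    kept.sort(key=lambda c: (not c.verified_unique, -c.quality_score))
    return kept
```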

8 async tests lock the behaviour. Plug-in point for the live browser
lives in v2_recorder_task: inside the binding handler, after
synthesise_selectors() produces the raw candidates, call
verify_candidates(cands, lambda c: page.locator(c.value).count()).

Epic S is now fully done. Epic W has only W.6 (live editor mount)
left as a frontend follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…6 full)

RecordingLiveView.vue at /recordings/live/:sessionId closes Epic W:

  - Auto-dispatches POST /sessions/{id}/start-browser on mount (with the
    env-var kill switch pre-empting a real Chromium in tests / air-
    gapped deploys).
  - Opens EventSource against /sessions/{id}/commands and binds the
    Story W.2 `command` + `end` events. Each RecordedCommand lands in
    a reactive list rendered as a numbered step list.
  - Every step uses the Story S.4 SelectorPicker in compact mode so the
    user can swap candidates while the flow is still being captured.
  - Stop-and-save button aborts the session (triggers Chromium teardown
    + SSE `end`), assembles a RecordedFlow with schema_version=1, then
    POSTs to /recordings/save with the repo id stashed in sessionStorage
    by the launcher. Navigates to the saved file in /explorer on success.
  - Connection state chip (connecting / live / done / error) + error
    banner for SSE drops or save failures.

Launcher (Story W.7) now caches `recorder.repo.{session_id}` in
sessionStorage so the live view's save step does not re-prompt the
user for the target repo.

Four new i18n entries under `recorder.live.*` in en/de/fr/es cover
status chips, waiting hint, path prompt, CTA labels, and four error
strings (stream lost / start failed / save failed / repo reference
missing).

Sprint status: W.6 done; epic-recorder-v2-web flips to done. Remaining
Recorder v2 work is desktop only (Epic D + DM) plus the D.* spikes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Desktop-side counterpart to Story S.2's web synthesis. Takes a
`DesktopElementSnapshot` (control_type + AutomationId + Name +
ClassName + ancestor chain) and returns quality-scored
SelectorCandidates using the AR-8 rubric:

  AutomationId (92): primary desktop anchor
    − 25 if the AutomationId looks auto-generated (20+ chars, mixed
      case + digits — a hash, not a handle)
  Name (75): UIA accessible name
    − 30 if numeric-only (brittle — a counter)
    − 15 if contains time-like digit runs (and NOT numeric-only —
      penalties mutually exclusive)
  ClassName (50): fallback when both above absent
    − 15 for generic WPF classes (TextBox, Button, Window, Panel)
  XPath-over-UIA:
    anchored 55 on nearest ancestor AutomationId
    anchored 50 on nearest ancestor Name
    absolute 22 control-type chain — always fragile, last resort
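
The base scores and penalties in this rubric could be expressed as below. This is a sketch only: the function names are hypothetical, and the "time-like" pattern (`H:MM` / `HH:MM`) is a guess at what the real heuristic matches.

```python
import re

GENERIC_WPF_CLASSES = {"TextBox", "Button", "Window", "Panel"}
_TIME_LIKE = re.compile(r"\d{1,2}:\d{2}")  # assumed pattern, not from the source


def looks_auto_generated(automation_id: str) -> bool:
    """20+ chars with lower case, upper case and digits: a hash, not a handle."""
    return (
        len(automation_id) >= 20
        and any(c.islower() for c in automation_id)
        and any(c.isupper() for c in automation_id)
        and any(c.isdigit() for c in automation_id)
    )


def score_automation_id(automation_id: str) -> int:
    return 92 - (25 if looks_auto_generated(automation_id) else 0)


def score_name(name: str) -> int:
    # Penalties are mutually exclusive: numeric-only wins over time-like.
    if name.strip().isdigit():
        return 75 - 30
    if _TIME_LIKE.search(name):
        return 75 - 15
    return 75


def score_class_name(class_name: str) -> int:
    return 50 - (15 if class_name in GENERIC_WPF_CLASSES else 0)
```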

Candidate shape is the same `SelectorCandidate` from Story S.1 — the
editor's inline picker (Story S.4) renders these alongside web
candidates without changes, using the Story S.5 i18n labels already
shipped for `automation_id`, `uia_name`, `uia_class_name`.

13 tests cover each strategy's scoring, mutual-exclusion of numeric
vs time-like penalties, sort order, and the quality-floor-only
absolute-xpath fallback.

Desktop Epic D flips to in-progress — D.4 (Robot keyword mapping)
is the next pure-Python story. D.1 (session adapter) + D.2 (capture)
need a Windows host and pywinauto to be exercised meaningfully.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`emit_robot(flow)` now dispatches on `flow.transport`:

  web_playwright / chrome_extension → Library: Browser (unchanged —
    12 pre-existing tests still green)
  desktop_windows / desktop_macos   → Library: RPA.Windows with
    the library's own locator syntax:
      id:<AutomationId>
      name:<UIA Name>
      class:<UIA ClassName>
      xpath:<XPath over UIA tree>

Desktop targeted-keyword set (Click, Double Click, Type Text, Select
From Combobox, Select From Menu, Control Window, Take Screenshot)
consumes the active selector; everything else passes through as a
global keyword. Ordered-arg rendering keeps `text` before `value`
before `key` so emitted lines read the same as a hand-written
RPA.Windows script.
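
As a rough illustration of the prefix mapping and the ordered-arg rendering, assuming the strategy keys `automation_id`, `uia_name`, `uia_class_name` and `xpath`, and four-space cell separators. `to_rpa_windows_locator` and `render_step` are hypothetical names, not the emitter's real functions.

```python
from typing import Dict, Optional

# Selector strategy -> locator prefix (strategy key names are assumed).
_PREFIX = {
    "automation_id": "id",
    "uia_name": "name",
    "uia_class_name": "class",
    "xpath": "xpath",
}

_ARG_ORDER = ("text", "value", "key")  # `text` before `value` before `key`


def to_rpa_windows_locator(strategy: str, value: str) -> Optional[str]:
    prefix = _PREFIX.get(strategy)
    return f"{prefix}:{value}" if prefix else None


def render_step(keyword: str, locator: str, args: Dict[str, str]) -> str:
    """Render one indented test-body line with cells joined by four spaces."""
    cells = [keyword, locator] + [args[k] for k in _ARG_ORDER if k in args]
    return "    " + "    ".join(cells)
```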

12 new tests, and the 12 existing web-emit tests stay green:
  - Library setting line per transport (Browser vs RPA.Windows).
  - Selector strategy → locator-prefix mapping for all four UIA
    strategies.
  - Select From Combobox, Control Window — desktop-specific keyword
    coverage.
  - Missing selector on a targeted keyword emits a visible warning.
  - Full three-step login-to-payroll flow roundtrip with the expected
    RPA.Windows locator prefixes on each step.

Story D.4 is the last pure-Python desktop story. D.1 (UIA session
adapter) and D.2 (primitive capture) need a Windows host + pywinauto
to be exercised beyond a stub — kept as backlog for a dedicated
desktop sprint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ator (stories D.1 + D.2 partial)

Adds the pure-Python pieces of the Windows recorder so Epic D has every
commit-testable-without-a-Windows-host story landed. The pywinauto event-
hook wiring inside `_desktop_loop` is a TODO — it needs a Windows dev
host, so we ship the translator + thread scaffolding first and let that
land as a focused Windows-host PR.

Module `desktop_recorder_task`:
  - `signal_stop_desktop(session_id)` / `is_desktop_session_active(id)`
    — same registry shape as the web task (Story W.1 full) so the
    recording/router DELETE endpoint can hook this in uniformly.
  - `translate_uia_event(payload, index) -> RecordedCommand | None` —
    pure function, translates the payload the pywinauto hook will
    produce into a RecordedCommand via Story D.3's desktop selector
    synthesis. Handles the 6 AR-8 captured event kinds:
        click → Click
        dblclick → Double Click
        type → Type Text
        combobox_select → Select From Combobox
        menu_select → Select From Menu
        window_focus → Control Window
    Unknown / missing kinds return None.
  - `run_desktop_recorder_session(session_id)` — blocking entry point
    dispatched by the task executor. Non-Windows hosts short-circuit to
    FAILED so the session row terminates and the queue is torn down.
    Windows hosts enter `_desktop_loop` which defer-imports pywinauto.
  - Same DB-session / queue / teardown semantics as the web task.
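
The event-kind mapping listed above for `translate_uia_event` amounts to a lookup table. A minimal sketch, where `keyword_for_event` is a hypothetical helper and the payload is assumed to carry its kind under a `kind` key:

```python
from typing import Optional

# UIA event kind -> Robot keyword, as listed in the translator description.
UIA_KIND_TO_KEYWORD = {
    "click": "Click",
    "dblclick": "Double Click",
    "type": "Type Text",
    "combobox_select": "Select From Combobox",
    "menu_select": "Select From Menu",
    "window_focus": "Control Window",
}


def keyword_for_event(payload: dict) -> Optional[str]:
    """Unknown or missing kinds map to None so the caller can skip them."""
    return UIA_KIND_TO_KEYWORD.get(payload.get("kind"))
```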

10 tests lock the translator surface: each event kind + unknown kind +
index propagation + candidates-shape round-trip via D.3.

Sprint status: D.1 + D.2 flip to in-progress (skeleton shipped, Windows-
only wiring outstanding). D.3 + D.4 already done. Epic D stays
in-progress until the pywinauto hooks land.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…story DM.1)

Closes Story DM.1 with the explicit go/no-go decision the spike was
meant to produce: **NO-GO.** DM.2 (macOS session adapter) stays
`backlog`; the code never lands in v2. Four concrete reconsider-
triggers captured so a Phase 5 champion has unambiguous evidence for
reopening:

  1. Design-partner customer explicitly names macOS-desktop recording
     as a gating RFQ item.
  2. Support-ticket tag `recorder-macos` accumulates ≥ 5 distinct
     users within 60 days of GA.
  3. A cross-platform Robot library with first-class macOS support
     emerges (RPA.Mac or equivalent).
  4. RoboScope ships a macOS-native installer with code-signing +
     notarisation.

Primary reason for NO-GO: the replay side is the killer. Recording is
tractable (AXUIElement covers the equivalent surface; Story S.1's
strategy enum already accommodates `automation_id` / `uia_name` /
`uia_class_name` for desktop hosts across Windows + macOS with zero
schema change). But there is no first-class cross-platform Robot
library for macOS replay — `RPA.Windows` is Windows-only, `RPA.Desktop`
is coordinate-based and not sufficient for stable locator-anchored
replay. A user recording on macOS and running the .robot in CI (which
is always Windows or Linux) would be stuck.

Secondary reasons: no pilot-customer demand (Windows-only CI), macOS
Accessibility permission friction, code-signing complexity for an
accessibility-privileged Python process.

Epic DM stays `backlog`; DM.1 flips to `done` because the decision is
now permanent evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
raffelino and others added 28 commits May 5, 2026 17:09
Robot Framework lets you prefix any keyword call with one or more
variable names (`${a}    ${b}=    Some Keyword    args`) to capture
return values. The FlowEditor previously only exposed the
return-variables row when the step's type was already 'assignment',
which only happens for steps parsed from existing `${var}=` lines;
there was no way to promote a fresh palette-dropped keyword.

Show the return-vars row for both 'keyword' and 'assignment' steps.
Adding the first variable flips type 'keyword' -> 'assignment' so
the serializer emits the assignment syntax; removing the last one
flips back so empty assignment steps don't survive a save.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The repo-file scanner dropped every `Library X` line whose target
was bundled with robotframework itself (BuiltIn, Collections, String,
DateTime, OperatingSystem, Process, XML, Dialogs, Telnet) — the idea
was that those needed no introspection because they shipped with RF.
Side effect: the FlowEditor palette saw no dynamic data for them,
fell through to the curated static-fallback list and tagged the
category as "(examples)" even after the user added `Library Collections`.

Drop the filter: libdoc handles bundled libraries fine, the
`pip list` discovery still skips them (they're not separate pip
packages), and explicit imports now get the full keyword surface
instead of a curated excerpt. Empty frozenset kept as a hook for
future suppression decisions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adding a Library / Resource import in the FlowEditor settings panel
emitted libraries-changed which triggered refreshKeywords, but that
fired BEFORE the file was saved — the backend's _scan_repo_files
re-read the still-old version on disk and never picked up the new
import. After the user saved, no further refresh was triggered, so
the palette stayed on the static-fallback "(examples)" view of the
new library.

Trigger refreshKeywords after handleSave for .robot and .resource
files so the post-save scan catches the actual import set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Selecting a step in the flow canvas now responds to Backspace
(macOS) and Delete (Windows / Linux) the same way as clicking the
"x" button on the detail panel. Focus check skips the handler when
the user is typing into an input / textarea / select / contenteditable
element so the detail-panel inputs and library autocomplete keep
their normal text-deletion behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Walks every keyword / assignment step in form.testCases and
form.keywords for the currently-open file, resolves each step's
keyword to its declaring library via getKeywordInfo, and tallies
the count. Passes the resulting Map<library, count> into
KeywordPalette which sorts library categories by count desc — so
the libs the user is actually using show up first instead of
following a fixed dynamic / static-fallback insertion order.

Stable sort means ties keep their natural order (dynamic libs in
their original order, _ALWAYS_VISIBLE_LIBS fallback order behind).
Project: categories keep their separate isCurrentFile pinning, and
the Control category stays at the bottom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Threads selectedNode.id through every editable node template as a
:selected boolean: KeywordNode and ControlNode accept it as a prop
and toggle a flow-node--selected modifier; the inline comment and
flow-control templates inside FlowEditor.vue do the same on their
root divs. The modifier paints a 3px primary-color outline plus a
soft halo and 10% tint background so the active node reads at a
glance against the canvas.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
addNodeFromPalette pushed every new step to the bottom of the
active list, which forced the user to click + drag back to where
they were working. Now when a node is selected, the new step is
spliced in at selectedNodeData.stepIndex + 1; control-flow steps
(IF / FOR / WHILE / TRY) keep their END marker right after.
With no selection, fall back to the original "append at end"
behavior. Selection moves to the freshly-inserted node so the user
can edit args or chain another insert without a second click.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The heal-rate KPI sat above the Success-Rate-Trend chart with a
purple gradient and a 4px purple accent border, making it the most
prominent thing on the Stats page. Demote it: move below the Flaky
Tests table, drop the gradient and the accent border, switch
purple text + bar gradient to neutral / primary tones, and wrap it
in a normal card-header with a heading so it reads as one section
of N rather than as a hero metric. Adds heading i18n strings in
EN/DE/FR/ES.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Startup recovery (in main.py lifespan) reset stranded sync_status='syncing'
rows on every backend start, but a row that gets stuck mid-session
between restarts stayed permanent: due_repos() filters out 'syncing'
and the scheduler skipped the row forever, so the user only got
relief by clicking Sync manually or restarting the backend.

Extend auto_sync_due_repos to scan for rows with sync_status='syncing'
AND updated_at older than _STALE_SYNCING_AFTER_MINUTES (10 min) at
the top of every tick. Stale rows get reset to 'error' with a
"Sync stuck > X min — auto-recovered. Click Sync to retry." message
so the UI surfaces *why* the row left in-flight state. Cooperates
with the optimistic flip in the dispatch loop — recovery runs first,
so a freshly-flipped row is never near the 10-min threshold.
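
The recovery pass can be sketched like this, using an in-memory stand-in for the repo rows. `RepoRow`, `recover_stale_syncing` and the `sync_error` field are illustrative assumptions, and the message wording is paraphrased from the commit text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional

STALE_SYNCING_AFTER_MINUTES = 10


@dataclass
class RepoRow:
    # Hypothetical stand-in for the real ORM model.
    id: int
    sync_status: str
    updated_at: datetime
    sync_error: Optional[str] = None


def recover_stale_syncing(rows: Iterable[RepoRow], now: datetime) -> List[int]:
    """Reset rows stuck in 'syncing' past the threshold; return their ids."""
    cutoff = now - timedelta(minutes=STALE_SYNCING_AFTER_MINUTES)
    recovered = []
    for row in rows:
        if row.sync_status == "syncing" and row.updated_at < cutoff:
            row.sync_status = "error"
            row.sync_error = (
                f"Sync stuck > {STALE_SYNCING_AFTER_MINUTES} min; "
                "auto-recovered. Click Sync to retry."
            )
            recovered.append(row.id)
    return recovered
```

Running this before the dispatch loop means a row that is flipped to `syncing` in the same tick has a fresh `updated_at` and cannot be mistaken for a stale one.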

Two unit tests cover the contract: a row at threshold + 5 min gets
recovered, a row that was just flipped is left alone (race-safe).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a small ⓘ next to the Auto-Sync and Pre-Run-Sync labels, plus
inside the manual Sync button, with native title=...
mouse-over / keyboard-focus tooltips that explain what each term
actually does. Pre-Run-Sync already had the help string in place
but no visible affordance — promotes the existing key to a real
icon and adds the missing autoSyncHelp / syncHelp keys in EN/DE/
FR/ES.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous version used the unicode character ⓘ (U+24D8) styled
with width/height + line-height and a faint muted color. Browsers
that render that glyph at a different baseline made it nearly
invisible against the card background, and the muted color washed
it out. Drop the glyph and replace with a literal "i" inside an
explicit bordered circle (1.5px solid border, italic serif font,
inline-flex centering) so the affordance reads regardless of font
support. Hover / focus flips to the primary color + halo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaced the hover-only native title= tooltip with a real click
interaction: each (i) is a <button> that toggles a positioned
popover sibling carrying the help text. Outside-click and Escape
close the popover; the same icon clicked twice toggles it off.
Manual Sync row got restructured so the (i) lives next to the
button (separate trigger) instead of inside the button label —
the BaseButton click still kicks off the sync. Title attribute
kept as a hover-fallback / a11y label.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps the repo card's name + git_url / local_path in a router-link
to /explorer/{repo.id} so the user can deep-link into the project's
files by clicking the most prominent thing in the card. Hover
flips the name to primary color and underlines the url so the
affordance is obvious. The existing "Explorer" button remains as
the redundant secondary action; the new link just makes the whole
title block clickable like the Dashboard cards already are.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the 80% max-height cap with a top:12px / bottom:12px
anchored layout so the detail panel uses the whole vertical canvas
minus a 12px gutter on each side. Tall forms (Browser keywords
with 15+ args, FOR loops with several IN-values) no longer force
the user into a tiny internal scroll viewport competing with the
canvas wheel scroll.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a thin vertical resizer at the left edge of the FlowEditor
detail panel. Pointer-down + drag widens (left) or narrows (right)
the panel; chosen width sits in component-local state so it
persists across selection changes inside the same Explorer view
and resets when the user navigates away. Hard caps at 240px /
720px keep the panel on-screen and leave room for the canvas.

Visual cue: a 2px accent line on the edge that highlights on hover and
while dragging. Pointer events bound to window so the user can
overshoot the panel without losing the drag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y + detail

Two unrelated UX issues bundled because they were reported together.

Recording verifier (Story S.3 evolution):
- LocatorFactory now returns MatchInfo (total / visible / actionable)
  instead of a plain int; legacy int factories are coerced via
  _coerce_match_info so existing tests continue to pass.
- v2_recorder_task wires Playwright is_visible / is_enabled per
  match (bounded at 50 to protect against runaway selectors).
- verify_candidates ranks by actionable_rank (0=gold, 1=visible-only,
  2=hidden, 3=unverified-multi) then quality_score desc. Gold
  candidates always sort before disambiguated multi-match ones, so
  the recorder defaults to a uniquely visible+enabled target.
- Penalty schedule: actionable=1 -> 0; visible=1, actionable=0 -> -5;
  total>=1, visible=0 -> -25 (kept as fallback for auto-heal but
  always loses to a visible alternative).
- New tests cover the gold/visible/hidden triage and the legacy-int
  back-compat path; existing test_sort_order updated to reflect the
  new "gold first regardless of score" contract.
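
One way to read the ranking and penalty schedule above, as a sketch. The exact rank conditions are inferred from the bullet points, `quality_penalty` and `sort_key` are hypothetical names, and leaving rank 3 without an extra penalty here is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MatchInfo:
    total: int
    visible: int
    actionable: int


def actionable_rank(info: MatchInfo) -> int:
    if info.actionable == 1:
        return 0  # gold: exactly one visible + enabled match
    if info.visible == 1:
        return 1  # visible-only
    if info.total >= 1 and info.visible == 0:
        return 2  # hidden: kept as an auto-heal fallback
    return 3      # unverified multi-match


def quality_penalty(info: MatchInfo) -> int:
    # Rank 3 gets no extra penalty in this sketch (assumption).
    return {0: 0, 1: -5, 2: -25}.get(actionable_rank(info), 0)


def sort_key(info: MatchInfo, quality_score: int) -> Tuple[int, int]:
    """Lower rank first, then higher quality_score."""
    return (actionable_rank(info), -quality_score)
```

Because the rank is the first element of the key, a gold candidate with a low score still sorts ahead of a high-scoring multi-match one.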

ReportDetailView:
- Drop the standalone "Detailbericht" tab; render the keyword tree
  directly under the AI-analysis card on the summary tab so the
  deep view is one scroll away rather than a tab click. HTML Report
  stays its own tab because the iframe wants its own viewport.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…es; faster verify

Three bugs collapsed into one commit because they all surfaced from
the same user-reported recording failure:
`Keyword 'Browser.Scroll To Element' expected 1 argument, got 0`.

1. Scroll-on-document used to emit Scroll To Element with
   args={"target": "page"} and zero selector_candidates. Browser
   library's Scroll To Element REQUIRES a selector, so replay
   crashed with "expected 1 argument, got 0". There's no Browser
   keyword that round-trips a page-level scroll without a target,
   and these events were already documented as noise. Drop them.

2. Targeted-keyword-without-selector path (web + desktop) used to
   emit `<Keyword>    # WARNING: no selector captured` which RF
   parsed as a zero-arg call to that keyword, crashing the same
   way at replay. Emit the entire line as a single RF comment
   (`# RBSCOPE: dropped <Keyword>`) so the gap is still visible
   without breaking the run. Diagnostic still goes through the
   recorder log stream at WARNING.

3. Selector verification (Story S.3) now does a single JS
   evaluate_all per candidate to compute total / visible /
   actionable counts in one round-trip, replacing the per-element
   is_visible() / is_enabled() loop that pushed the e2e budget
   well past 30s. Heuristic mirrors Playwright's is_visible
   (offsetParent, computed visibility/display, non-zero box) plus
   an aria-disabled / .disabled check.
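
The comment-only emit from item 2 can be illustrated with a small sketch. `emit_targeted_step` and the logger name are hypothetical, and the four-space cell separator is assumed.

```python
import logging
from typing import Optional, Sequence

log = logging.getLogger("recorder")  # assumed logger name


def emit_targeted_step(
    keyword: str, selector: Optional[str], args: Sequence[str] = ()
) -> str:
    """Render one test-body line for a keyword that needs a selector."""
    if not selector:
        # The whole line becomes a comment, so Robot Framework never sees
        # a zero-argument call to the keyword.
        log.warning("no selector captured for %s; step dropped", keyword)
        return f"    # RBSCOPE: dropped {keyword}"
    return "    " + "    ".join([keyword, selector, *args])
```

The earlier format put the comment after the keyword cell, which still left a callable keyword on the line; making `#` the first cell is what turns the entire row into a comment.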

Tests updated for the comment-only emit format (web + desktop)
plus a fake-Locator stub for the new evaluate_all contract. The
v2_recorder_e2e flake (recorder thread > 30s) is independent —
reproduces against HEAD~1 with the visibility commit reverted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a small "preview" pill next to admin nav entries whose feature
surface is still in active development. NavItem gets an optional
`preview` flag; when set on Identity Providers or Teams, the
template renders a tinted accent-colour pill with a tooltip
explaining "Preview feature — interface and behavior may change
without notice." in EN/DE/FR/ES.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…anel too

The previous merge commit only touched the standalone ReportDetailView
page; the inline RunDetailPanel that renders inside the Execution
tab still had the three-tab structure. Apply the same change there:
drop the "detailed" tab, render the keyword tree below the AI
analysis section in the summary tab, with a heading row so the
deep view is recognisable. HTML Report stays its own tab.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…port

Adds two ↗ shortcuts inside the inline RunDetailPanel:

- Keyword Tree heading: opens the standalone /reports/<id> page so
  the user can scan the deep view full-width instead of inside the
  panel's height-constrained scroll viewport.
- HTML Report tab: opens the existing blob URL in a new browser
  tab via window.open(url, '_blank') so the iframe can be popped out
  for side-by-side review of the run report.

i18n key reportDetail.openInNewTab added in EN/DE/FR/ES.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "↗ In neuem Tab öffnen" button on the inline run detail panel
used to open the full /reports/<id> page (KPIs + failed tests +
AI analysis + keyword tree). User wanted the pop-out to contain
just the deep view they were looking at, without re-rendering the
sections they already saw in the panel.

Adds a dedicated route + view + layout for this:
- ReportDetailedView.vue — minimal page rendering only ReportXmlView
  for the given report id, plus a small "Detailbericht — Run #N"
  header.
- MinimalLayout.vue — slot-only layout (no sidebar / no header) so
  the popped-out window contains just the chosen content.
- /reports/:id/detailed route wired with meta.layout='minimal'.
- RunDetailPanel.openReportInNewTab now resolves the new
  'report-detailed' name. The HTML Report tab's open-in-new-tab
  still pops the existing blob URL, unchanged.
- New i18n key reportDetail.notFound in EN/DE/FR/ES.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /reports/{id}/html endpoint streamed the report.html with a
`<base href>` injected, and the frontend wrapped that response in a
Blob URL for the iframe src. Robot Framework's in-page click
handlers do `window.location.href = "log.html#xxx"` to jump from
report.html to log.html, and JS-driven navigation resolves against
the IFRAME'S URL, not against the injected `<base>`. The blob URL
has no path, so the navigation went to localhost:5173/log.html and
404'd. User-visible symptom: clicking on a test row in the report
did nothing / broke the iframe.

Change /html to authenticate the caller and 302-redirect to
/reports/{id}/assets/report.html?at=<asset_token>. The iframe now
loads a real URL whose siblings (log.html, screenshots, …) live at
predictable relative paths, so JS-initiated navigation stays inside
the asset endpoint and resolves correctly. Asset token is still
short-lived + report-scoped (Story SECURITY-3 unchanged).

Frontend getReportHtmlBlobUrl just returns the API URL with the
JWT in the query (iframes can't carry Bearer headers); kept the
async signature so existing `await` call sites stay unchanged.

Tests updated for the redirect contract; round-trip test in
test_router asserts the asset URL fetched with the minted token
returns the raw report HTML.
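
A minimal sketch of the redirect contract, assuming an HMAC-signed
token. Only the URL shape and the "short-lived + report-scoped"
properties come from the description above; the token format, the
function names and SECRET are hypothetical and are not the actual
backend code.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"demo-secret"  # hypothetical signing key, for illustration only


def mint_asset_token(report_id: int, ttl_s: int = 300,
                     now: Optional[float] = None) -> str:
    """Return '<report_id>.<expires>.<signature>' (assumed format)."""
    issued = time.time() if now is None else now
    payload = f"{report_id}.{int(issued + ttl_s)}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def verify_asset_token(token: str, report_id: int,
                       now: Optional[float] = None) -> bool:
    """Accept only an unexpired token minted for this exact report."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    rid, expires, sig = parts
    expected = hmac.new(SECRET, f"{rid}.{expires}".encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # tampered or foreign token
    if rid != str(report_id):
        return False  # token is scoped to a different report
    current = time.time() if now is None else now
    return current < int(expires)


def html_redirect_target(report_id: int, token: str) -> str:
    """Location header value for the 302 issued by /reports/{id}/html."""
    return f"/reports/{report_id}/assets/report.html?at={token}"
```

Because the iframe ends up on a real path, sibling files such as
log.html resolve relative to /reports/{id}/assets/ rather than to a
path-less blob URL.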

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the previous commit moved the iframe to load directly from
/reports/{id}/assets/report.html?at=<token>, clicking on a test row
in the report still produced "authentication required". Robot
Framework's in-page JS reconstructs href values via templating
(`item.logURL + '#' + id`) and the resulting `location.href = …`
navigation observably DROPS the source URL's query string in some
browsers, so the asset endpoint received the request without the
`at=…` token and 401'd.

Layer the previous `<base href>` injection back in, this time at
the asset endpoint when serving any HTML file: each report.html /
log.html now gets a freshly-minted asset token in a `<base href>`
tag so HTML-defined AND JS-driven navigation both resolve through
the base, and query inheritance is no longer load-bearing.
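
A rough sketch of the injection step, assuming the base URL has the
/reports/{id}/assets/?at=<token> shape. The helper name
inject_base_href and the regex are hypothetical; the real endpoint
may locate the head element differently.

```python
import html
import re

# Matches the opening <head> tag (with or without attributes) but not <header>.
_HEAD_OPEN = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)


def inject_base_href(document: str, report_id: int, token: str) -> str:
    """Insert a <base href> carrying a fresh asset token into an HTML file."""
    base_url = f"/reports/{report_id}/assets/?at={token}"
    tag = f'<base href="{html.escape(base_url, quote=True)}">'
    if _HEAD_OPEN.search(document):
        # Put the tag right after <head> so it precedes every relative URL.
        return _HEAD_OPEN.sub(lambda m: m.group(0) + tag, document, count=1)
    # No <head> found: prepend the tag as a fallback.
    return tag + document
```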

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two unrelated bugs cleared up after the user kept clicking through
recordings.

reports: clicking a fragment-only link in the HTML report (e.g.
`<a href="#totals">`) resolved through the injected base href to
`/reports/{id}/assets/?at=…#totals`, which the asset endpoint
parsed as an empty file_path and 404'd ("File not found"). Inject
the same fragment-fix script the inline /html endpoint used to ship:
intercepts clicks on `a[href^="#"]` and just sets
window.location.hash so the browser scrolls without re-fetching.
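
The resolution behind the 404 can be reproduced with the standard
library. This sketch only shows how a fragment-only href resolves
against the base; it is not the injected script. The host, port,
report id and the helper resolved_file_path are made up for
illustration.

```python
from urllib.parse import urljoin, urlsplit

ASSET_PREFIX = "/reports/7/assets/"
BASE = "http://localhost:8000/reports/7/assets/?at=tok"  # hypothetical base href


def resolved_file_path(href: str) -> str:
    """Return the file path the asset endpoint would see for a given href."""
    resolved = urljoin(BASE, href)  # standard relative-reference resolution
    return urlsplit(resolved).path[len(ASSET_PREFIX):]


# A sibling file resolves to a real file name.
print(repr(resolved_file_path("log.html")))
# A fragment-only link keeps the base path and query, so the file path is empty.
print(urljoin(BASE, "#totals"))
print(repr(resolved_file_path("#totals")))
```

Setting window.location.hash directly avoids this resolution step
entirely, which is why the click interceptor never triggers a fetch.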

editor: `isCustomSelectorValue` flagged recorder-emitted selectors
as "eigener Wert" ("custom value") when the emitted value carried
decorations the sidecar didn't, specifically the
`iframe[src*="<host>"] >>> ` prefix for iframe-captured events and
the ` >> nth=N` suffix for multi-match disambiguation. Strip both
before comparing so the picker no longer yells about
Sourcepoint-consent recordings (`Click text="Zustimmen"` inside a
Sourcepoint frame).

3 new unit tests pin the new behavior: iframe-wrapped + nth-suffix
match, plus a counter-test that a genuinely typed selector with the
iframe prefix still shows as custom.
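
A Python sketch of the comparison, assuming the decorations have
exactly the two shapes named above. The frontend helper is
TypeScript; the names strip_recorder_decorations and
is_custom_selector_value and both regexes are illustrative only.

```python
import re

# Prefix the recorder adds for events captured inside an iframe.
_IFRAME_PREFIX = re.compile(r'^iframe\[src\*="[^"]*"\]\s*>>>\s*')
# Suffix the recorder adds to disambiguate multiple matches.
_NTH_SUFFIX = re.compile(r"\s*>>\s*nth=\d+$")


def strip_recorder_decorations(value: str) -> str:
    """Remove the iframe prefix and nth suffix, leaving the core selector."""
    core = _IFRAME_PREFIX.sub("", value, count=1)
    return _NTH_SUFFIX.sub("", core, count=1)


def is_custom_selector_value(value: str, sidecar_values: set) -> bool:
    """A value is custom only if its stripped core is unknown to the sidecar."""
    return strip_recorder_decorations(value) not in sidecar_values
```

A hand-typed selector that happens to carry the iframe prefix still
counts as custom, because its stripped core is not among the sidecar
values.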

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Robot CodeMirror tokenizer's ESCAPE_SEQ regex listed the chars
the lexer treats as escapable (\n, \t, \$, \{, \}, …) but \#
was missing. So a cell like `\#login-form` (RF's idiom for a
literal CSS-id selector that would otherwise be parsed as a line
comment) hit the catch-all stream.next() on the backslash, then
the very next iteration matched `^#.*` as an inline comment and
greyed out the rest of the line.

Add `#` to the escape set so `\#` is consumed as one
string-2 token; the comment regex never gets a chance to fire.
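
The failure mode can be reproduced with a simplified Python analogue
of the tokenizer loop. The real tokenizer is a CodeMirror stream
parser in JavaScript; tokenize, the "text" token name and the reduced
escape set are assumptions made for this sketch.

```python
import re

COMMENT = re.compile(r"#.*")


def tokenize(cell: str, escapable: str) -> list:
    """Tokenize one cell; `escapable` lists the chars allowed after a backslash."""
    escape_seq = re.compile(r"\\[" + re.escape(escapable) + r"]")
    tokens = []
    pos = 0
    while pos < len(cell):
        match = escape_seq.match(cell, pos)
        if match:
            tokens.append(("string-2", match.group()))
            pos = match.end()
            continue
        match = COMMENT.match(cell, pos)
        if match:
            tokens.append(("comment", match.group()))
            pos = match.end()
            continue
        # Catch-all: consume a single character, like stream.next().
        tokens.append(("text", cell[pos]))
        pos += 1
    return tokens


before = tokenize(r"\#login-form", "nt${}")   # '#' missing from the escape set
after = tokenize(r"\#login-form", "nt${}#")   # '#' added to the escape set
```

In `before`, the lone backslash falls through to the catch-all and the
rest of the cell becomes a comment token; in `after`, the first token
is the two-character escape and no comment token appears.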

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the previous KPI / recent-runs / repo-grid mix on the
dashboard with a card grid pointing into every navigable section
of the app — Repos, Explorer, Runs, Stats, Recorder, Environments,
Docs, Settings — plus a static "Tip of the day" card.

Each navigation card shows an icon, title, short description and a
hover-animated chevron. Card visibility respects the user's role
(Recorder + Environments need editor, Settings needs admin).
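
A sketch of the role gate, assuming a linear role hierarchy. The
"viewer" role, the ROLE_RANK ordering and the visible_cards helper
are assumptions; only the editor and admin requirements come from the
description above.

```python
# Assumed hierarchy: a higher rank includes everything below it.
ROLE_RANK = {"viewer": 0, "editor": 1, "admin": 2}

# Minimum role per card; insertion order is the display order.
CARD_MIN_ROLE = {
    "Repos": "viewer",
    "Explorer": "viewer",
    "Runs": "viewer",
    "Stats": "viewer",
    "Recorder": "editor",
    "Environments": "editor",
    "Docs": "viewer",
    "Settings": "admin",
}


def visible_cards(role: str) -> list:
    """Return the card names a user with the given role may see."""
    rank = ROLE_RANK[role]
    return [name for name, needed in CARD_MIN_ROLE.items()
            if rank >= ROLE_RANK[needed]]
```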

Tip-of-the-day picker (utils/dailyTip.ts) maps day-of-year mod 30
to one of 30 i18n keys (`tips.tip01`…`tips.tip30`). Tips focus
exclusively on RoboScope-specific behaviour — Flow Editor palette,
Recorder selector picker, Self-Healing keywords, Stats heal-rate +
flake-quarantine, Repos auto-sync / pre-run-sync / auto-recovery,
Run-Detail panel + AI patches, dashboard cards themselves, etc. —
not generic Robot Framework tips.
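
The picker can be sketched in Python as follows. The actual helper is
utils/dailyTip.ts; how the remainder maps onto the 1-based key
suffix is an assumption here, as is the name daily_tip_key.

```python
from datetime import date


def daily_tip_key(today: date, tip_count: int = 30) -> str:
    """Map the day of the year onto one of the tips.tipNN i18n keys."""
    day_of_year = today.timetuple().tm_yday   # 1..366
    index = day_of_year % tip_count           # 0..tip_count-1
    return f"tips.tip{index + 1:02d}"         # assumed: remainder 0 -> tip01
```

The same tip shows all day and the sequence repeats every 30 days, so
no per-user state is needed.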

All four locales (EN/DE/FR/ES) carry the dashboard.cards.* + tips.*
strings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First-time RoboScope users now find a ready-to-use reference suite
in their project list — the public github.com/raffelino/robot-framework-examples
repo (61 headless tests, Apache-2.0). The lifespan handler checks
for an existing Repository with that name; if missing, it inserts
the row with auto_sync=False (no surprise background pulls) and
dispatches the initial clone task so the working tree fills in
within seconds.

A failed dispatch is logged but doesn't crash startup; the user
can still see the row and click Sync manually.
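
A minimal sketch of the idempotent seeding step, using an in-memory
dict in place of the Repository table. The names seed_examples_repo,
EXAMPLES_NAME and the dispatch_clone callback are hypothetical; the
real lifespan handler works against the database and task queue.

```python
import logging
from typing import Callable

logger = logging.getLogger("seed")

EXAMPLES_NAME = "robot-framework-examples"  # assumed repository row name
EXAMPLES_URL = "https://github.com/raffelino/robot-framework-examples"


def seed_examples_repo(repos: dict, dispatch_clone: Callable[[str], None]) -> bool:
    """Insert the examples repo once; return True if a row was created."""
    if EXAMPLES_NAME in repos:
        return False  # already seeded, nothing to do
    # auto_sync=False: no surprise background pulls.
    repos[EXAMPLES_NAME] = {"url": EXAMPLES_URL, "auto_sync": False}
    try:
        dispatch_clone(EXAMPLES_NAME)
    except Exception:
        # Startup must survive; the row stays so the user can sync manually.
        logger.exception("initial clone dispatch failed")
    return True
```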

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test asserted that selecting robotframework-browser-batteries
skipped the nodejs + rfbrowser init pipeline, on the original
assumption that batteries was self-contained. Story Playwright-fix-E
established that batteries replaces the gRPC server binary but
does NOT bundle browser binaries, so both variants now go through
the same nodejs + rfbrowser init path. Update the test to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@raffelino
Collaborator Author

Closing — superseded by the 0.9.0 release branch merge (348714d on main). The recorder-v2, BMAD docs, and Phase-4 work are all on main; this branch is 103 commits behind and conflicts irresolvably. Future work tracked via fresh feature branches per BMAD story.

@raffelino raffelino closed this May 8, 2026