
[TMHUB-31803] docs(uipath-test): align skill + smoke tests with @uipath/test-manager-tool v1.0 CLI surface #730

Open
ganeshborle wants to merge 1 commit into main from docs/uipath-test-skill-v1-cli

Conversation

@ganeshborle
Contributor

@ganeshborle ganeshborle commented May 13, 2026

Tracks: TMHUB-31803

Summary

  • Rewrite skills/uipath-test/SKILL.md so every uip tm command table matches the release/v1.0 branch of UiPath/cli (packages/test-manager-tool v1.0.1). Plural top-level groups (testcases, testsets, executions), new rows for testcases run/add/remove, testsets run, executions get-stats/run/list-filtered, executions testcaselogs list (nested), testcaselog start/finish, teststeplog list, user get. Optional flags surfaced on report get, attachment download, result download, wait. Anti-patterns now flag the v0.9 → v1.0 renames and moved verbs.
  • Align command-pattern regexes in the three task YAMLs under tests/tasks/uipath-test/ to the same plural surface (auth_project_discovery.yaml unchanged; only testset_hierarchy_discovery.yaml and report_generation.yaml touched).

Out of scope

  • requirement group (only on cli/main, v1.1) — explicitly excluded so the skill stays a faithful match for release/v1.0.
  • No change to env_packages in the task sandboxes — they remain @uipath/cli / @uipath/test-manager-tool (unpinned). Pinning to @beta is a separate decision tied to the v1.0 npm publish.

⚠️ Why this is draft

@uipath/test-manager-tool@latest on npm is still 0.9.0. Until v1.0 is promoted to the @latest dist-tag, the smoke CI workflow's npm install -g @uipath/cli@latest will still pull the v0.9.x tool — which means:

  • the agent loads this skill,
  • runs uip tm testsets list ...,
  • the v0.9 tool returns unknown command 'testsets',
  • the test fails.

Do not merge until one of:

  1. @uipath/test-manager-tool v1.0 is on the @latest dist-tag on npm, or
  2. The env_packages lines in the three task YAMLs are pinned to @beta (one-liner per file) for the interim period.
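
The first merge condition above can be sketched as a small check over the JSON printed by `npm view @uipath/test-manager-tool dist-tags --json`. This is only an illustration: the function name and the sample payloads are hypothetical, and it assumes plain `MAJOR.MINOR.PATCH` version strings with an optional pre-release suffix.

```python
import json


def latest_is_v1(dist_tags_json: str) -> bool:
    """Return True when the `latest` dist-tag points at a 1.x or newer release.

    `dist_tags_json` is assumed to be the JSON object printed by
    `npm view <package> dist-tags --json`, e.g. {"latest": "0.9.0"}.
    """
    tags = json.loads(dist_tags_json)
    latest = tags.get("latest", "0.0.0")
    # Drop any pre-release suffix such as "-beta.1" before reading the major.
    major = int(latest.split("-", 1)[0].split(".")[0])
    return major >= 1


# Hypothetical payloads, for illustration only.
print(latest_is_v1('{"latest": "0.9.0", "beta": "1.0.1"}'))  # False
print(latest_is_v1('{"latest": "1.0.1"}'))                   # True
```

A gate like this would only cover condition 1; the `@beta` pin in condition 2 is a change to the task YAMLs and needs no version check.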

Source of truth

CLI surface read from origin/release/v1.0 of UiPath/cli (packages/test-manager-tool@1.0.1), files:

attachment.ts  execution.ts   project.ts   report.ts   requirement.ts (not on v1.0)
result.ts      testcase.ts    testcaselog.ts  testset.ts  teststeplog.ts  user.ts  wait.ts

Test plan

  • Verify SKILL.md description still passes hooks/validate-skill-descriptions.sh (≤ 1024 chars).
  • After v1.0 publish (or with @beta pin), run smoke + integration locally:
    SKILLS_REPO_PATH=/path/to/skills coder-eval run \
      tests/tasks/uipath-test/auth_project_discovery.yaml \
      tests/tasks/uipath-test/testset_hierarchy_discovery.yaml \
      -e tests/experiments/default.yaml --tags smoke
    
    SKILLS_REPO_PATH=/path/to/skills coder-eval run \
      tests/tasks/uipath-test/report_generation.yaml \
      -e tests/experiments/integration.yaml --tags integration
    
  • Smoke-skills CI workflow green on the same SHA that flips this PR out of draft.
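
For the first item in the test plan, the length limit can be approximated locally before running the hook. The sketch below is not the contents of `hooks/validate-skill-descriptions.sh`; it assumes SKILL.md starts with `---`-delimited front matter holding a single-line `description:` field, and the sample text is hypothetical.

```python
MAX_DESCRIPTION_CHARS = 1024  # limit quoted in the test plan


def description_length(skill_md: str) -> int:
    """Length of the single-line `description:` value in the front matter.

    Assumes front matter delimited by `---` lines and a one-line description;
    the real hook may parse the file differently.
    """
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("no front matter found")
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if line.startswith("description:"):
            value = line[len("description:"):].strip()
            # Strip one pair of matching surrounding quotes, if present.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                value = value[1:-1]
            return len(value)
    raise ValueError("no description field found")


# Hypothetical SKILL.md content, for illustration only.
sample = '---\nname: uipath-test\ndescription: "Test Manager CLI skill"\n---\n# Body\n'
print(description_length(sample) <= MAX_DESCRIPTION_CHARS)  # True
```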

🤖 Generated with Claude Code

@ganeshborle ganeshborle force-pushed the docs/uipath-test-skill-v1-cli branch from 64eb031 to 1194849 Compare May 13, 2026 10:44
@ganeshborle ganeshborle self-assigned this May 14, 2026
@ganeshborle ganeshborle added the uipath-case-management UiPath skill area: uipath-case-management label May 14, 2026
@uipreliga
Collaborator

Code Review Summary

This PR renames the uip tm CLI surface across SKILL.md from singular → plural (testcase → testcases, testset → testsets, execution → executions), renames execute → run, adds new groups (executions list-filtered, executions get-stats, executions retry, teststeplog list, testcaselog start/finish, user get), and reshapes list-testcaselogs into nested executions testcaselogs list. Tests get matching regex updates. Three reviewers (Opus + Codex + Gemini) converge on the same picture.

🔴 Critical

  1. references/publish-and-link-guide.md was not updated (lines 11–13, 21, 56, 64, 79, 82, 88, 105, 109). It still uses uip tm testcase list-automations, testcase link-automation, testcase execute, testset execute, testcase list, etc. SKILL.md:202 links this guide as the canonical workflow for "Publish a project and link it to a Test Manager test case". An agent that follows the link will run commands that no longer exist on the new CLI. (Flagged independently by Codex and Gemini.)

  2. references/test-result-report-guide.md:84 was not updated — still uses uip tm testcase list-result-history. SKILL.md:201 links this guide as the canonical workflow for the QA report generation flow that tests/tasks/uipath-test/report_generation.yaml exercises. (Flagged independently by Codex and Gemini.)

🟠 High

  1. SKILL.md:70 hint paragraph is incomplete. Lists --test-case-key consumers as update, delete, link-automation, unlink-automation, list-testsets, but the PR adds testcases add and testcases remove at lines 67–68 which take --test-case-keys (plural, comma-separated PROJECT_KEY:NUMBER values). The singular/plural flag name is precisely the kind of landmine the note exists to prevent. (Flagged independently by Codex and Gemini.)

  2. SKILL.md:142 Critical Rule #1 has a dangling fragment. It reads "...before any Test Manager operation. Use `uip login`." The trailing sentence has no antecedent. Suggest: "...If not authenticated, run `uip login` to sign in."
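
To make the plural flag shape from the first High finding concrete, here is a minimal sketch of splitting and validating a `--test-case-keys` value. The helper name, the sample keys, and the exact character set allowed in a project key are assumptions; only the "comma-separated PROJECT_KEY:NUMBER" shape comes from the review.

```python
import re

# Assumed shape of one key: PROJECT_KEY:NUMBER, e.g. "DEMO:12".
_KEY_RE = re.compile(r"^[A-Za-z0-9_]+:\d+$")


def split_test_case_keys(value: str) -> list[str]:
    """Split a comma-separated --test-case-keys value and validate each key."""
    keys = [part.strip() for part in value.split(",") if part.strip()]
    if not keys:
        raise ValueError("expected at least one PROJECT_KEY:NUMBER value")
    for key in keys:
        if not _KEY_RE.match(key):
            raise ValueError(f"not a PROJECT_KEY:NUMBER value: {key!r}")
    return keys


# Hypothetical keys, for illustration only.
print(split_test_case_keys("DEMO:12,DEMO:13"))  # ['DEMO:12', 'DEMO:13']
```

Passing a bare UUID here raises `ValueError`, which mirrors the confusion the hint paragraph is meant to prevent between `--test-case-id` and the key-based flags.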

🟡 Medium

  1. Heading hierarchy (SKILL.md:22–35). ## Concepts → ### What is Testmanager? → then ### Project Commands, ### Test Cases Commands, … all live as siblings of "What is Testmanager?", so the full command catalogue formally sits under ## Concepts. Either drop the ### What is Testmanager? subhead so the prose becomes the Concepts intro and introduce a sibling ## Commands header, or restructure.

  2. Spelling: "Testmanager" vs "Test Manager" (SKILL.md:23, 25). Heading and lead sentence write the product as one word; every other place in the file (and the brand) uses two words.

  3. executions list vs executions list-filtered (SKILL.md:89–90) overlap heavily. Agent has no decision rule on when to pick which.

  4. Trailing newline missing (SKILL.md:212) — \ No newline at end of file in diff. Cosmetic, but most lints flag it.

🟢 Low / informational

  1. Three different run verbs (SKILL.md:66 testcases run, :81 testsets run, :92 executions run) all have distinct semantics. Inherited from the CLI, but a one-line disambiguation in the doc would prevent confusion.

  2. testcases run flag style ambiguity (SKILL.md:66). Note says space-separated UUIDs; sibling testcases add/remove uses comma-separated. Worth confirming against the CLI to make sure the example matches.

  3. Coverage gap. PR specifically flags executions testcaselogs list (nested subcommand) as a landmine but adds no smoke test for it; both YAML changes still cover singular→plural renames only.

  4. Token-optimization rule drift. Repo .claude/rules/token-optimization.md says strip articles; many table Purpose cells retain "a new test case", "the failed test cases", "a summary report". Low priority on a doc-alignment PR.

Test changes

report_generation.yaml:52 and testset_hierarchy_discovery.yaml:44,52 regex updates correctly mirror the SKILL.md renames. The two-alternation pattern accepts both --flag-a ... --flag-b and --flag-b ... --flag-a orderings; extra flags interleaved still match via .*. ✅
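
The order-insensitive matching described above can be illustrated with a small sketch. The pattern below is hypothetical, written in the spirit of the task YAMLs rather than copied from them, and `--some-flag` is a placeholder, not a real CLI option.

```python
import re

# Hypothetical two-alternation pattern: the command must carry both flags,
# in either order, with anything (including extra flags) in between.
PATTERN = re.compile(
    r"uip\s+tm\s+testsets\s+list\s"
    r"(?:.*--project-key\s+\S+.*--output\s+json"
    r"|.*--output\s+json.*--project-key\s+\S+)"
)

commands = [
    "uip tm testsets list --project-key DEMO --output json",              # order A
    "uip tm testsets list --output json --some-flag --project-key DEMO",  # order B, extra flag
    "uip tm testset list --project-key DEMO --output json",               # old singular group
    "uip tm testsets list --project-key DEMO",                            # missing --output json
]

for command in commands:
    print(bool(PATTERN.search(command)), command)
```

The first two commands match and the last two do not. Requiring whitespace after `list` also keeps the pattern from matching `testsets list-testcases`, which a bare word boundary would let through.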

Overall assessment

The SKILL.md rewrite itself is solid — table structure is clean, the Anti-patterns section explicitly calls out the nested-subcommand mistake, and the test YAMLs are aligned. The PR is materially incomplete because the two reference guides under references/ were not swept; SKILL.md actively points agents into those guides, so the workflows they teach are now broken. Treat the two critical findings as merge blockers; the high-severity ones are quick textual fixes in SKILL.md itself.


🤖 Generated with Claude Code (Opus 4.7) + multi-model review (Codex, Gemini)

@uipreliga uipreliga self-requested a review May 14, 2026 15:17
Collaborator

@uipreliga uipreliga left a comment


Provided comments; ask the codeowners for the skill to approve.

ganeshborle added a commit that referenced this pull request May 14, 2026
Critical fixes:
- references/publish-and-link-guide.md: sweep singular testcase/testset/
  execute → plural testcases/testsets/run so the linked workflow actually
  matches the v1.0 CLI surface.
- references/test-result-report-guide.md:84: fix `testcase` →
  `testcases` in the commented decision table.

High:
- SKILL.md hint paragraph rewritten to call out all THREE
  test-case identifier flag shapes (--test-case-id UUID,
  --test-case-key singular, --test-case-keys plural-comma for
  testcases add/remove) — this is the landmine the note exists for.
- SKILL.md Critical Rule #1 dangling `Use uip login.` fragment
  replaced with a complete sentence.

Medium:
- Heading hierarchy: drop the 'What is Testmanager?' h3 so the
  command tables sit under a sibling '## Commands' rather than
  under '## Concepts'.
- 'Testmanager' → 'Test Manager' to match the brand and the rest
  of the file.
- Add a decision rule for executions list vs executions list-filtered.
- Add trailing newline at EOF.

Low:
- One-block disambiguation note for the three different `run`
  verbs (testcases run / testsets run / executions run) so the
  agent picks the right one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ganeshborle ganeshborle force-pushed the docs/uipath-test-skill-v1-cli branch 2 times, most recently from aa90da0 to b5040b4 Compare May 15, 2026 19:34
@ganeshborle ganeshborle marked this pull request as ready for review May 15, 2026 19:37
@github-actions
Contributor

github-actions Bot commented May 15, 2026

Claude finished @ganeshborle's task in 2m 23s


Coder-eval task lint (advisory)

2 task YAMLs changed; 0 Critical, 0 High, 1 Medium, 0 Low, 1 OK.

Rubric: .claude/commands/lint-task.md. This check is advisory and never blocks merge.

Evidence of passing run

High — PR body does not claim the changed tasks have been run and passed. The test plan checkboxes are all unchecked, and the PR notes that @uipath/test-manager-tool@latest on npm is still v0.9.0 (which would fail against the updated plural command surface). Please edit the PR description to add a line like: Ran skill-test-testset-hierarchy-discovery and skill-test-report-generation-qa locally and they passed. (after v1.0 is on @latest or with @beta pin).

Per-task lint

tests/tasks/uipath-test/report_generation.yaml — verdict: OK

Changes are minimal: execution → executions in the command_pattern regex (line 53) and description (line 51). The task's design is solid: command_executed criteria verify the agent invoked uip login status and uip tm executions list with correct flags, while run_command criteria validate both file existence and substantive report content (regression section, pass/fail/none breakdown, test set reference via Python script). No issues.

tests/tasks/uipath-test/testset_hierarchy_discovery.yaml — verdict: Medium

Changes: testset → testsets in three command_pattern regexes (lines 45, 53) and descriptions (lines 43, 51).

Issues:

  • [Medium] Meaningful coverage: all three success_criteria (lines 34–56) are command_executed with min_count: 1 and no output validation. The test proves the agent ran the right discovery-chain commands (project list, testsets list, testsets list-testcases) with required flags, but does not verify that any command succeeded or returned data. An agent that runs uip tm testsets list --project-key INVALID --output json (which errors) would still pass.

Suggested fixes:

  • Consider adding a run_command criterion that re-runs one of the discovery commands (e.g., uip tm project list --output json) and checks expected_exit_code: 0, or pipes the output through a json_check asserting the response array is non-empty. Even one output-validation criterion would raise this to OK. Example:
    - type: run_command
      description: "project list returns valid JSON array"
      command: "uip tm project list --output json | python3 -c \"import json,sys; d=json.load(sys.stdin); assert isinstance(d,list) and len(d)>0\""
      timeout: 30
      expected_exit_code: 0
      weight: 1.0
      pass_threshold: 1.0
    (This is advisory — a smoke test with command_executed only is common and acceptable for this tier.)
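
The one-liner in the suggested criterion is dense, so here is the same check unpacked as a sketch. The function name and sample payloads are hypothetical; the real shape of `uip tm project list --output json` output is assumed to be a JSON array, as the suggestion implies.

```python
import json


def is_non_empty_json_array(stdout: str) -> bool:
    """True when `stdout` parses as JSON and is a list with at least one item."""
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError:
        # Plain-text error output (for example a tenant error) is not valid JSON.
        return False
    return isinstance(data, list) and len(data) > 0


# Hypothetical outputs, for illustration only.
print(is_non_empty_json_array('[{"projectKey": "DEMO"}]'))                       # True
print(is_non_empty_json_array("[]"))                                             # False
print(is_non_empty_json_array("Test Manager is not enabled for this tenant."))   # False
```

Unlike a command_executed criterion, a check of this kind fails when the command runs but errors or returns nothing, which is the gap the Medium finding describes.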

Within-PR duplicates

No duplicate clusters detected. The two changed tasks exercise materially distinct operations (hierarchy discovery vs. report generation) at different tiers (smoke vs. integration).

Conclusion

⚠ 1 task has issues, max severity Medium. Evidence of passing run not yet claimed (expected — v1.0 CLI not yet published). Advisory only — not blocking merge.


@github-actions
Contributor

github-actions Bot commented May 15, 2026

PR Review: uipath-test skill v1.0 CLI alignment

  • Read review criteria and project conventions
  • Read full diff and existing PR comments
  • Read each changed file in full for context
  • Run skill-specific checks (links, CODEOWNERS, cross-refs, secrets)
  • Post review findings


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5040b4f8c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tests/tasks/uipath-test/report_generation.yaml
@ganeshborle ganeshborle force-pushed the docs/uipath-test-skill-v1-cli branch from b5040b4 to 7beba20 Compare May 15, 2026 19:53

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7beba20d42


Comment thread tests/tasks/uipath-test/testset_hierarchy_discovery.yaml
@ganeshborle
Contributor Author

@bai-uipath Can you please have a look at the smoke run for this? I'm not sure why it is failing with "Test Manager is not enabled for this tenant." It passes locally, and Test Manager is enabled in the codereval org. Previously, it was picking up some other org, something related to Autopilot.

Comment thread skills/uipath-test/SKILL.md
Comment thread skills/uipath-test/SKILL.md
|---|---|
| `uip tm testcases create --project-key <PROJECT_KEY> --name <TEST_CASE_NAME>` | Create a new test case in a Test Manager project. |
| `uip tm testcases list --project-key <PROJECT_KEY>` | List all test cases in a Test Manager project. |
| `uip tm testcases list --project-key <PROJECT_KEY>` | List all test cases in a Test Manager project. Optional `--filter <text>` to search by name/key. |
Contributor


Regarding the filter: let's add a generic instruction about --filter or --search to the 'Critical Rules' section.

Contributor Author


Is there anything specific we would mention in the Critical Rules section? As far as I understand, this is very command-specific; if we add it to the Critical Rules section, a coding agent might try these flags on other commands as well. I think we should keep it attached to the individual commands.

Comment thread skills/uipath-test/SKILL.md Outdated
Comment thread skills/uipath-test/SKILL.md Outdated
Comment thread skills/uipath-test/SKILL.md Outdated
…nt `uip tm` surface

Refresh `skills/uipath-test/SKILL.md` and its two reference guides so every
`uip tm` command, flag, and identifier shape matches the current Test
Manager CLI (`@uipath/test-manager-tool`):

- Top-level command groups: `testcases` / `testsets` / `executions`
  (plural). The previous singular forms (`testcase`, `testset`,
  `execution`) no longer exist on the CLI.
- Run verb: `run` everywhere (`testcases run`, `testsets run`,
  `executions run`), no longer `execute`. The three are distinct —
  added a disambiguation block calling out that `executions run` is the
  *re-run* variant, while `testcases run` / `testsets run` start new
  executions.
- Test case logs: surfaced under `uip tm executions testcaselogs list`
  (nested) and `uip tm testcaselog start / finish / list-assertions`
  (top-level, singular). Anti-patterns section names the nested-subcommand
  landmine explicitly.
- Test step logs: documented under `uip tm teststeplog list`.
- New verbs: `executions get-stats`, `executions list-filtered`
  (with a decision rule vs the simpler `executions list`),
  `executions retry`, `testcaselog start` / `finish`, `user get`.
- Three test-case identifier flag shapes called out as a single block:
  `--test-case-id` (UUID, for `run` / `list-steps` /
  `list-result-history`), `--test-case-key` (singular, for `update` /
  `delete` / `link-automation` / `unlink-automation` /
  `list-testsets`), and `--test-case-keys` (plural, comma-separated,
  for `testcases add` / `testcases remove`).

Reference guides swept for the same surface:
- `publish-and-link-guide.md`: pipeline diagram, Steps 4–6, and Common
  Pitfalls all updated to plural commands + `run`.
- `test-result-report-guide.md`: decision table updated to plural
  `testcases list-result-history`; Prerequisites prose tightened to
  reference `Test Manager project key` and `Test Manager test set key`
  (the canonical key terminology, not "id").

Smoke tests realigned to the new surface:
- `report_generation.yaml` and `testset_hierarchy_discovery.yaml`
  success-criteria regexes match the plural commands.

Other cleanup:
- Heading hierarchy: dropped `### What is Testmanager?` so command
  tables live under a sibling `## Commands` header (was nesting under
  `## Concepts`).
- Brand spelled `Test Manager` (two words) consistently.
- Critical Rule #1 reworded to a complete sentence instead of trailing
  `Use uip login.`.
- Trailing newline at EOF.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ganeshborle ganeshborle force-pushed the docs/uipath-test-skill-v1-cli branch from 7beba20 to 55839f0 Compare May 18, 2026 07:29

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 55839f0514


Comment thread tests/tasks/uipath-test/report_generation.yaml
Comment thread tests/tasks/uipath-test/testset_hierarchy_discovery.yaml

Labels

uipath-case-management UiPath skill area: uipath-case-management


3 participants