
Commit aef3cfb

feat: add autonomous E2E CI failure fix workflow with skills and Playwright agents
Add a 7-phase skill-based workflow for autonomously investigating and fixing failing E2E CI tests. Includes /fix-e2e command, Playwright Test Agent definitions (healer/generator/planner), and supporting rules. Skills: parse-ci-failure, setup-fix-branch, deploy-rhdh, reproduce-failure, diagnose-and-fix, verify-fix, submit-and-review. Managed via rulesync (skills feature added) and synced to OpenCode, Claude Code, and Cursor.
1 parent 27b2c47 commit aef3cfb

46 files changed

Lines changed: 7151 additions & 1 deletion


.claude/commands/fix-e2e.md

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
---
description: >-
  Autonomously investigate and fix a failing RHDH E2E CI test. Accepts a Prow
  job URL or Jira ticket ID. Deploys RHDH, reproduces the failure, fixes the
  test using Playwright agents, and submits a PR with Qodo review.
---
# Fix E2E CI Failure

Autonomous workflow to investigate, reproduce, fix, and submit a PR for a failing RHDH E2E test.

## Input

`$ARGUMENTS` — A Prow job URL, Jira ticket ID, or Jira URL:
- **Prow URL**: `https://prow.ci.openshift.org/view/gs/...`
- **Jira ticket ID**: `RHIDP-XXXX`
- **Jira URL**: `https://redhat.atlassian.net/browse/RHIDP-XXXX`
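
The three input shapes listed above could be told apart with a simple pattern match. The following is a minimal sketch, not part of the commit: the function name `classify_input` and its output labels are hypothetical, and it only checks the prefix of the argument, not whether the job or ticket actually exists.

```bash
# Hypothetical helper: classify the /fix-e2e argument into one of the
# three documented input kinds. Name and labels are illustrative only.
classify_input() {
  local input="$1"
  case "$input" in
    https://prow.ci.openshift.org/view/gs/*)
      echo "prow-url" ;;
    https://redhat.atlassian.net/browse/RHIDP-*)
      echo "jira-url" ;;
    RHIDP-*)
      echo "jira-id" ;;
    *)
      # Anything else is reported as unknown with a non-zero status,
      # so the caller can ask the user for clarification.
      echo "unknown"
      return 1 ;;
  esac
}
```

A caller might branch on the result, for example `kind=$(classify_input "$ARGUMENTS") || echo "unrecognized input"`.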

## Workflow

Execute the following phases in order. Load each skill as needed for detailed instructions. If a phase fails, report the error and stop — do not proceed blindly.

### Phase 1: Parse CI Failure

**Skill**: `parse-ci-failure`

Parse the input to extract:
- Failing test name and spec file path
- Playwright project name
- Release branch (main, release-1.9, etc.)
- Platform (OCP, AKS, EKS, GKE)
- Deployment method (Helm, Operator)
- Error type and message
- local-run.sh job name parameter

**Decision gate**: If the input cannot be parsed (invalid URL, inaccessible Jira ticket), report the error and ask the user for clarification.

### Phase 2: Setup Fix Branch

**Skill**: `setup-fix-branch`

Create a feature branch based on the correct upstream release branch:

```bash
git fetch upstream <release-branch>
git checkout -b fix/e2e-<test-description> upstream/<release-branch>
```

If a Jira ticket was provided, include the ticket ID in the branch name:
`fix/RHIDP-XXXX-e2e-<test-description>`
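
The two branch-name forms above can be produced by one small function. This is a sketch under assumptions: `fix_branch_name` is a hypothetical name, and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is one plausible way to turn a test description into `<test-description>`; the command file itself does not specify it.

```bash
# Hypothetical helper: build the fix branch name from a free-text test
# description and an optional Jira ticket ID.
fix_branch_name() {
  local description="$1"
  local ticket="${2:-}"
  local slug
  # Assumed slug rule: lowercase, collapse each run of non-alphanumerics
  # into a single hyphen, then trim leading and trailing hyphens.
  slug=$(printf '%s' "$description" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  if [ -n "$ticket" ]; then
    echo "fix/${ticket}-e2e-${slug}"
  else
    echo "fix/e2e-${slug}"
  fi
}
```

The result would then be passed to `git checkout -b` in place of the placeholder branch name.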

### Phase 3: Deploy RHDH

**Skill**: `deploy-rhdh`

Deploy RHDH to a cluster using `e2e-tests/local-run.sh`:

```bash
cd e2e-tests
./local-run.sh -j <job-name> -t <image-tag> -s
```

Use deploy-only mode (`-s`) to skip automated test execution — we'll run the specific failing test manually.

Select the image tag based on the release branch:
- `main` → `next`
- `release-1.9` → `1.9`
- `release-1.8` → `1.8`
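
The branch-to-tag mapping above is regular enough to express as a lookup. In this sketch, `image_tag_for_branch` is a hypothetical name, and stripping the `release-` prefix for every release branch is an assumption that extends the two listed examples; only `main`, `release-1.9`, and `release-1.8` are confirmed by the text.

```bash
# Hypothetical helper: map a release branch to the image tag passed to
# local-run.sh via -t.
image_tag_for_branch() {
  local branch="$1"
  case "$branch" in
    main)
      echo "next" ;;
    release-*)
      # Assumed generalization: release-1.9 becomes 1.9, and so on.
      echo "${branch#release-}" ;;
    *)
      echo "unsupported branch: $branch" >&2
      return 1 ;;
  esac
}
```

It could feed the deploy command directly, for example `./local-run.sh -j <job-name> -t "$(image_tag_for_branch <release-branch>)" -s`.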

After deployment completes, set up the local test environment:
```bash
source e2e-tests/local-test-setup.sh <showcase|rbac>
```

**Decision gate**: If deployment fails, the `deploy-rhdh` skill has error recovery procedures. If deployment cannot be recovered after investigation, report the deployment issue and stop.

### Phase 4: Reproduce Failure

**Skill**: `reproduce-failure`

Run the specific failing test to confirm it reproduces locally:

```bash
cd e2e-tests
yarn playwright test <spec-file> --project=<project> --retries=0 --workers=1
```

**Decision gates**:
- **Consistent failure**: Proceed to Phase 5
- **Flaky** (fails sometimes): Proceed to Phase 5, focus on reliability
- **Cannot reproduce** (passes every time after 10 runs): Report that the failure cannot be reproduced locally, list possible environment differences, and ask the user how to proceed
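
The three decision gates above amount to counting failures over repeated runs. A minimal sketch, assuming a hypothetical helper named `classify_reproduction` that runs any command a fixed number of times; it discards the command's output, so a real run would likely want to keep the Playwright report as well.

```bash
# Hypothetical helper: run a command N times and classify the outcome.
# Usage: classify_reproduction <runs> <command> [args...]
classify_reproduction() {
  local runs="$1"
  shift
  local failures=0
  local i=0
  # Reject a zero or negative run count, which has no meaningful result.
  [ "$runs" -gt 0 ] || return 2
  while [ "$i" -lt "$runs" ]; do
    # Count failures without aborting the loop.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  if [ "$failures" -eq "$runs" ]; then
    echo "consistent"
  elif [ "$failures" -gt 0 ]; then
    echo "flaky"
  else
    echo "cannot-reproduce"
  fi
}
```

For the 10-run threshold named above, the call would look like `classify_reproduction 10 yarn playwright test <spec-file> --project=<project> --retries=0 --workers=1`, run from `e2e-tests`.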

### Phase 5: Diagnose and Fix

**Skill**: `diagnose-and-fix`

Analyze the failure and implement a fix:

1. **Classify the failure**: locator drift, timing, assertion mismatch, data dependency, platform-specific, deployment config
2. **Use Playwright Test Agents**: Invoke the healer agent (`@playwright-test-healer`) for automated test repair — it can debug the test, inspect the UI, generate locators, and edit the code
3. **Follow project conventions**: Use semantic selectors, Page Object Model, component annotations, proper utility classes
4. **Cross-repo investigation**: If the issue is in deployment config, use Context7 or Sourcebot to search `rhdh-operator` and `rhdh-chart` repos

**Decision gate**: If the analysis reveals a product bug (not a test issue):
1. Mark the test as `test.fixme()` with a descriptive comment
2. Report the product bug (update Jira ticket if applicable)
3. Proceed to Phase 6 with the `test.fixme()` change

### Phase 6: Verify Fix

**Skill**: `verify-fix`

Verify the fix:
1. Run the fixed test once — must pass
2. Run 5 times — must pass 5/5
3. Run code quality checks: `yarn tsc:check`, `yarn lint:check`, `yarn prettier:check`
4. Fix any lint/formatting issues

**Decision gate**: If the test still fails or is flaky, return to Phase 5 and iterate.
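
Unlike reproduction, verification can stop at the first failure, since any failed run sends the workflow back to Phase 5. The sketch below assumes two hypothetical helpers: `require_passes` for the 5/5 requirement and `run_quality_checks` for the three `yarn` scripts named in step 3. Neither name comes from the commit.

```bash
# Hypothetical helper: require N consecutive passes of a command,
# stopping at the first failure.
# Usage: require_passes <count> <command> [args...]
require_passes() {
  local required="$1"
  shift
  local i=1
  while [ "$i" -le "$required" ]; do
    if ! "$@"; then
      echo "run $i/$required failed" >&2
      return 1
    fi
    i=$((i + 1))
  done
  echo "$required/$required passed"
}

# Hypothetical wrapper for step 3: run every quality check so that all
# problems are reported, and fail if any one of them failed.
run_quality_checks() {
  local status=0
  local check
  for check in tsc:check lint:check prettier:check; do
    yarn "$check" || status=1
  done
  return "$status"
}
```

A full gate might chain them as `require_passes 5 yarn playwright test <spec-file> --project=<project> --retries=0 --workers=1 && run_quality_checks`.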

### Phase 7: Submit PR and Handle Review

**Skill**: `submit-and-review`

1. **Commit**: Stage changes, commit with conventional format
2. **Push**: `git push -u origin <branch>`
3. **Create PR**: Against `redhat-developer/rhdh`. Determine the GitHub username from the fork remote: `git remote get-url origin | sed 's|.*github.com[:/]||;s|/.*||'`. Then use `gh pr create --repo redhat-developer/rhdh --head <username>:<branch> --base <release-branch>`
4. **Trigger Qodo review**: Comment `/agentic_review` on the PR
5. **Wait for review**: Poll for Qodo bot comments (check every 60s, up to 10 minutes)
6. **Address feedback**: Apply valid suggestions, explain rejections
7. **Monitor CI**: Watch CI checks with `gh pr checks`
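
Two of the steps above lend themselves to small helpers: extracting the fork owner in step 3 and the bounded polling in step 5. This is a sketch under assumptions. `fork_owner_from_url` reuses the `sed` expression from step 3 verbatim, while `poll_until` is a hypothetical generic retry loop; the command that detects a Qodo comment is left to the caller because the commit does not specify it.

```bash
# Derive the fork owner from a remote URL, using the same sed expression
# as step 3. Handles both SSH and HTTPS GitHub remotes.
fork_owner_from_url() {
  printf '%s\n' "$1" | sed 's|.*github.com[:/]||;s|/.*||'
}

# Hypothetical helper: retry a check command until it succeeds or the
# attempts run out.
# Usage: poll_until <max-attempts> <sleep-seconds> <command> [args...]
poll_until() {
  local attempts="$1"
  local interval="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
  return 1
}
```

The owner would come from `fork_owner_from_url "$(git remote get-url origin)"`. Checking every 60 seconds for roughly 10 minutes corresponds to `poll_until 10 60 <check-command>`, where `<check-command>` is whatever the caller uses to detect the review comment.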

### Final Report

After all phases complete, produce a summary:

```
E2E Fix Summary:
- Input: <Prow URL or Jira ticket>
- Test: <spec file> (<playwright project>)
- Branch: <fix branch> → <release branch>
- Root cause: <classification and description>
- Fix: <what was changed>
- Verification: <X/X passes>
- PR: <PR URL>
- CI Status: <PASS/PENDING/FAIL>
- Qodo Review: <status>
```
