docs: add CI Medic Guide and fix stale CI documentation#4511

Open
zdrapela wants to merge 12 commits into redhat-developer:main from zdrapela:e2e-guide


@zdrapela
Member

@zdrapela zdrapela commented Mar 31, 2026

Summary

  • Add CI Medic Guide for investigating e2e test failures in nightly jobs and PR checks
  • Fix stale references across CI docs (IBM Cloud → OCP, wrong Slack channel, org membership)
  • Update nightly testing SVG diagram: replace IBM Cloud references with OCP, fix Slack channel name
  • Update CI.md: fix /ok-to-test org scope, add OSD-GCP description, simplify platform descriptions
  • Update .ci/pipelines/README.md: replace hardcoded cluster pools with link to OpenShift CI docs
  • Update rulesync CI rule: fix function refs, add missing job handlers
  • Gitignore *.local.md files

Test plan

  • Review each doc for accuracy against current CI scripts
  • Verify all links resolve correctly
  • PR review comments resolved

🤖 Generated with Claude Code

@openshift-ci

openshift-ci bot commented Mar 31, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@github-actions
Contributor

The container image build workflow finished with status: cancelled.

@github-actions
Contributor

The container image build workflow finished with status: cancelled.

@github-actions
Contributor

Image was built and published successfully. It is available at:

zdrapela and others added 6 commits April 1, 2026 11:54
Comprehensive guide covering Prow job anatomy, artifact navigation,
job lifecycle phases, failure triage workflows, and the AI Test Triager.
Includes an internal companion (local.md) with Vault, ReportPortal,
DevLake, and artifact unredaction details.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix Slack channel name: #rhdh-e2e-test-alerts → #rhdh-e2e-alerts
- Replace janus-idp org with openshift org for /ok-to-test
- Remove IBM Cloud nightly tests (no longer used)
- Add missing platforms: EKS, OSD-GCP, OCP Operator
- Fix function refs: run_tests() in utils.sh → testing::run_tests() in lib/testing.sh
- Fix typos (environmentr, test calimed)
- Add pointer to CI Medic Guide for detailed triage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace hardcoded cluster pool list with link to OpenShift CI docs
- Remove stale IBM Cloud migration history
- Fix example Prow URL (janus-idp_backstage-showcase → redhat-developer-rhdh)
- Replace inline module list with link to lib/README.md
- Add release branch Slack channels to enhanced-ci-reporting.md
- Add link to CI Medic Guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix function refs: run_tests() → testing::run_tests() in lib/testing.sh
- Fix Slack channel name: #rhdh-e2e-test-alerts → #rhdh-e2e-alerts
- Replace hardcoded cluster pools with link to OpenShift CI docs
- Add all missing job handlers (aks/eks/gke-operator, osd-gcp, upgrade)
- Add showcase-runtime-db to Playwright project list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add "How to Use This Guide" chapter with day-1 setup and rotation lookup table
- Replace static cheat sheet with pointers to AI Test Triager and resolved Jira ci-fail issues
- Replace duplicated script usage sections with pointers to source docs
- Simplify cloud platform sections with practical failure patterns
- Consolidate documentation/key files tables into concise related docs section

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Simplify OSD-GCP nightly test description
- Fix /ok-to-test org scope to include both openshift and redhat-developer

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Contributor

github-actions bot commented Apr 1, 2026

The container image build workflow finished with status: cancelled.

- /ok-to-test requires membership in both openshift and redhat-developer orgs
- Add OSD-GCP description in CI Medic Guide job types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@zdrapela zdrapela marked this pull request as ready for review April 1, 2026 10:01
@openshift-ci openshift-ci bot requested review from albarbaro and hopehadfield April 1, 2026 10:01
@github-actions
Contributor

github-actions bot commented Apr 1, 2026

The container image build workflow finished with status: cancelled.

@rhdh-qodo-merge

rhdh-qodo-merge bot commented Apr 1, 2026

PR Reviewer Guide 🔍

(Review updated until commit 2a98bb2)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🔒 Security concerns

Sensitive information exposure:
The PR adds references to internal tooling and resources (Google Docs, Jira dashboards) and operational guidance about accessing ephemeral clusters via ocp-cluster-claim-login.sh. While no secrets are included directly, ensure the new documentation does not unintentionally reveal internal-only URLs, processes, or access assumptions to a public audience. Consider marking internal links as “Red Hat internal” and verifying that any referenced documents require appropriate authentication.

⚡ Recommended focus areas for review

Security/Access

The guide adds multiple links to internal resources (Google Docs, Jira dashboards) and describes org membership requirements and cluster access. Please validate that none of the linked pages or described access steps unintentionally expose sensitive/internal-only information to public readers, and consider clearly labeling internal-only links/sections (and any prerequisites) to avoid confusion for external contributors.

## Overview

### What is a CI Medic?

The CI medic is a **weekly rotating role** responsible for maintaining the health of PR checks and nightly E2E test jobs. When your rotation starts, you'll receive a Slack message with your responsibilities as a reminder. The complete role description is described in [this Google Doc](https://docs.google.com/document/d/1CjqSQYA6g35-95OpHXobcJdWFRGS5yu-MV8-mfuDmQA/edit?usp=sharing)

### Core Responsibilities

1. **Monitor PR Checks**: Keep an eye on the status and the queue to ensure they remain passing.
2. **Monitor Nightly Jobs**: Watch the `#rhdh-e2e-alerts` Slack channel and dedicated release channels.
3. **Triage Failures**:
   - Use the **AI Test Triager** (`@Nightly Test Alerts` Slack app) as your starting point -- it automatically analyzes failed nightly jobs and provides root cause analysis, screenshot interpretation, and links to similar Jira issues. You can also invoke it manually by tagging `@Nightly Test Alerts` in Slack.
   - Check [Jira](https://redhat.atlassian.net/jira/dashboards/21388#v=1&d=21388&rf=acef7fac-ada0-4363-b3fb-9aad7ae021f0&static=f0579c09-f63e-45aa-87b9-05e042eee707&g=60993:view@0a7ec296-c2fd-4ddc-b7cb-64de0540e8ba) for existing issues with the **`ci-fail`** label.
   - If it's a **new issue**, create a bug and assign it to the responsible team or person. The AI triager can also create Jira bugs directly.
   - If the failure **blocks PRs**, mark the test as skipped (`test.fixme`) until it is fixed.
4. **Monitor Infrastructure**: Watch `#announce-testplatform` for general OpenShift CI outages and issues. Get help at `#forum-ocp-testplatform`.
5. **Quality Cabal Call**: Attend the call and provide a status update of the CI.

### Where Do Alerts Come In?

- **Main branch**: `#rhdh-e2e-alerts` Slack channel
- **Release branches**: Dedicated channels like `#rhdh-e2e-alerts-1-8`, `#rhdh-e2e-alerts-1-9`, etc.
- **Infrastructure announcements**: `#announce-testplatform` (general OpenShift CI status)
- **Getting help**: `#forum-ocp-testplatform` (ask questions about CI platform issues, or see if others face similar issues)

Each alert includes links to the job logs, artifacts, and a summary of which deployments/tests passed or failed. Check the bookmarks/folders in the `#rhdh-e2e-alerts` channel for additional resources.
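
The channel naming above can be sketched as a small lookup. This is only an illustration: the `alert_channel` helper is hypothetical, and it assumes release branches are named `release-X.Y` and that their channels follow the `#rhdh-e2e-alerts-X-Y` pattern shown in the examples.

```python
def alert_channel(branch: str) -> str:
    """Return the Slack alert channel for a branch.

    Assumption: release channels replace the dot in the version with a
    hyphen, as in the examples #rhdh-e2e-alerts-1-8 and #rhdh-e2e-alerts-1-9.
    """
    if branch == "main":
        return "#rhdh-e2e-alerts"
    prefix = "release-"
    if branch.startswith(prefix) and len(branch) > len(prefix):
        # release-1.9 -> #rhdh-e2e-alerts-1-9
        return "#rhdh-e2e-alerts-" + branch[len(prefix):].replace(".", "-")
    raise ValueError(f"no alert channel known for branch {branch!r}")


print(alert_channel("main"))
print(alert_channel("release-1.9"))
```

If a channel does not exist for a given release, the channel list in Slack remains the source of truth.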

### Two Types of CI Jobs

| | Nightly (Periodic) Jobs | PR Check (Presubmit) Jobs |
|---|---|---|
| **Trigger** | Scheduled (usually once per night) | On PR creation/update, or `/ok-to-test` |
| **Scope** | Full suite: showcase, RBAC, runtime, sanity plugins, localization, auth providers | Smaller scope: showcase + RBAC only |
| **Platforms** | OCP (multiple versions), AKS, EKS, GKE, OSD-GCP | OCP only (single version) |
| **Install methods** | Helm and Operator | Helm only |
| **Alert channel** | `#rhdh-e2e-alerts` / `#rhdh-e2e-alerts-{version}` | PR status checks on GitHub |

**Triggering jobs on a PR**: All nightly job variants can also be triggered on a PR by commenting `/test <job-name>`. Use `/test ?` to list all available jobs for that PR. This is useful for verifying a fix against a specific platform or install method before merging.

---

## How to Use This Guide

This guide is a **reference**, not a textbook. You don't need to read it cover-to-cover before your rotation starts. Instead, use it as a companion that you come back to as situations arise during the week.

### Getting Started (Day 1)

When your rotation begins:

1. **Read the [Overview](#overview)** above to understand the role and where alerts come in.
2. **Familiarize yourself with the [Useful Links and Tools](#useful-links-and-tools)** section -- open the Prow dashboards, join the Slack channels, and make sure you have access.
3. **Review the [Internal Resources doc](https://docs.google.com/document/d/1yiMU-u2v8_rC-TBawcaJwV5jAvWcbTjhspuTe3KNcCo/edit?usp=sharing)** -- it covers Vault secrets, ReportPortal dashboards, DevLake analytics, and how to unredact artifacts. These are internal tools you'll need during triage.
4. **Try the [AI Test Triager](#ai-test-triager-nightly-test-alerts)** on a recent failure in `#rhdh-e2e-alerts` to see how it works. It will handle most of the initial analysis for you.

That's enough to start triaging.

### During Your Rotation

Use the rest of the guide on demand as you encounter specific situations:

| Situation | Section to consult |
|-----------|-------------------|
| A job failed and you need to find the logs | [Where to Find Logs and Artifacts](#where-to-find-logs-and-artifacts) |
| You can't tell *where* in the pipeline it broke | [Job Lifecycle and Failure Points](#job-lifecycle-and-failure-points) |
| You need to understand what a specific job does | [Job Types Reference](#job-types-reference) |
| You're unsure if it's infra, deployment, or a test bug | [Identifying Failure Types](#identifying-failure-types) |
| You need to re-trigger a job or access a cluster | [Useful Links and Tools](#useful-links-and-tools) |

### Understanding the CI Scripts

The guide links heavily to scripts in `.ci/pipelines/`. You don't need to read those scripts upfront either. When you're investigating a failure and need to understand what a specific phase does, follow the links from the relevant [Job Lifecycle](#job-lifecycle-and-failure-points) or [Job Types](#job-types-reference) section to the source code.

Key entry points if you do want to explore:
- [`.ci/pipelines/openshift-ci-tests.sh`](../../.ci/pipelines/openshift-ci-tests.sh) -- the main dispatcher, start here to understand how jobs are routed
- [`.ci/pipelines/jobs/`](../../.ci/pipelines/jobs/) -- one handler per job type, each is self-contained
- [`.ci/pipelines/lib/testing.sh`](../../.ci/pipelines/lib/testing.sh) -- how tests are executed, health-checked, and artifacts collected

### Improving This Guide

This guide is a living document. When you finish your rotation:

- **Update outdated information** -- job names, namespaces, and platform details change over time.
- **Clarify anything that confused you** -- if you had to figure something out the hard way, save the next person the trouble.
- **Remove stale content** -- if a job type or failure mode no longer exists, remove it rather than leaving it to confuse future medics.

Small, incremental improvements after each rotation keep this guide accurate and useful.

---

## Anatomy of a Prow Job

### Job Naming Convention

Nightly jobs follow this pattern:

periodic-ci-redhat-developer-rhdh-{BRANCH}-e2e-{PLATFORM}-{INSTALL_METHOD}[-{VARIANT}]-nightly


Breaking it down:

| Segment | Values | Meaning |
|---------|--------|---------|
| `{BRANCH}` | `main`, `release-1.9`, `release-1.10` | Git branch being tested |
| `{PLATFORM}` | `ocp`, `ocp-v4-{VER}`, `aks`, `eks`, `gke`, `osd-gcp` | Target platform (OCP versions rotate as new releases come out) |
| `{INSTALL_METHOD}` | `helm`, `operator` | Installation method |
| `{VARIANT}` | `auth-providers`, `upgrade` | Optional -- specialized test scenario |

Examples:

- `periodic-ci-redhat-developer-rhdh-main-e2e-ocp-helm-nightly` -- OCP nightly with Helm on main
- `periodic-ci-redhat-developer-rhdh-release-1.9-e2e-aks-helm-nightly` -- AKS nightly for release 1.9
- `periodic-ci-redhat-developer-rhdh-main-e2e-ocp-operator-nightly` -- OCP nightly with Operator
- `periodic-ci-redhat-developer-rhdh-main-e2e-ocp-operator-auth-providers-nightly` -- Auth provider tests
- `periodic-ci-redhat-developer-rhdh-main-e2e-ocp-helm-upgrade-nightly` -- Upgrade scenario tests

PR check jobs use the `pull-ci-` prefix instead of `periodic-ci-`.
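
As a rough sketch, a nightly job name can be split into its segments with a regular expression. The `parse_job_name` helper is hypothetical and assumes the pattern exactly as documented above; the configured Prow jobs remain the source of truth if real names deviate from it.

```python
import re
from typing import Dict, Optional

# Assumption: the naming pattern documented above, with install method
# limited to "helm" or "operator" and an optional variant suffix.
JOB_NAME = re.compile(
    r"^periodic-ci-redhat-developer-rhdh-"
    r"(?P<branch>main|release-\d+\.\d+)-e2e-"
    r"(?P<platform>.+?)-"
    r"(?P<install_method>helm|operator)"
    r"(?:-(?P<variant>.+))?"
    r"-nightly$"
)


def parse_job_name(name: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the job name segments, or None if the name does not fit the pattern."""
    match = JOB_NAME.match(name)
    return match.groupdict() if match else None


print(parse_job_name(
    "periodic-ci-redhat-developer-rhdh-main-e2e-ocp-operator-auth-providers-nightly"
))
```

The platform group is matched lazily so that multi-part platforms such as `osd-gcp` stay intact while the install method is still recognized.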

### How the Pipeline Works

[Prow](https://docs.ci.openshift.org/docs/architecture/prow/) is the CI scheduler. It triggers [ci-operator](https://docs.ci.openshift.org/docs/architecture/ci-operator/), which orchestrates the entire workflow:

Prow (scheduler)
└── ci-operator (orchestrator)                                              ── openshift/release repo
    ├── 1. Claim/provision cluster:                                         ── (ci-operator config
    │      - OCP: ephemeral cluster from Hive                               ── + step registry)
    │      - AKS/EKS: provisioned on demand via Mapt
    │      - GKE: long-running shared cluster
    ├── 2. Clone rhdh repo & Wait for RHDH image (if it needs to be built)  ── openshift/release repo
    ├── 3. Run test step in e2e-runner image                                ── rhdh repo
    │   ├── a. Install operators (Tekton, etc.)                             ── (.ci/pipelines/
    │   ├── b. Deploy RHDH (Helm or Operator)                               ── openshift-ci-tests.sh)
    │   ├── c. Wait for deployment health check
    │   ├── d. Run Playwright tests
    │   └── e. Collect artifacts
    ├── 4. Run post-steps                                                   ── openshift/release repo
    │      (send Slack alert, collect must-gather)                          ── (step registry)
    └── 5. Release cluster


Steps 2 and 3 run inside the [`e2e-runner`](https://quay.io/repository/rhdh-community/rhdh-e2e-runner?tab=tags) image, which is built by a [GitHub Actions workflow](../../.github/workflows/push-e2e-runner.yaml) and mirrored into OpenShift CI.

Each phase can fail independently. Knowing *where* in this pipeline the failure occurred is the first step in triage.

---

## Where to Find Logs and Artifacts

### Navigating the Prow UI

When you click on a failed job (from Slack alert or Prow dashboard), you land on the **Spyglass** view. This page shows:

- **Job metadata**: branch, duration, result
- **Build log**: the top-level `build-log.txt` (ci-operator output)
- **JUnit results**: parsed test results if available (if Playwright ran and test cases failed)
- **Artifacts link**: link to the full GCS artifact tree

### Monitoring a Running PR Check in Real Time

While a PR check is running, you can monitor its live progress, logs, and system resource usage directly in the OpenShift CI cluster console.

**How to find the link:**

1. Open the Prow job page for the PR check (e.g., from the GitHub PR status check "Details" link). The URL looks like:

https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/redhat-developer_rhdh/{PR_NUMBER}/{JOB_NAME}/{BUILD_ID}

2. In the **build log**, look for a line near the top like:

Using namespace https://console.build08.ci.openshift.org/k8s/cluster/projects/ci-op-XXXXXXXX

3. Click that link to open the OpenShift console for the CI namespace where the job is running.
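
If you have the build log as text, the namespace link from step 2 can also be pulled out programmatically. The following is a minimal sketch: `find_ci_namespace` is a hypothetical helper, and it assumes the log line has exactly the `Using namespace <URL>` shape quoted above.

```python
import re
from typing import Optional, Tuple

# Matches the "Using namespace <console URL>" line quoted in step 2 above.
NAMESPACE_LINE = re.compile(
    r"Using namespace (https://\S+/k8s/cluster/projects/(ci-op-\w+))"
)


def find_ci_namespace(build_log: str) -> Optional[Tuple[str, str]]:
    """Return (console_url, namespace) from build log text, or None if absent."""
    match = NAMESPACE_LINE.search(build_log)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Placeholder line copied from the guide; a real log has a concrete namespace ID.
sample_log = (
    "Using namespace "
    "https://console.build08.ci.openshift.org/k8s/cluster/projects/ci-op-XXXXXXXX\n"
)
print(find_ci_namespace(sample_log))
```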

**What you can see in the CI namespace:**

- **Pods**: All pods running for the job (test container, sidecar containers, etc.)
- **Pod logs**: Live streaming logs from each container
- **Events**: Kubernetes events (scheduling, image pulls, failures)
- **Resource usage**: CPU and memory metrics for the running pods
- **Terminal**: You can open a terminal into a running pod for live debugging

This is especially useful when:
- A job is hanging and you want to see what it's doing right now
- You need to check pod resource consumption (OOM suspicion)
- You want to watch deployment progress in real time rather than waiting for artifacts

**Logging into the claimed cluster (OCP jobs):** While a job is executing, you can also log into the ephemeral OCP cluster using [`ocp-cluster-claim-login.sh`](../../.ci/pipelines/ocp-cluster-claim-login.sh). See [`.ci/pipelines/README.md`](../../.ci/pipelines/README.md) for prerequisites, access requirements, and usage.

**Prerequisite**: You must be a member of the `openshift` GitHub organization. Request access at [DevServices GitHub Access Request](https://devservices.dpp.openshift.com/support/github_access_request/).

### Artifact Directory Structure

artifacts/


</details>

<details><summary><a href='https://github.com/redhat-developer/rhdh/pull/4511/files#diff-aac25f9d3268c0444314629609063fc7c71cd79f3ce61e9c8b73eda238f348f9R17-R55'><strong>Potential Inaccuracy</strong></a>

The updated `/ok-to-test` guidance states it requires membership in both `openshift` and `redhat-developer`. In many OpenShift CI setups the permission is governed by Prow plugin configuration (often based on `openshift` org and/or repo-specific ACLs). Confirm the exact org/team requirement matches current Prow configuration; otherwise contributors may be blocked by incorrect documentation.
</summary>

```markdown

1. **Commenting `/ok-to-test`:**
   - **Purpose:** This command is used to validate a PR for testing, especially important for external contributors or when tests are not automatically triggered.
   - **Who Can Use It:** Only members of both the [openshift](https://github.com/openshift) and [redhat-developer](https://github.com/redhat-developer) GitHub organizations can mark the PR with this comment.
   - **Use Cases:**
     - **External Contributors:** For PRs from contributors outside the organization, a member needs to comment `/ok-to-test` to initiate tests.
   - **More Details:** For additional information about `/ok-to-test`, please refer to the [Kubernetes Community Pull Requests Guide](https://github.com/kubernetes/community/blob/master/contributors/guide/pull-requests.md#more-about-ok-to-test).

2. **Triggering Tests Post-Validation:**
   - After an `openshift` and `redhat-developer` org member has validated the PR with `/ok-to-test`, anyone can trigger tests using the following commands:
     - `/test ?` to get a list of all available jobs
     - `/test e2e-ocp-helm` for mandatory PR checks
   - **Note:** Avoid using `/test all` as it may trigger unnecessary jobs and consume CI resources. Instead, use `/test ?` to see available options and trigger only the specific tests you need.
3. **Triggering Optional Nightly Job Execution on Pull Requests:**
   The following optional nightly jobs can be manually triggered on PRs targeting the `main` branch and `release-*` branches. These jobs help validate changes across various deployment environments by commenting the trigger command on PR.

   **Job Name Format:** Jobs follow the naming scheme `redhat-developer-rhdh-PLATFORM-[VERSION]-INSTALL_METHOD-[SPECIAL_TEST]-nightly` where:
   - `PLATFORM`: The target platform (e.g., `ocp`, `aks`, `gke`)
   - `VERSION`: The platform version (e.g., `v4-17`, `v4-18`, `v4-19`)
   - `INSTALL_METHOD`: The deployment method (e.g., `helm`, `operator`)
   - `SPECIAL_TEST`: Optional special test type (e.g., `auth-providers`, `upgrade`)

   Use `/test ?` to see the complete list of available jobs for your specific branch and PR context.

These interactions are picked up by the OpenShift-CI service, which sets up a test environment. The configurations and steps for setting up this environment are defined in the `openshift-ci-tests.sh` script. For more details, see the [High-Level Overview of `openshift-ci-tests.sh`](#high-level-overview-of-openshift-ci-testssh).

### Retrying Tests

If the initial automatically triggered tests fail, OpenShift-CI will add a comment to the PR with information on how to retrigger the tests.

### CI Job Definitions

#### Pull Request Test Job

- **Purpose:** Validate new PRs for code quality, functionality, and integration.
- **Trigger:**
  - **Automatic:** When a PR includes code changes affecting tests (excluding doc-only changes), tests are automatically triggered.
  - **Manual:** When `/ok-to-test` is commented by an `openshift` and `redhat-developer` org member for external contributors or when `/test`, `/test images`, or `/test e2e-ocp-helm` is commented after validation.
- **Environment:** Runs on ephemeral OpenShift clusters managed by Hive. Kubernetes jobs use ephemeral EKS and AKS clusters on spot instances managed by [Mapt](https://github.com/redhat-developer/mapt). GKE uses a long-running cluster.
```

</details>

📚 Focus areas based on broader codebase context

Inconsistency

The newly documented nightly job naming convention includes an optional [-{VARIANT}] suffix and lists {PLATFORM} values like ocp-v4-{VER}. Validate that these conventions match the actual configured Prow job names used in this repo, and update the pattern/examples if variants and versioned platform tokens are not part of the real job names. (Ref 5)

Nightly jobs follow this pattern:

periodic-ci-redhat-developer-rhdh-{BRANCH}-e2e-{PLATFORM}-{INSTALL_METHOD}[-{VARIANT}]-nightly


Breaking it down:

| Segment | Values | Meaning |
|---------|--------|---------|
| `{BRANCH}` | `main`, `release-1.9`, `release-1.10` | Git branch being tested |
| `{PLATFORM}` | `ocp`, `ocp-v4-{VER}`, `aks`, `eks`, `gke`, `osd-gcp` | Target platform (OCP versions rotate as new releases come out) |
| `{INSTALL_METHOD}` | `helm`, `operator` | Installation method |
| `{VARIANT}` | `auth-providers`, `upgrade` | Optional -- specialized test scenario |

Reference reasoning: Existing e2e documentation describes a simpler nightly job name pattern ending in ...-e2e-{PLATFORM}-{METHOD}-nightly and points readers to the configured jobs list as the source of truth. Aligning the new guide’s pattern with that established convention (or explicitly reconciling differences) avoids stale or misleading job-name guidance.

📄 References
  1. redhat-developer/rhdh/docs/e2e-tests/CI.md [1-4]
  2. redhat-developer/rhdh/docs/e2e-tests/CI.md [5-8]
  3. redhat-developer/rhdh/docs/e2e-tests/CI.md [9-13]
  4. redhat-developer/rhdh/e2e-tests/README.md [463-474]
  5. redhat-developer/rhdh/e2e-tests/README.md [475-480]
  6. redhat-developer/rhdh/e2e-tests/README.md [429-431]
  7. redhat-developer/rhdh/e2e-tests/README.md [439-444]
  8. redhat-developer/rhdh/e2e-tests/README.md [432-438]

@rhdh-qodo-merge

PR Type

Documentation


Description

  • Add comprehensive CI Medic Guide for investigating e2e test failures

  • Fix stale Slack channel references across CI documentation

  • Update CI documentation with current platform support and job handlers

  • Replace hardcoded cluster pool details with links to OpenShift CI docs


File Walkthrough

Relevant files

Documentation (9 files)

| File | Description | Changes |
|---|---|---|
| CI-medic-guide.md | Add comprehensive CI Medic Guide for test failure investigation | +639/-0 |
| CI-medic-guide.local.md | Add internal companion guide for Vault, ReportPortal, DevLake | +181/-0 |
| README.md | Replace hardcoded cluster pools with OpenShift CI docs link | +17/-41 |
| CI.md | Fix Slack channel, org references, typos, and platform details | +21/-14 |
| enhanced-ci-reporting.md | Fix Slack channel name and add release branch channels | +4/-4 |
| ci-e2e-testing.md | Update function refs, Slack channel, cluster pools, job handlers | +1/-1 |
| ci-e2e-testing.md | Update function refs, Slack channel, cluster pools, job handlers | +15/-31 |
| ci-e2e-testing.mdc | Update function refs, Slack channel, cluster pools, job handlers | +15/-31 |
| ci-e2e-testing.md | Update function refs, Slack channel, cluster pools, job handlers | +15/-31 |

@rhdh-qodo-merge

rhdh-qodo-merge bot commented Apr 1, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: General
Remove inaccessible internal links from documentation

Remove links to internal-only Google Docs from the public CI-medic-guide.md and
rephrase the text to avoid broken links for external contributors.

docs/e2e-tests/CI-medic-guide.md [24-635]

-The complete role description is described in [this Google Doc](https://docs.google.com/document/d/1CjqSQYA6g35-95OpHXobcJdWFRGS5yu-MV8-mfuDmQA/edit?usp=sharing)
+The complete role description is maintained internally.
 ...
-3. **Review the [Internal Resources doc](https://docs.google.com/document/d/1yiMU-u2v8_rC-TBawcaJwV5jAvWcbTjhspuTe3KNcCo/edit?usp=sharing)** -- it covers Vault secrets, ReportPortal dashboards, DevLake analytics, and how to unredact artifacts. These are internal tools you'll need during triage.
+3. **Review internal resource documentation** -- for internal team members, this covers Vault secrets, ReportPortal dashboards, DevLake analytics, and how to unredact artifacts. These are internal tools you'll need during triage.
 ...
-- [Internal Resources (Google Doc)](https://docs.google.com/document/d/1yiMU-u2v8_rC-TBawcaJwV5jAvWcbTjhspuTe3KNcCo/edit?usp=sharing) -- Vault secrets, ReportPortal, DevLake, unredacting artifacts (Red Hat internal)
+- Internal Resources -- For internal team members, documentation on Vault secrets, ReportPortal, DevLake, and unredacting artifacts is available internally.

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

Why: The suggestion correctly identifies that public-facing documentation contains links to inaccessible internal resources, and removing them improves the documentation's quality and user experience for external contributors.

Impact: Low

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Contributor

github-actions bot commented Apr 1, 2026

The container image build workflow finished with status: cancelled.

@github-actions
Contributor

github-actions bot commented Apr 1, 2026

The container image build workflow finished with status: cancelled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Contributor

github-actions bot commented Apr 1, 2026

The container image build workflow finished with status: cancelled.

Replace IBM Cloud references with OCP and fix Slack channel name
from #rhdh-e2e-test-alerts to #rhdh-e2e-alerts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Contributor

github-actions bot commented Apr 1, 2026

The container image build workflow finished with status: cancelled.

@github-actions
Contributor

github-actions bot commented Apr 1, 2026

Image was built and published successfully. It is available at:

@zdrapela
Member Author

zdrapela commented Apr 2, 2026

/review
-i

@rhdh-qodo-merge

Persistent review updated to latest commit 2a98bb2

@zdrapela
Member Author

zdrapela commented Apr 2, 2026

/agentic_review

@rhdh-qodo-merge

rhdh-qodo-merge bot commented Apr 2, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX Issues (0)

Remediation recommended

1. Wrong access prerequisite 🐞 Bug ≡ Correctness
Description
docs/e2e-tests/CI-medic-guide.md states that logging into a claimed cluster requires openshift
GitHub org membership, but the repo’s own CI docs and the ocp-cluster-claim-login.sh script indicate
access is gated by permissions to the hosted-mgmt namespace (with guidance to join
rhdh-pool-admins). This mismatch can send responders to request the wrong access and delay failure
triage.
Code

docs/e2e-tests/CI-medic-guide.md[R207-210]

+**Logging into the claimed cluster (OCP jobs):** While a job is executing, you can also log into the ephemeral OCP cluster using [`ocp-cluster-claim-login.sh`](../../.ci/pipelines/ocp-cluster-claim-login.sh). See [`.ci/pipelines/README.md`](../../.ci/pipelines/README.md) for prerequisites, access requirements, and usage.
+
+**Prerequisite**: You must be a member of the `openshift` GitHub organization. Request access at [DevServices GitHub Access Request](https://devservices.dpp.openshift.com/support/github_access_request/).
+
Evidence
The CI Medic Guide asserts an openshift GitHub org prerequisite, while the pipeline README and the
login script both describe access in terms of cluster/namespace permissions and rhdh-pool-admins
group membership guidance when access is forbidden.

docs/e2e-tests/CI-medic-guide.md[207-210]
.ci/pipelines/README.md[67-69]
.ci/pipelines/ocp-cluster-claim-login.sh[39-45]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`docs/e2e-tests/CI-medic-guide.md` states a prerequisite (membership in the `openshift` GitHub org) for using `ocp-cluster-claim-login.sh` that does not match the access guidance enforced/documented in the repo (namespace access / `rhdh-pool-admins`). This can misdirect CI medics during incident response.

### Issue Context
The guide already links to `.ci/pipelines/README.md` for prerequisites, but then adds a conflicting prerequisite line.

### Fix Focus Areas
- docs/e2e-tests/CI-medic-guide.md[207-210]

### Suggested change
- Remove or rewrite the "Prerequisite" line to match repo guidance (e.g., reference `rhdh-pool-admins` / hosted-mgmt namespace access as per `.ci/pipelines/README.md` and the script’s Forbidden handling), and keep `.ci/pipelines/README.md` as the single source of truth.


@sonarqubecloud

sonarqubecloud bot commented Apr 2, 2026

@github-actions
Contributor

github-actions bot commented Apr 2, 2026

Image was built and published successfully. It is available at:

@openshift-ci

openshift-ci bot commented Apr 2, 2026

@zdrapela: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-ocp-helm | 6a86212 | link | true | `/test e2e-ocp-helm` |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
