NO-JIRA: feat: add remaining netedge evaluations (follow-up to #138)#195
Conversation
This adds mcpchecker (gevals) evaluations for the four remaining netedge tools that were not covered in PR openshift#138: - inspect_route - get_service_endpoints - probe_dns_local - probe_http Also updates the NETEDGE-validation-how-to.md doc with the new test commands.
📝 WalkthroughWalkthroughReplaces the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Agent as Gemini Agent (acp)
participant Judge as Builtin LLM Judge
participant MCP as MCP Tools
participant K8s as Kubernetes Cluster
User->>Agent: run `gemini --acp` (mcpchecker-driven eval)
Agent->>Judge: submit evaluation prompt / data
Judge->>Agent: verdict / scoring
Agent->>MCP: invoke tool (e.g., inspect_route, probe_http, get_service_endpoints, probe_dns_local)
MCP->>K8s: perform requested action (kubectl/tool calls)
K8s-->>MCP: return results
MCP-->>Agent: tool output
Agent-->>User: aggregated evaluation output
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
evals/tasks/netedge/probe_http/task.yaml (1)
8-14: Constrain the asserted method to GET.The prompt requires GET, but current assertion only checks URL. Adding
method: "GET"makes the evaluation intent explicit and resilient to defaults.Suggested patch
assertions: toolsUsed: - server: kubernetes toolPattern: "probe_http" args: url: "https://kubernetes.default.svc:443" + method: "GET"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@evals/tasks/netedge/probe_http/task.yaml` around lines 8 - 14, The assertion for the probe_http tool currently only specifies args.url ("https://kubernetes.default.svc:443") and doesn't constrain the HTTP method; update the toolsUsed entry for toolPattern "probe_http" to include args.method set to "GET" so the evaluation explicitly requires an HTTP GET request (i.e., add method: "GET" alongside the existing url in the args for the probe_http tool).docs/openshift/NETEDGE-validation-how-to.md (1)
60-78: Optional: reduce command duplication by exporting env vars once.These four new commands are correct, but repeating the same export chain on every line makes updates error-prone.
Suggested doc cleanup
+export RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY +export JUDGE_API_KEY=$OPENAI_API_KEY +export JUDGE_BASE_URL="https://api.openai.com/v1" +export JUDGE_MODEL_NAME="gpt-4o" + **Test 3: Inspect Route** ```bash -export RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY && export JUDGE_API_KEY=$OPENAI_API_KEY && export JUDGE_BASE_URL="https://api.openai.com/v1" && export JUDGE_MODEL_NAME="gpt-4o" && ./gevals check evals/gemini-agent/eval.yaml --run "inspect-route" -v +./gevals check evals/gemini-agent/eval.yaml --run "inspect-route" -v...
</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@docs/openshift/NETEDGE-validation-how-to.mdaround lines 60 - 78, Move the
repeated export chain out of each individual command and set the environment
once (e.g., a preamble or an env block) so the four runs only call the runner;
remove the leading exports from the commands that invoke ./gevals check
evals/gemini-agent/eval.yaml --run "inspect-route" / "get-service-endpoints" /
"probe-dns-local" / "probe-http" and instead place: export
RH_GEMINI_API_KEY=...; export JUDGE_API_KEY=...; export
JUDGE_BASE_URL="https://api.openai.com/v1"; export JUDGE_MODEL_NAME="gpt-4o"
once above the commands (or document using an env file or wrapper script) so
future updates change a single location rather than each run invocation.</details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Inline comments:
In@evals/tasks/netedge/probe_dns_local/task.yaml:
- Around line 8-15: The assertions currently check toolsUsed ->
server/toolPattern/args for the probe but do not enforce the DNS query type;
update the assertion for the probe_dns_local tool (toolPattern "probe_dns_local"
with server "172.30.0.10" and name "kubernetes.default.svc.cluster.local") to
include args.type set to "A" so the test only passes when an A-record query is
performed and returns the expected addresses/response code.
Nitpick comments:
In@docs/openshift/NETEDGE-validation-how-to.md:
- Around line 60-78: Move the repeated export chain out of each individual
command and set the environment once (e.g., a preamble or an env block) so the
four runs only call the runner; remove the leading exports from the commands
that invoke ./gevals check evals/gemini-agent/eval.yaml --run "inspect-route" /
"get-service-endpoints" / "probe-dns-local" / "probe-http" and instead place:
export RH_GEMINI_API_KEY=...; export JUDGE_API_KEY=...; export
JUDGE_BASE_URL="https://api.openai.com/v1"; export JUDGE_MODEL_NAME="gpt-4o"
once above the commands (or document using an env file or wrapper script) so
future updates change a single location rather than each run invocation.In
@evals/tasks/netedge/probe_http/task.yaml:
- Around line 8-14: The assertion for the probe_http tool currently only
specifies args.url ("https://kubernetes.default.svc:443") and doesn't constrain
the HTTP method; update the toolsUsed entry for toolPattern "probe_http" to
include args.method set to "GET" so the evaluation explicitly requires an HTTP
GET request (i.e., add method: "GET" alongside the existing url in the args for
the probe_http tool).</details> <details> <summary>🪄 Autofix (Beta)</summary> Fix all unresolved CodeRabbit comments on this PR: - [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended) - [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes </details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Path: .coderabbit.yaml **Review profile**: CHILL **Plan**: Pro **Run ID**: `9e86cd1b-50a9-4fca-bd51-a8960fa8f9b8` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between 4ef4ee53951b47772ddb5c5e87b29aeecd7a7bee and 3c84ced1b9613f23846a0e5ccffde4e18701e24d. </details> <details> <summary>📒 Files selected for processing (5)</summary> * `docs/openshift/NETEDGE-validation-how-to.md` * `evals/tasks/netedge/get_service_endpoints/task.yaml` * `evals/tasks/netedge/inspect_route/task.yaml` * `evals/tasks/netedge/probe_dns_local/task.yaml` * `evals/tasks/netedge/probe_http/task.yaml` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
|
@coderabbitai why did the fips-image-scan-openshift-mcp-server check fail? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 2185
You can see the details here: Prow job log All other CI jobs ( |
|
/test fips-image-scan-openshift-mcp-server |
…ent configuration - Pin DNS record type to 'A' in probe_dns_local task as requested in PR review. - Refine task prompts to be more explicit about using specialized tools. - Simplify gemini-agent configuration to use ACP (Agent Communication Protocol). - Update judge configuration to use builtin.llm-agent. Signed-off-by: bentito <btofel@redhat.com>
|
/assign @matzew |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
docs/openshift/NETEDGE-validation-how-to.md (3)
16-23: Consider clarifying path assumptions.The build instructions assume a specific directory layout (
../mcpcheckerrelative to../openshift-mcp-server/). Consider adding a note that users should adjust paths based on their actual clone locations, or use absolute paths in the example.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/openshift/NETEDGE-validation-how-to.md` around lines 16 - 23, The "Build/Install mcpchecker" step assumes a specific relative layout (cloning to ../mcpchecker and moving the binary to ../openshift-mcp-server) which may confuse users; update the documentation section to explicitly state that paths are examples and should be adjusted to match the user's clone locations, and provide an alternative using an explicit destination variable or absolute paths (e.g., mention using a WORKDIR or DEST_DIR variable) so readers understand they can replace ../mcpchecker and ../openshift-mcp-server with their actual paths when running the git clone, go build, and mv commands.
51-77: Simplify redundant environment variable exports.Each test command includes
export RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY &&, which is redundant if the variable is already set (as indicated in Prerequisites). Consider simplifying by either:
- Removing the export from each command (assuming users follow the Prerequisites)
- Adding a single
export RH_GEMINI_API_KEYinstruction before all tests📝 Simplified example
3. **Run the Evaluations**: + Ensure your API key is exported: + ```bash + export RH_GEMINI_API_KEY + ``` + Run `mcpchecker` checks using the Gemini Agent. **Test 1: Get CoreDNS Configuration** ```bash - export RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY && ./mcpchecker check evals/gemini-agent/eval.yaml --run "get-coredns-config" -v + ./mcpchecker check evals/gemini-agent/eval.yaml --run "get-coredns-config" -v ```🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/openshift/NETEDGE-validation-how-to.md` around lines 51 - 77, Remove the redundant per-test "export RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY &&" prefix from the Gemini Agent test commands (the lines that run ./mcpchecker with --run "get-coredns-config", "query-prometheus-ingress", "inspect-route", "get-service-endpoints", "probe-dns-local", and "probe-http") and either add a single "export RH_GEMINI_API_KEY" line once before the test blocks or assume the Prerequisites already set the variable and leave the commands as plain "./mcpchecker ..." invocations; update the code blocks accordingly so each test runs without the repeated export.
51-77: The test name format in the documentation is correct.The
--runparameter values match thename:fields defined in the task.yaml files (e.g.,get-coredns-config,inspect-route,get-service-endpoints), so the test commands will execute as intended.Minor: The
export RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY &&pattern can be simplified to justRH_GEMINI_API_KEY=$RH_GEMINI_API_KEY ./mcpchecker ...if the variable is already in the environment, avoiding the redundant export.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/openshift/NETEDGE-validation-how-to.md` around lines 51 - 77, The commands repeatedly use the redundant prefix `export RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY &&` before invoking ./mcpchecker; change each occurrence so the env var is set inline (e.g., `RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY ./mcpchecker ...`) for Test 1/2/3/4/5/6 lines that call ./mcpchecker with --run values like "get-coredns-config", "query-prometheus-ingress", "inspect-route", "get-service-endpoints", "probe-dns-local", and "probe-http".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@docs/openshift/NETEDGE-validation-how-to.md`:
- Around line 16-23: The "Build/Install mcpchecker" step assumes a specific
relative layout (cloning to ../mcpchecker and moving the binary to
../openshift-mcp-server) which may confuse users; update the documentation
section to explicitly state that paths are examples and should be adjusted to
match the user's clone locations, and provide an alternative using an explicit
destination variable or absolute paths (e.g., mention using a WORKDIR or
DEST_DIR variable) so readers understand they can replace ../mcpchecker and
../openshift-mcp-server with their actual paths when running the git clone, go
build, and mv commands.
- Around line 51-77: Remove the redundant per-test "export
RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY &&" prefix from the Gemini Agent test
commands (the lines that run ./mcpchecker with --run "get-coredns-config",
"query-prometheus-ingress", "inspect-route", "get-service-endpoints",
"probe-dns-local", and "probe-http") and either add a single "export
RH_GEMINI_API_KEY" line once before the test blocks or assume the Prerequisites
already set the variable and leave the commands as plain "./mcpchecker ..."
invocations; update the code blocks accordingly so each test runs without the
repeated export.
- Around line 51-77: The commands repeatedly use the redundant prefix `export
RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY &&` before invoking ./mcpchecker; change
each occurrence so the env var is set inline (e.g.,
`RH_GEMINI_API_KEY=$RH_GEMINI_API_KEY ./mcpchecker ...`) for Test 1/2/3/4/5/6
lines that call ./mcpchecker with --run values like "get-coredns-config",
"query-prometheus-ingress", "inspect-route", "get-service-endpoints",
"probe-dns-local", and "probe-http".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 6fa565ce-e1bc-46a1-b858-7c3f0104c08a
📒 Files selected for processing (2)
docs/openshift/NETEDGE-validation-how-to.mdevals/tasks/netedge/probe_http/task.yaml
✅ Files skipped from review due to trivial changes (1)
- evals/tasks/netedge/probe_http/task.yaml
|
@bentito: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bentito, matzew The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@bentito: This pull request explicitly references no jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/override "Konflux kflux-prd-rh02 / openshift-mcp-server-on-pull-request" |
|
@matzew: Overrode contexts on behalf of matzew: Konflux kflux-prd-rh02 / openshift-mcp-server-on-pull-request DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
1596731
into
openshift:main
[release-0.3] NO-JIRA: feat: add remaining netedge evaluations (backport #195)
|
The new netedge tasks are missing |
This PR introduces evaluation tasks for the four remaining netedge tools that were not covered in PR #138.
Changes:
inspect-route,get-service-endpoints,probe-dns-local, andprobe-httptasks toevals/tasks/netedge/to complete the core NIDS diagnostic capabilities test suite.docs/openshift/NETEDGE-validation-how-to.mdto include step-by-step instructions for running these 4 new validations.Verification:
Follow the steps in docs/openshift/NETEDGE-validation-how-to.md to run the evaluations.
inspect-route: Verifies inspecting the default OpenShift console route.get-service-endpoints: Verifies listing endpoint slices forrouter-default.probe-dns-local: Verifies querying cluster DNS via the MCP server host.probe-http: Verifies HTTP reachability to the internal Kubernetes API.Note: Similar to the previous evaluations, these require an OpenShift environment and will fail on Kind due to dependencies on OpenShift-specific CRDs and service discovery.
Summary by CodeRabbit
Documentation
New Features
Chores / Behavior