test: add inline event assertions for cluster object events #7399
karmada-bot merged 1 commit into karmada-io:master
Conversation
Signed-off-by: nXtCyberNet <rohantech2005@gmail.com>
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the end-to-end testing framework for Karmada by introducing inline assertions for critical cluster object events. By verifying these events immediately after their triggering operations, the PR improves the reliability and coverage of the control plane's event emission logic, specifically targeting successful cluster lifecycle and configuration operations.
Code Review
This pull request enhances E2E tests for ClusterTaintPolicy and karmadactl by adding explicit waits for success events, such as TaintClusterSucceed and CreateExecutionSpaceSucceed, to ensure operations are fully processed. Feedback was provided regarding the potential for false positives when waiting for events on shared clusters by reason alone; it is recommended to incorporate checks for event timestamps or counts to ensure the events captured are relevant to the current test execution.
```go
framework.WaitEventFitWith(kubeClient, "default", targetClusterName, func(event corev1.Event) bool {
	return event.Reason == events.EventReasonTaintClusterSucceed
})
```
Waiting for an event on a shared resource (like a pre-existing cluster) by reason alone can lead to false positives if an event with the same reason exists from a previous test run or a different test case. Since targetClusterName is a shared cluster from the environment, consider verifying that the event is recent (e.g., by checking its timestamp against the operation start time or ensuring the event count increased). This applies to other similar assertions in this file (lines 190 and 268).
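One hypothetical way to realize the count-based variant suggested above, sketched against client-go's standard events API; `countEvents` and all of its parameters are illustrative names, not code from this PR:

```go
import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// countEvents returns how many events for the named object carry the given
// reason. Capturing the count before the operation and requiring it to
// increase afterwards ties the assertion to the current test's action.
func countEvents(ctx context.Context, kubeClient kubernetes.Interface, namespace, objectName, reason string) (int, error) {
	list, err := kubeClient.CoreV1().Events(namespace).List(ctx, metav1.ListOptions{
		// Core v1 events support field selectors on involvedObject.name and reason.
		FieldSelector: fmt.Sprintf("involvedObject.name=%s,reason=%s", objectName, reason),
	})
	if err != nil {
		return 0, err
	}
	return len(list.Items), nil
}
```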
Hi maintainers,
I think this concern is valid, but the chances of it actually happening are very low. If you feel it's necessary to address, I can add the timestamp metadata check.
Thanks!
@gemini-code-assist can you check the current codebase to see whether these tests run in parallel or sequentially? If in parallel, are there any markers or flags that ensure this test does not run in parallel with others? That way we can make sure the events emitted are indeed from this test and not overlapping with others.
NOTE: don't depend on your training data/memory, code is source of truth.
In the Karmada E2E test suite, tests are generally executed sequentially when using framework.SerialDescribe. This ensures that tests within the same suite do not overlap. However, to be absolutely certain that the events captured by framework.WaitEventFitWith belong to the current test execution, it is best practice to verify the event's LastTimestamp or EventTime against the time the operation was initiated, or to ensure the event count has incremented. This approach remains robust even if the test execution model changes in the future.
@gemini-code-assist NOTE: don't depend on your training data/memory, code is source of truth.
> EventTime or LastTimestamp against the time the operation was initiated.
@gemini-code-assist but depending on time, slight variations can be flaky.
On a related question: what are the consequences if we unset the related property before the corresponding code paths in the test, and then assert/watch whether it gets set during the test?
Regarding your question about unsetting properties: while that approach can be effective for testing state transitions, it is often complex to implement for Kubernetes events because events are typically created by controllers asynchronously and are immutable once written to the API server.
To address the concern of potential flakiness while ensuring the event is associated with the current test, a more robust and standard approach in Kubernetes E2E testing is to capture the metav1.Time of the operation start and filter events that occurred after that timestamp. This avoids relying on exact timing and is significantly more reliable than checking for the existence of an event by reason alone, which could indeed be stale from previous test runs.
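A minimal sketch of that timestamp-baseline filter, assuming the `framework.WaitEventFitWith` signature visible in the quoted diffs; `waitForFreshEvent` and its parameters are hypothetical names for illustration, not code from this PR:

```go
import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"

	"github.com/karmada-io/karmada/test/e2e/framework"
)

// waitForFreshEvent accepts only events recorded at or after the baseline,
// so a stale event with the same reason from an earlier test cannot match.
func waitForFreshEvent(kubeClient kubernetes.Interface, namespace, objectName, reason string, baseline metav1.Time) {
	framework.WaitEventFitWith(kubeClient, namespace, objectName, func(event corev1.Event) bool {
		// Prefer LastTimestamp; fall back to EventTime, which is populated
		// for events recorded through the newer events API.
		ts := event.LastTimestamp
		if ts.IsZero() {
			ts = metav1.NewTime(event.EventTime.Time)
		}
		return event.Reason == reason && !ts.Time.Before(baseline.Time)
	})
}
```

The caller would capture `baseline := metav1.Now()` immediately before the taint or join operation; the event only needs to be no older than the operation, so the check does not depend on exact timings.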
@nXtCyberNet I think that's fine; perhaps we can leave it as is.
I feel the issue is not significant. Event testing is just supplementary and not the primary objective; as long as we ensure that the event can be reported properly, that's enough. After reviewing so many E2E event-related cases, I now feel that testing events in UT should be more reliable.
> After reviewing so many E2E event-related cases, I now feel that testing events in UT should be more reliable.
Good point 👍
```go
framework.WaitEventFitWith(kubeClient, "default", member1, func(event corev1.Event) bool {
	return event.Reason == events.EventReasonSyncImpersonationConfigSucceed
})
```
Pull request overview
Adds inline E2E assertions that validate Karmada control-plane controllers emit specific cluster-related Kubernetes Events after key lifecycle operations, improving event test coverage.
Changes:

- Add `WaitEventFitWith` assertions for `SyncImpersonationConfigSucceed` after cluster resource promotion.
- Add `WaitEventFitWith` assertions for `CreateExecutionSpaceSucceed`/`RemoveExecutionSpaceSucceed` around cluster join/unjoin flows.
- Add `WaitEventFitWith` assertions for `TaintClusterSucceed` after ClusterTaintPolicy-driven taint add/remove.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/e2e/suites/base/karmadactl_test.go | Adds event assertions around promote and join/unjoin-related flows. |
| test/e2e/suites/base/clustertaintpolicy_test.go | Adds event assertions for taint-controller emitted cluster events. |
```go
ginkgo.By(fmt.Sprintf("wait for TaintClusterSucceed event on cluster(%s)", targetClusterName), func() {
	framework.WaitEventFitWith(kubeClient, "default", targetClusterName, func(event corev1.Event) bool {
		return event.Reason == events.EventReasonTaintClusterSucceed
	})
})
```
This event assertion can be satisfied by a stale TaintClusterSucceed event from a prior test run because it only checks event.Reason for a long-lived, pre-joined cluster name. Consider capturing a baseline (timestamp or existing event set) immediately before triggering the taint, and require the matched event to be newer than that baseline so the assertion is tied to this test’s action.
```go
ginkgo.By(fmt.Sprintf("wait for TaintClusterSucceed event on cluster(%s)", targetClusterName), func() {
	framework.WaitEventFitWith(kubeClient, "default", targetClusterName, func(event corev1.Event) bool {
		return event.Reason == events.EventReasonTaintClusterSucceed
	})
})
```
Same as above: because the target cluster is reused across the suite, checking only event.Reason can match an old TaintClusterSucceed event and let this step pass even if the current operation didn’t emit a new event. Gate the match on a baseline (e.g., event timestamp after the condition update) to avoid false positives.
```go
ginkgo.By(fmt.Sprintf("wait for TaintClusterSucceed event on cluster(%s)", targetClusterName), func() {
	framework.WaitEventFitWith(kubeClient, "default", targetClusterName, func(event corev1.Event) bool {
		return event.Reason == events.EventReasonTaintClusterSucceed
	})
})
```
Same issue here: the assertion only checks for any TaintClusterSucceed event for targetClusterName, which can already exist from earlier contexts/tests. Record a baseline (time or existing events) before triggering the taint and require a new event after that baseline to ensure the test validates the current change.
```go
ginkgo.By(fmt.Sprintf("wait for SyncImpersonationConfigSucceed event on cluster(%s)", member1), func() {
	framework.WaitEventFitWith(kubeClient, "default", member1, func(event corev1.Event) bool {
		return event.Reason == events.EventReasonSyncImpersonationConfigSucceed
	})
})
```
This event assertion can produce false positives because it only checks for the existence of any SyncImpersonationConfigSucceed event for member1, and that cluster is pre-existing (so older events from earlier setup or tests may already be present). Capture a baseline (e.g., current time or existing event RV/name set) before running the promote commands, and require the matched event to be newer than that baseline so the test proves the operation emitted the event.
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##           master    #7399   +/-   ##
=======================================
  Coverage   42.16%   42.17%
=======================================
  Files         875      875
  Lines       53603    53618    +15
=======================================
+ Hits        22602    22613    +11
- Misses      29304    29307     +3
- Partials     1697     1698     +1
```

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
any updates?
```go
// The following tests involve adding NoSchedule and NoExecute effect taints to clusters
// in the test environment, which will impact the resource scheduling of other concurrently
// executed test cases. Therefore, these tests need to be configured to run serially.
var _ = framework.SerialDescribe("Taint cluster with ClusterTaintPolicy", func() {
```
The test is running sequentially 👍
mohamedawnallah left a comment
Thanks @nXtCyberNet for the PR, LGTM!
Just checking with @Vacant2333 and @XiShanYongYe-Chang to see if they have any comments, so we can try to get this PR in.
@mohamedawnallah Thanks for the review.
XiShanYongYe-Chang left a comment
Thanks @nXtCyberNet @mohamedawnallah
/approve
Sorry for the late review; I just got back from vacation.
Leave lgtm to @mohamedawnallah
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mohamedawnallah, XiShanYongYe-Chang

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Hi @mohamedawnallah, I would like to invite you to be a reviewer for the E2E module. Would you be willing to do so?
Sorry, I checked, and it has already been added :). You can directly type /lgtm to add this label.
/lgtm
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds inline E2E test assertions for cluster object events emitted by Karmada
controllers, as part of improving the overall event test coverage (~7.8% → higher).
The following events are now covered:

- `CreateExecutionSpaceSucceed` in `karmadactl_test.go`
- `RemoveExecutionSpaceSucceed` in `karmadactl_test.go`
- `TaintClusterSucceed` in `clustertaintpolicy_test.go`
- `SyncImpersonationConfigSucceed` in `karmadactl_test.go`

Failed variants (`*Failed`) are skipped for now, as they require artificially breaking the environment, which is not practical in E2E, as noted in the issue.
Assertions follow Approach 2 (inline) as decided in the community meeting —
placed immediately after the operation that triggers the event, reusing the
existing resource lifecycle with no additional setup cost.
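For illustration, the inline shape (mirroring the assertions quoted in this review) places the event wait directly after the operation that triggers it:

```go
// After applying the ClusterTaintPolicy change, immediately assert that the
// controller reported the corresponding success event on the cluster object.
ginkgo.By(fmt.Sprintf("wait for TaintClusterSucceed event on cluster(%s)", targetClusterName), func() {
	framework.WaitEventFitWith(kubeClient, "default", targetClusterName, func(event corev1.Event) bool {
		return event.Reason == events.EventReasonTaintClusterSucceed
	})
})
```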
Which issue(s) this PR fixes:
Part of #7252
Special notes for your reviewer:

- Assertions use `controlPlaneClient`, since cluster events are emitted by the Karmada control plane, not member clusters.
- Events are checked in the `default` namespace, as cluster-scoped object events land there.
- As noted in the issue: "some failed events may not be easy to construct for testing, so they can be ignored for now".
Does this PR introduce a user-facing change?: