Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 19 additions & 1 deletion test/e2e/suites/base/clustertaintpolicy_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ import (

clusterv1alpha1 "github.com/karmada-io/karmada/pkg/apis/cluster/v1alpha1"
policyv1alpha1 "github.com/karmada-io/karmada/pkg/apis/policy/v1alpha1"
"github.com/karmada-io/karmada/pkg/events"
"github.com/karmada-io/karmada/pkg/util/helper"
"github.com/karmada-io/karmada/test/e2e/framework"
karmadaresource "github.com/karmada-io/karmada/test/e2e/framework/resource/karmada"
Expand Down Expand Up @@ -103,6 +104,12 @@ var _ = framework.SerialDescribe("Taint cluster with ClusterTaintPolicy", func()
})
})

ginkgo.By(fmt.Sprintf("wait for TaintClusterSucceed event on cluster(%s)", targetClusterName), func() {
framework.WaitEventFitWith(kubeClient, "default", targetClusterName, func(event corev1.Event) bool {
return event.Reason == events.EventReasonTaintClusterSucceed
})
Comment on lines +108 to +110
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Waiting for an event on a shared resource (like a pre-existing cluster) by reason alone can lead to false positives if an event with the same reason exists from a previous test run or a different test case. Since targetClusterName is a shared cluster from the environment, consider verifying that the event is recent (e.g., by checking its timestamp against the operation start time or ensuring the event count increased). This applies to other similar assertions in this file (lines 190 and 268).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi maintainers,
I think this concern is valid, but the chances of it actually happening are very low. If you feel it's necessary to address, I can add the timestamp metadata check.
Thanks!

Copy link
Copy Markdown
Member

@mohamedawnallah mohamedawnallah May 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gemini-code-assist can you check current codebase if tests running in parallel or sequentially? If in parallel are there any markers or flags that ensures this test not running in parallel with others? That way we can make sure those events emitted indeed from this test not overlapping with others.

NOTE: don't depend on your training data/memory, code is source of truth.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Karmada E2E test suite, tests are generally executed sequentially when using framework.SerialDescribe. This ensures that tests within the same suite do not overlap. However, to be absolutely certain that the events captured by framework.WaitEventFitWith belong to the current test execution, it is best practice to verify the event's LastTimestamp or EventTime against the time the operation was initiated, or to ensure the event count has incremented. This approach remains robust even if the test execution model changes in the future.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gemini-code-assist NOTE: don't depend on your training data/memory, code is source of truth.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EventTime or LastTimestamp against the time the operation was initiated.

@gemini-code-assist but depending on time slight variations can be flaky.

On a related question What are the consequences if we unset the related property before the corresponding code paths in the test and assert/watch if they are set during test

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding your question about unsetting properties: while that approach can be effective for testing state transitions, it is often complex to implement for Kubernetes events because events are typically created by controllers asynchronously and are immutable once written to the API server.

To address the concern of potential flakiness while ensuring the event is associated with the current test, a more robust and standard approach in Kubernetes E2E testing is to capture the metav1.Time of the operation start and filter events that occurred after that timestamp. This avoids relying on exact timing and is significantly more reliable than checking for the existence of an event by reason alone, which could indeed be stale from previous test runs.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nXtCyberNet I think that's fine perhaps we can leave it as is

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the issue is not significant. Event testing is just supplementary and not the primary objective; as long as we ensure that the event can be reported properly, that's enough. After reviewing so many E2E event-related cases, I now feel that testing events in UT should be more reliable.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reviewing so many E2E event-related cases, I now feel that testing events in UT should be more reliable.

Good point 👍

Comment on lines +107 to +110
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This event assertion can be satisfied by a stale TaintClusterSucceed event from a prior test run because it only checks event.Reason for a long-lived, pre-joined cluster name. Consider capturing a baseline (timestamp or existing event set) immediately before triggering the taint, and require the matched event to be newer than that baseline so the assertion is tied to this test’s action.

Copilot uses AI. Check for mistakes.
})

ginkgo.By(fmt.Sprintf("update cluster(%s) NetworkReady condition to true", targetClusterName), func() {
framework.UpdateClusterStatusCondition(karmadaClient, targetClusterName, metav1.Condition{
Type: "NetworkReady",
Expand All @@ -116,7 +123,6 @@ var _ = framework.SerialDescribe("Taint cluster with ClusterTaintPolicy", func()
})
})
})

})

ginkgo.Context("Multi conditions match", func() {
Expand Down Expand Up @@ -180,6 +186,12 @@ var _ = framework.SerialDescribe("Taint cluster with ClusterTaintPolicy", func()
})
})

ginkgo.By(fmt.Sprintf("wait for TaintClusterSucceed event on cluster(%s)", targetClusterName), func() {
framework.WaitEventFitWith(kubeClient, "default", targetClusterName, func(event corev1.Event) bool {
return event.Reason == events.EventReasonTaintClusterSucceed
})
Comment on lines +189 to +192
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: because the target cluster is reused across the suite, checking only event.Reason can match an old TaintClusterSucceed event and let this step pass even if the current operation didn’t emit a new event. Gate the match on a baseline (e.g., event timestamp after the condition update) to avoid false positives.

Copilot uses AI. Check for mistakes.
})

ginkgo.By(fmt.Sprintf("update cluster(%s) NetworkReady condition to true", targetClusterName), func() {
framework.UpdateClusterStatusCondition(karmadaClient, targetClusterName, metav1.Condition{
Type: "NetworkReady",
Expand Down Expand Up @@ -252,6 +264,12 @@ var _ = framework.SerialDescribe("Taint cluster with ClusterTaintPolicy", func()
})
})

ginkgo.By(fmt.Sprintf("wait for TaintClusterSucceed event on cluster(%s)", targetClusterName), func() {
framework.WaitEventFitWith(kubeClient, "default", targetClusterName, func(event corev1.Event) bool {
return event.Reason == events.EventReasonTaintClusterSucceed
})
Comment on lines +267 to +270
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue here: the assertion only checks for any TaintClusterSucceed event for targetClusterName, which can already exist from earlier contexts/tests. Record a baseline (time or existing events) before triggering the taint and require a new event after that baseline to ensure the test validates the current change.

Copilot uses AI. Check for mistakes.
})

ginkgo.By(fmt.Sprintf("update cluster(%s) NetworkReady condition to true", targetClusterName), func() {
framework.UpdateClusterStatusCondition(karmadaClient, targetClusterName, metav1.Condition{
Type: "NetworkReady",
Expand Down
39 changes: 39 additions & 0 deletions test/e2e/suites/base/karmadactl_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ import (

clusterv1alpha1 "github.com/karmada-io/karmada/pkg/apis/cluster/v1alpha1"
policyv1alpha1 "github.com/karmada-io/karmada/pkg/apis/policy/v1alpha1"
"github.com/karmada-io/karmada/pkg/events"
"github.com/karmada-io/karmada/pkg/karmadactl/cordon"
"github.com/karmada-io/karmada/pkg/karmadactl/options"
cmdutil "github.com/karmada-io/karmada/pkg/karmadactl/util"
Expand Down Expand Up @@ -211,6 +212,11 @@ var _ = ginkgo.Describe("Karmadactl promote testing", func() {
return err1 == nil && err2 == nil
}, pollTimeout, pollInterval).Should(gomega.Equal(true))
})
ginkgo.By(fmt.Sprintf("wait for SyncImpersonationConfigSucceed event on cluster(%s)", member1), func() {
framework.WaitEventFitWith(kubeClient, "default", member1, func(event corev1.Event) bool {
return event.Reason == events.EventReasonSyncImpersonationConfigSucceed
})
Comment on lines +216 to +218
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the issue in clustertaintpolicy_test.go, member1 is a shared cluster from the environment. Asserting on the event reason without checking for a recent timestamp or an increased count may result in a false positive if the event already exists from a previous operation or test run.

Comment on lines +215 to +218
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This event assertion can produce false positives because it only checks for the existence of any SyncImpersonationConfigSucceed event for member1, and that cluster is pre-existing (so older events from earlier setup or tests may already be present). Capture a baseline (e.g., current time or existing event RV/name set) before running the promote commands, and require the matched event to be newer than that baseline so the test proves the operation emitted the event.

Copilot uses AI. Check for mistakes.
})
})

})
Expand Down Expand Up @@ -331,6 +337,12 @@ var _ = framework.SerialDescribe("Karmadactl join/unjoin testing", ginkgo.Labels
_, err := cmd.ExecOrDie()
gomega.Expect(err).ShouldNot(gomega.HaveOccurred())
})

ginkgo.By(fmt.Sprintf("wait for CreateExecutionSpaceSucceed event on cluster(%s)", clusterName), func() {
framework.WaitEventFitWith(kubeClient, "default", clusterName, func(event corev1.Event) bool {
return event.Reason == events.EventReasonCreateExecutionSpaceSucceed
})
})
})

ginkgo.BeforeEach(func() {
Expand Down Expand Up @@ -424,19 +436,34 @@ var _ = framework.SerialDescribe("Karmadactl cordon/uncordon testing", ginkgo.La
_, err := cmd.ExecOrDie()
gomega.Expect(err).ShouldNot(gomega.HaveOccurred())
})

ginkgo.By(fmt.Sprintf("wait for CreateExecutionSpaceSucceed event on cluster(%s)", clusterName), func() {
framework.WaitEventFitWith(kubeClient, "default", clusterName, func(event corev1.Event) bool {
return event.Reason == events.EventReasonCreateExecutionSpaceSucceed
})
})

// When a newly joined cluster is unready at the beginning, the scheduler will ignore it.
ginkgo.By(fmt.Sprintf("wait cluster %s ready", clusterName), func() {
framework.WaitClusterFitWith(controlPlaneClient, clusterName, func(cluster *clusterv1alpha1.Cluster) bool {
return meta.IsStatusConditionPresentAndEqual(cluster.Status.Conditions, clusterv1alpha1.ClusterConditionReady, metav1.ConditionTrue)
})
})

ginkgo.DeferCleanup(func() {
ginkgo.By(fmt.Sprintf("Unjoinning cluster: %s", clusterName), func() {
cmd := framework.NewKarmadactlCommand(kubeconfig, karmadaContext, karmadactlPath, "", 5*options.DefaultKarmadactlCommandDuration,
"unjoin", "--cluster-kubeconfig", kubeConfigPath, "--cluster-context", clusterContext, "--cluster-namespace", "karmada-cluster", clusterName)
_, err := cmd.ExecOrDie()
gomega.Expect(err).ShouldNot(gomega.HaveOccurred())
})

ginkgo.By(fmt.Sprintf("wait for RemoveExecutionSpaceSucceed event on cluster(%s)", clusterName), func() {
framework.WaitEventFitWith(kubeClient, "default", clusterName, func(event corev1.Event) bool {
return event.Reason == events.EventReasonRemoveExecutionSpaceSucceed
})
})

ginkgo.By(fmt.Sprintf("Deleting clusters: %s", clusterName), func() {
err := deleteCluster(clusterName, kubeConfigPath)
gomega.Expect(err).ShouldNot(gomega.HaveOccurred())
Expand Down Expand Up @@ -1252,6 +1279,12 @@ var _ = framework.SerialDescribe("Karmadactl taint testing", ginkgo.Labels{NeedC
gomega.Expect(err).ShouldNot(gomega.HaveOccurred())
})

ginkgo.By(fmt.Sprintf("wait for CreateExecutionSpaceSucceed event on cluster(%s)", newClusterName), func() {
framework.WaitEventFitWith(kubeClient, "default", newClusterName, func(event corev1.Event) bool {
return event.Reason == events.EventReasonCreateExecutionSpaceSucceed
})
})

ginkgo.By(fmt.Sprintf("wait cluster %s ready", newClusterName), func() {
framework.WaitClusterFitWith(controlPlaneClient, newClusterName, func(cluster *clusterv1alpha1.Cluster) bool {
return meta.IsStatusConditionPresentAndEqual(cluster.Status.Conditions, clusterv1alpha1.ClusterConditionReady, metav1.ConditionTrue)
Expand All @@ -1266,6 +1299,12 @@ var _ = framework.SerialDescribe("Karmadactl taint testing", ginkgo.Labels{NeedC
gomega.Expect(err).ShouldNot(gomega.HaveOccurred())
})

ginkgo.By(fmt.Sprintf("wait for RemoveExecutionSpaceSucceed event on cluster(%s)", newClusterName), func() {
framework.WaitEventFitWith(kubeClient, "default", newClusterName, func(event corev1.Event) bool {
return event.Reason == events.EventReasonRemoveExecutionSpaceSucceed
})
})

ginkgo.By(fmt.Sprintf("Deleting clusters: %s", newClusterName), func() {
err := deleteCluster(newClusterName, kubeConfigPath)
gomega.Expect(err).ShouldNot(gomega.HaveOccurred())
Expand Down
Loading