Skip to content

br/pkg/streamhelper: stabilize flaky TestBlocked#67866

Open
flaky-claw wants to merge 1 commit intopingcap:masterfrom
flaky-claw:flakyfixer/case_6cf876adf39e-a6
Open

br/pkg/streamhelper: stabilize flaky TestBlocked#67866
flaky-claw wants to merge 1 commit intopingcap:masterfrom
flaky-claw:flakyfixer/case_6cf876adf39e-a6

Conversation

@flaky-claw
Copy link
Copy Markdown
Contributor

@flaky-claw flaky-claw commented Apr 17, 2026

What problem does this PR solve?

Issue Number: close #67800

Problem Summary:
Flaky test TestBlocked in br/pkg/streamhelper intermittently fails, so this PR stabilizes that path.

What changed and how does it work?

Root Cause

TestBlocked relied on a timing race-prone assumption; timeout could fire before any blocked checkpoint RPC was actually pinned, so OnTick sometimes returned nil.

Fix

Synchronizing on first blocked-hook entry and using a manageable tick timeout removes the race while keeping the same blocked-RPC timeout assertion.

Verification

Spec:

  • target: br/pkg/streamhelper :: TestBlocked
  • strategy: tidb.go_flaky.default
  • plan mode: BASELINE_ONLY
  • requirements: required case must execute; no skip; repeat count = 1
  • baseline gates: required_flaky_gate, build_safety_gate, intent_guard_gate

Observed result:

  • status: passed
  • required case executed: yes
  • submission decision: ALLOWED
  • scope debt present: yes

Gate checklist:

  • Required flaky gate: PASS
  • Build safety gate: PASS
  • Intent guard gate: PASS
  • Repo-wide advisory gate: SKIPPED
  • Feedback specific gate: SKIPPED

Commands:

  • go test -json ./br/pkg/streamhelper -run '^TestBlocked$' -count=1
  • go test -json ./br/pkg/streamhelper -count=1
  • make build

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Fixes #67800

Summary by CodeRabbit

Release Notes

No user-facing changes in this release. This update includes internal testing infrastructure improvements to enhance test reliability and diagnostics.

@ti-chi-bot ti-chi-bot Bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed labels Apr 17, 2026
@pantheon-ai
Copy link
Copy Markdown

pantheon-ai Bot commented Apr 17, 2026

@flaky-claw I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

@ti-chi-bot ti-chi-bot Bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Apr 17, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 17, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign 3pointer for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 17, 2026

Hi @flaky-claw. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 54ef9829-2c58-4ce9-b6b3-a753f1a839dc

📥 Commits

Reviewing files that changed from the base of the PR and between e3f45e4 and 15feb7d.

📒 Files selected for processing (1)
  • br/pkg/streamhelper/advancer_test.go

📝 Walkthrough

Walkthrough

This pull request addresses a flaky test in TestBlocked by introducing deterministic control over blocking behavior through a managed channel (blockReq), adding synchronization primitives (firstBlocked/sync.Once), and increasing timeout thresholds to allow proper observation of blocked RPC behavior before assertion.

Changes

Cohort / File(s) Summary
Test Flakiness Fix
br/pkg/streamhelper/advancer_test.go
Replaced nil channel receive with controlled blockReq channel to deterministically block region-checkpoint RPCs. Added firstBlocked/sync.Once synchronization to wait for blocking to start. Moved tick invocation to buffered goroutine with errCh. Increased tick timeout from 10ms to 100ms and adjusted time bounds to prevent premature timeout.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested labels

contribution, size/M, ok-to-test, component/test, component/br, approved, lgtm

Suggested reviewers

  • YuJuncen
  • Leavrth

Poem

🐰 A flaky test once caused us despair,
With timing too tight and conditions unfair,
But now with blockReq and sync.Once in place,
The blocking behavior flows at a steadier pace,
Deterministic and calm, no more midnight scares! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'br/pkg/streamhelper: stabilize flaky TestBlocked' accurately and concisely describes the main change—fixing test flakiness by stabilizing the TestBlocked test.
Description check ✅ Passed The PR description includes the required structure: issue number (close #67800), problem summary explaining the root cause, detailed explanation of changes and verification results, and properly filled checklist with Unit test selected and release note set to None.
Linked Issues check ✅ Passed The PR directly addresses issue #67800 by eliminating the timing race in TestBlocked through synchronization on the first blocked-hook entry and manageable tick timeout, meeting all coding objectives from the linked issue.
Out of Scope Changes check ✅ Passed All changes are scoped to the test file br/pkg/streamhelper/advancer_test.go with modifications focused solely on stabilizing TestBlocked; no unrelated changes or out-of-scope modifications are present.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.6420%. Comparing base (e3f45e4) to head (15feb7d).
⚠️ Report is 71 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #67866        +/-   ##
================================================
+ Coverage   77.7969%   78.6420%   +0.8450%     
================================================
  Files          1983       1984         +1     
  Lines        548948     549103       +155     
================================================
+ Hits         427065     431826      +4761     
+ Misses       120962     116257      -4705     
- Partials        921       1020        +99     
Flag Coverage Δ
integration 44.2894% <ø> (+4.4922%) ⬆️
unit 76.6541% <ø> (+0.3045%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 61.5065% <ø> (ø)
parser ∅ <ø> (∅)
br 66.0459% <ø> (+2.9343%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@yinsustart
Copy link
Copy Markdown

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 28, 2026

@yinsustart: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@yinsustart
Copy link
Copy Markdown

/check-issue-triage-complete

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky test: TestBlocked in br/pkg/streamhelper

2 participants