executor: reduce TestDistSQLSharedKVRequestRace iterations to fix CI timeout by joechenrh · Pull Request #67675 · pingcap/tidb

joechenrh · 2026-04-10T03:05:27Z

What problem does this PR solve?

Issue Number: ref xxx

Problem Summary:
TestDistSQLSharedKVRequestRace frequently times out in CI (pull_unit_test_next_gen). With the race detector enabled, the test runs 5 replica-read modes × 20 iterations × 2 queries = 200 queries on a partitioned table, taking ~278s on CI — dangerously close to the 5-minute Bazel "moderate" timeout. This causes flaky timeouts (example).
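The arithmetic in the summary can be checked with a short Go sketch. The helper `totalQueries` and the constant names are illustrative assumptions; the numbers are the ones quoted above, with the 5-minute timeout written as 300 seconds.

```go
package main

import "fmt"

// totalQueries returns how many queries the test issues for a given number
// of replica-read modes, inner-loop iterations, and queries per iteration.
// It is a hypothetical helper used only for this illustration.
func totalQueries(modes, iterations, queriesPerIteration int) int {
	return modes * iterations * queriesPerIteration
}

func main() {
	const (
		modes               = 5   // replica-read modes, from the PR description
		iterations          = 20  // inner-loop iterations before this PR
		queriesPerIteration = 2   // queries run in each iteration
		observedSeconds     = 278 // approximate CI duration quoted above
		timeoutSeconds      = 300 // Bazel "moderate" timeout of 5 minutes
	)

	fmt.Println("queries:", totalQueries(modes, iterations, queriesPerIteration))
	fmt.Println("headroom in seconds:", timeoutSeconds-observedSeconds)
}
```

With the quoted figures, the run has about 22 seconds of headroom, so ordinary CI variance is enough to push it over the limit.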

What changed and how does it work?

Reduce the inner loop iterations from 20 to 5 (total queries: 200 → 50). The race detector reports a data race the first time the conflicting accesses execute, so detection does not depend on a high repetition count, and the RequestBuilder.used safety check (added in #61376) catches any builder-reuse regression even with a single iteration. Five iterations are more than sufficient for confidence.
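To illustrate why a "used" guard does not need many iterations, here is a minimal Go sketch of a single-use builder. The type `onceBuilder`, its `Build` method, and `errBuilderReused` are hypothetical stand-ins and are not TiDB's actual RequestBuilder code; the sketch only assumes that the real check works along similar lines.

```go
package main

import (
	"errors"
	"fmt"
)

// errBuilderReused is a hypothetical error value for this sketch.
var errBuilderReused = errors.New("request builder used more than once")

// onceBuilder is an illustrative stand-in, not TiDB's RequestBuilder.
type onceBuilder struct {
	used bool // set on the first Build call
}

// Build marks the builder as consumed and fails on every later call,
// so a reuse bug surfaces on the very first repeated call.
func (b *onceBuilder) Build() (string, error) {
	if b.used {
		return "", errBuilderReused
	}
	b.used = true
	return "request", nil
}

func main() {
	b := &onceBuilder{}
	_, first := b.Build()
	_, second := b.Build()
	fmt.Println("first call error:", first)
	fmt.Println("second call error:", second)
}
```

Because the guard fails on the second call regardless of scheduling, repeating the loop adds little for this particular class of regression.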

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

None

Summary by CodeRabbit

Tests
- Reduced iteration count in a distributed SQL test to improve test run performance while preserving the same SQL checks and validation logic.

…timeout

The test runs 5 replica-read modes × 20 iterations × 2 queries = 200 queries with the race detector enabled. On CI this takes ~278s, barely under the 5-minute Bazel "moderate" timeout, causing flaky timeouts.

Reduce iterations from 20 to 5. The race detector catches data races deterministically on first occurrence, and the RequestBuilder.used safety check catches any reuse regression even with a single iteration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

pantheon-ai · 2026-04-10T03:05:32Z

Review Complete

Findings: 0 issues
Posted: 0
Duplicates/Skipped: 0

ti-chi-bot · 2026-04-10T03:05:34Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign dveeden for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-04-10T03:05:52Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6914c4df-f924-4439-b86b-0e456ddf72ef

📥 Commits

Reviewing files that changed from the base of the PR and between e70180b and 39c3ec6.

📒 Files selected for processing (1)

pkg/executor/test/distsqltest/distsql_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/executor/test/distsqltest/distsql_test.go

📝 Walkthrough

Reduced the iteration count in TestDistSQLSharedKVRequestRace from 20 to 5; test SQL statements and assertions remain unchanged.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Test Optimization `pkg/executor/test/distsqltest/distsql_test.go` | Lowered inner loop iterations in `TestDistSQLSharedKVRequestRace` from 20 to 5, reducing total query executions across replica read modes while keeping queries and checks identical. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Suggested labels

ok-to-test, approved, lgtm

Suggested reviewers

solotzg
gengliqi
wjhuang2016

Poem

🐰 Five hops now where twenty sprang,
Quiet paws and nimble wing,
The queries still bloom, assertions stay,
Faster runs through fields of May,
A rabbit cheers—small change, big zing! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: reducing test iterations to fix a CI timeout issue. |
| Description check | ✅ Passed | The description comprehensively addresses the template requirements with clear problem statement, detailed solution explanation, and all checklist items properly addressed. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

Command failed

tiprow · 2026-04-10T03:10:12Z

Hi @joechenrh. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

pantheon-ai

✅ Code looks good. No issues found.

codecov · 2026-04-10T03:23:40Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.5996%. Comparing base (c2c1342) to head (39c3ec6).
⚠️ Report is 86 commits behind head on master.

Additional details and impacted files

@@               Coverage Diff                @@
##             master     #67675        +/-   ##
================================================
- Coverage   77.8173%   77.5996%   -0.2178%     
================================================
  Files          2023       1965        -58     
  Lines        556183     557000       +817     
================================================
- Hits         432807     432230       -577     
- Misses       121632     124739      +3107     
+ Partials       1744         31      -1713

| Flag | Coverage Δ | |
| --- | --- | --- |
| integration | `40.9370% <ø> (-7.1898%)` | ⬇️ |
| unit | `76.6496% <ø> (+0.2799%)` | ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Components | Coverage Δ | |
| --- | --- | --- |
| dumpling | `61.5065% <ø> (ø)` | |
| parser | `∅ <ø> (∅)` | |
| br | `50.0915% <ø> (-10.7722%)` | ⬇️ |

ingress-bot · 2026-04-10T03:26:12Z

🔍 Starting code review for this PR...

ingress-bot

This review was generated by AI and should be verified by a human reviewer.
Manual follow-up is recommended before merge.

Summary

Total findings: 3
Inline comments: 3
Summary-only findings (no inline anchor): 0

Findings (highest risk first)

🟡 [Minor] (3)

Race regression test lost replay coverage after loop-count reduction (pkg/executor/test/distsqltest/distsql_test.go:138)
Race reproducer coverage is reduced by cutting loop repetitions (pkg/executor/test/distsqltest/distsql_test.go:138)
Race-check loop bound changed without intent documentation (pkg/executor/test/distsqltest/distsql_test.go:138)

ingress-bot · 2026-04-10T03:32:44Z

pkg/executor/test/distsqltest/distsql_test.go

 	for _, mode := range replicaReadModes {
 		tk.MustExec(fmt.Sprintf("set session tidb_replica_read = '%s'", mode))
-		for i := 0; i < 20; i++ {
+		for i := 0; i < 5; i++ {


🟡 [Minor] Race regression test lost replay coverage after loop-count reduction

Impact
TestDistSQLSharedKVRequestRace now runs each replica-read mode 5 times instead of 20, reducing stress executions from 200 to 50 across the two query paths.
This reduces replay/retry sampling for a schedule-sensitive race regression, so repeated runs no longer provide the prior detection confidence.

Scope

pkg/executor/test/distsqltest/distsql_test.go:138 — TestDistSQLSharedKVRequestRace

Evidence
The changed loop bound is for i := 0; i < 5; i++, replacing the previous 20-iteration stress loop in TestDistSQLSharedKVRequestRace.
That loop wraps both force index(ic) and index-merge query checks for every tidb_replica_read mode, so each mode now executes only one quarter of the previous repetition count.

Change request
Restore the stress loop count to the previous level, or introduce an explicit deterministic stress knob with documented rationale for lower coverage.
Keep per-mode repeated executions high enough that race detection remains stable across reruns and scheduler variance.
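One possible shape for the "explicit stress knob" mentioned above is sketched here in Go. The environment variable name `DISTSQL_RACE_ITERATIONS`, the default of 5, and the helper `raceIterations` are assumptions made for illustration; none of them exist in the TiDB repository.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultRaceIterations is the fast-path loop bound used when no override is set.
const defaultRaceIterations = 5

// raceIterations parses raw as the loop bound. It falls back to the default
// when raw is empty, not an integer, or smaller than 1.
func raceIterations(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 {
		return defaultRaceIterations
	}
	return n
}

func main() {
	// DISTSQL_RACE_ITERATIONS is a hypothetical variable name.
	n := raceIterations(os.Getenv("DISTSQL_RACE_ITERATIONS"))
	fmt.Println("iterations per replica-read mode:", n)
}
```

With a knob like this, the default CI run could stay short while a dedicated stress job sets a higher count, although whether that trade-off is wanted here is for the maintainers to decide.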

ingress-bot · 2026-04-10T03:32:44Z

pkg/executor/test/distsqltest/distsql_test.go

 	for _, mode := range replicaReadModes {
 		tk.MustExec(fmt.Sprintf("set session tidb_replica_read = '%s'", mode))
-		for i := 0; i < 20; i++ {
+		for i := 0; i < 5; i++ {


🟡 [Minor] Race reproducer coverage is reduced by cutting loop repetitions

Impact
TestDistSQLSharedKVRequestRace is the regression guard for shared kv.Request race behavior from issue 60175, and this patch cuts repeated executions from 20 to 5 per replica-read mode.
The reduced repetition lowers scheduler interleaving coverage, allowing low-frequency race regressions to pass this guard.

Scope

pkg/executor/test/distsqltest/distsql_test.go:138 — TestDistSQLSharedKVRequestRace

Evidence
In TestDistSQLSharedKVRequestRace, the inner loop now runs 5 iterations instead of 20 while executing the same two query paths each round.
The function comment marks this test as the regression check for https://github.com/pingcap/tidb/issues/60175, so reducing only the repetition count removes stress coverage without adding a deterministic trigger.

Change request
Restore the previous repetition budget or replace it with a deterministic concurrency trigger that guarantees the race window is exercised on every run.
If test time is the concern, keep a fast-path count here only with an additional stress variant that preserves equivalent race-detection strength in CI.

ingress-bot · 2026-04-10T03:32:44Z

pkg/executor/test/distsqltest/distsql_test.go

 	for _, mode := range replicaReadModes {
 		tk.MustExec(fmt.Sprintf("set session tidb_replica_read = '%s'", mode))
-		for i := 0; i < 20; i++ {
+		for i := 0; i < 5; i++ {


🟡 [Minor] Race-check loop bound changed without intent documentation

Impact
TestDistSQLSharedKVRequestRace is explicitly tied to issue 60175, but the new 5 iteration bound is an unexplained magic number in a stress-style test.
Without rationale for why this bound is sufficient, later edits can keep shrinking or reshaping the probe while the test name still implies strong race-coverage intent.

Scope

pkg/executor/test/distsqltest/distsql_test.go:138 — TestDistSQLSharedKVRequestRace

Evidence
The diff changes the inner loop from for i := 0; i < 20; i++ to for i := 0; i < 5; i++ at line 138.
The nearby comments only label query forms (index lookup and index merge) and do not document the expected repetition invariant or the tradeoff behind the new bound.

Change request
Introduce an intent-revealing constant name for this loop bound and add a short comment explaining why the chosen count is sufficient for the 60175 regression guard.
Document the accepted coverage/performance tradeoff at this line so future maintainers can adjust it without guesswork.
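A minimal sketch of the named-constant suggestion follows. The constant `raceCheckIterations`, the helper `runChecks`, and the mode strings are hypothetical placeholders, not the identifiers or values used in `distsql_test.go`; the comment wording is only an example of the kind of rationale being requested.

```go
package main

import "fmt"

// raceCheckIterations is the number of times each replica-read mode is
// exercised. Hypothetical rationale for this sketch: the loop only needs to
// execute the shared-request path a few times, and a larger count mainly
// increases run time under the race detector.
const raceCheckIterations = 5

// runChecks calls check once per (mode, iteration) pair and returns the
// total number of calls made.
func runChecks(modes []string, check func(mode string)) int {
	calls := 0
	for _, mode := range modes {
		for i := 0; i < raceCheckIterations; i++ {
			check(mode)
			calls++
		}
	}
	return calls
}

func main() {
	// Placeholder mode names; the real test iterates over replicaReadModes.
	modes := []string{"mode-a", "mode-b", "mode-c", "mode-d", "mode-e"}
	fmt.Println("check calls:", runChecks(modes, func(string) {}))
}
```

Naming the bound keeps the rationale next to the value, so a later change to the count has to touch the comment that justifies it.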

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ti-chi-bot · 2026-04-10T05:05:28Z

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456.

📖 For more info, you can check the "Contribute Code" section in the development guide.

ti-chi-bot · 2026-04-10T05:21:53Z

@joechenrh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-integration-realcluster-test-next-gen | `39c3ec6` | link | true | `/test pull-integration-realcluster-test-next-gen` |

Full PR test history. Your PR dashboard.

ti-chi-bot bot added the release-note-none Denotes a PR that doesn't merit a release note. label Apr 10, 2026

ti-chi-bot bot added needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. labels Apr 10, 2026

ti-chi-bot bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Apr 10, 2026

ti-chi-bot bot added the do-not-merge/needs-linked-issue label Apr 10, 2026

pantheon-ai bot reviewed Apr 10, 2026

ingress-bot reviewed Apr 10, 2026

joechenrh removed needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. labels Apr 10, 2026

joechenrh and others added 2 commits April 10, 2026 12:44

executor: add comment explaining iteration count rationale

379949c

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

executor: use for range idiom

39c3ec6

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
