ci: run integration tests in 2 batches by olebel · Pull Request #391 · akash-network/provider

olebel · 2026-04-16T03:48:48Z

Summary

Split E2E integration tests into two parallel batches to reduce overall CI wall time.

Previously, all 12 test suites ran sequentially in a single TestIntegrationTestSuite function. Each suite performs a full stack setup (in-process testnet, provider, operators) and teardown independently, with no ordering dependencies between suites — making them safe to split.

Changes

integration/e2e_test.go — added TestIntegrationBatch1 and TestIntegrationBatch2 alongside the existing TestIntegrationTestSuite; removed duplicate E2EPersistentStorageDefault entry that was previously run twice
- Batch 1:
  - E2EContainerToContainer
  - E2EAppNodePort
  - E2EDeploymentUpdate
  - E2EApp
  - E2EMigrateHostname
- Batch 2:
  - E2EPersistentStorageDefault
  - E2EPersistentStorageBeta2
  - E2EPersistentStorageDeploymentUpdate
  - E2EStorageClassRam
  - E2ECustomCurrency
  - E2EIPAddress
make/test-integration.mk — added test-e2e-integration-batch1 and test-e2e-integration-batch2 targets
.github/workflows/integration-tests.yaml — added e2e_target input (default: test-e2e-integration) to make the reusable workflow target-parametric
.github/workflows/tests.yaml — replaced single integration-tests job with two parallel jobs (integration-tests-batch1, integration-tests-batch2), each passing the respective make target; both are required for dispatch-release

Expected impact

Each parallel job runs its own Kind cluster (full isolation, no shared k8s state). The goal is to reduce overall CI time for the integration tests.

coderabbitai · 2026-04-16T03:48:59Z

Walkthrough

The pull request splits E2E integration tests into two parallel batches by introducing a configurable workflow input parameter, creating separate GitHub Actions jobs for each batch, adding corresponding Go test functions, and defining new Make targets to orchestrate the batch executions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| GitHub Actions Workflow Configuration `.github/workflows/integration-tests.yaml`, `.github/workflows/tests.yaml` | Made the reusable integration test workflow parameterizable with an `e2e_target` input. Split the single `integration-tests` job into two separate batch jobs (`integration-tests-batch1`, `integration-tests-batch2`), each passing a distinct Make target to the reusable workflow. Updated job dependencies in `dispatch-release` to await both batch jobs. |
| E2E Test Suite `integration/e2e_test.go` | Added `TestIntegrationBatch1` and `TestIntegrationBatch2` functions that partition existing E2E test suites into two subsets: Batch1 runs container-to-container, node port, deployment update, app, and hostname migration tests; Batch2 runs persistent storage, RAM storage class, custom currency, and IP address tests. |
| Build Targets `make/test-integration.mk` | Added `.PHONY` targets `test-e2e-integration-batch1` and `test-e2e-integration-batch2`, each invoking Go tests with the `e2e` build tag, `$(INTEGRATION_VARS)`, `$(KIND_VARS)`, a 1500-second timeout, and targeting the respective batch test functions. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 The tests hop into two swift lanes,
Batches racing through their domains,
Workflows reused with inputs so fine,
Suites divided, efficiency divine,
Parallel glory in the CI-CD vine! 🚀

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'ci: run integration tests in 2 batches' directly and clearly summarizes the main change: splitting E2E integration tests into parallel batches to reduce CI wall time. |
| Description check | ✅ Passed | The description provides comprehensive context about why the change is needed, how tests are split, lists specific test suites in each batch, and explains the expected impact on CI performance. |


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/integration-tests.yaml:
- Line 153: The workflow runs user input unsafely via the line containing make
${{ inputs.e2e_target }}, so validate and quote the e2e_target input before
executing: constrain inputs.e2e_target to an allowlist or a strict regex (e.g.,
only letters/numbers/[-_]) and then invoke make with the validated, quoted value
(use the inputs.e2e_target identifier) so the shell executes a single safe token
instead of allowing command injection.

In @.github/workflows/tests.yaml:
- Around line 107-116: The reusable workflow hardcodes KIND_NAME causing
parallel batch collision; add a new input named kind_name to the reusable
workflow (default "kube") and update the crd-e2e job to use this input instead
of the hardcoded KIND_NAME, then pass unique values from the callers
(integration-tests-batch1 and integration-tests-batch2) via the with: block
(e.g., kind_name: kube-batch1 and kind_name: kube-batch2) so each batch uses a
distinct Kind cluster name.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6bcd0b81-c830-4352-be1f-e3396f76e514

📥 Commits

Reviewing files that changed from the base of the PR and between a785772 and 22e45dd.

📒 Files selected for processing (4)

.github/workflows/integration-tests.yaml
.github/workflows/tests.yaml
integration/e2e_test.go
make/test-integration.mk

coderabbitai · 2026-04-16T03:55:16Z

        working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
        run: |
-          make test-e2e-integration
+          make ${{ inputs.e2e_target }}


⚠️ Potential issue | 🟠 Major

Validate and quote e2e_target before shell execution.

Line [153] executes a raw workflow input in bash. That allows command injection if an unexpected value is passed via workflow_call.

🔒 Proposed fix

-      - name: Run E2E Tests
-        working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
-        run: |
-          make ${{ inputs.e2e_target }}
+      - name: Run E2E Tests
+        working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
+        run: |
+          target='${{ inputs.e2e_target }}'
+          case "$target" in
+            test-e2e-integration|test-e2e-integration-batch1|test-e2e-integration-batch2) ;;
+            *) echo "invalid e2e_target: $target" >&2; exit 1 ;;
+          esac
+          make "$target"


coderabbitai · 2026-04-16T03:55:16Z

+  integration-tests-batch1:
    uses: ./.github/workflows/integration-tests.yaml
+    with:
+      e2e_target: test-e2e-integration-batch1
+
+  integration-tests-batch2:
+    uses: ./.github/workflows/integration-tests.yaml
+    with:
+      e2e_target: test-e2e-integration-batch2



⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

    # Find and inspect the integration-tests.yaml workflow
    find .github -name "*.yaml" -o -name "*.yml" | head -20

Repository: akash-network/provider

Length of output: 314

🏁 Script executed:

    # Check if integration-tests.yaml exists and read relevant sections
    if [ -f ".github/workflows/integration-tests.yaml" ]; then
      echo "=== integration-tests.yaml exists ==="
      wc -l ".github/workflows/integration-tests.yaml"
      echo ""
      echo "=== Lines around 24 (KIND_NAME) ==="
      sed -n '15,35p' ".github/workflows/integration-tests.yaml"
      echo ""
      echo "=== Inputs section ==="
      rg -A 20 "^inputs:" ".github/workflows/integration-tests.yaml" | head -40
    fi

Repository: akash-network/provider

Length of output: 678

🏁 Script executed:

    # Check the tests.yaml file for how it calls the reusable workflow
    if [ -f ".github/workflows/tests.yaml" ]; then
      echo "=== Lines 107-116 from tests.yaml ==="
      sed -n '107,116p' ".github/workflows/tests.yaml"
      echo ""
      echo "=== Full context of integration-tests calls ==="
      sed -n '100,140p' ".github/workflows/tests.yaml"
    fi

Repository: akash-network/provider

Length of output: 1395

🏁 Script executed:

    # Search for any kind_name or KIND_NAME usage across workflows
    echo "=== Searching for kind_name or KIND_NAME in workflows ==="
    rg -i "kind.?name" .github/workflows/ -A 2 -B 2

Repository: akash-network/provider

Length of output: 904

🏁 Script executed:

    # Get full structure of integration-tests.yaml to understand jobs
    echo "=== Full integration-tests.yaml with job names ==="
    cat -n ".github/workflows/integration-tests.yaml" | grep -E "^\s*[0-9]+\s+(jobs:|[a-z-]+:)" | head -40

Repository: akash-network/provider

Length of output: 1633

🏁 Script executed:

    # Get the inputs section and full job list
    echo "=== Integration-tests.yaml - inputs and jobs section ==="
    sed -n '1,80p' ".github/workflows/integration-tests.yaml" | cat -n

Repository: akash-network/provider

Length of output: 3486

🏁 Script executed:

    # Search for all job definitions and their conditions in integration-tests.yaml
    echo "=== All jobs in integration-tests.yaml ==="
    rg "^ [a-z-]+:" ".github/workflows/integration-tests.yaml" -A 5 | head -80

Repository: akash-network/provider

Length of output: 158

🏁 Script executed:

    # Get complete job names from integration-tests.yaml
    echo "=== All job definitions in integration-tests.yaml ==="
    rg "^ [a-z0-9-]+:$" ".github/workflows/integration-tests.yaml"

Repository: akash-network/provider

Length of output: 137

🏁 Script executed:

    # Check the full file to see structure
    echo "=== Line count and structure ==="
    wc -l ".github/workflows/integration-tests.yaml"
    echo ""
    echo "=== Section headers (jobs, needs, etc) ==="
    grep -n "^jobs:\|^ [a-z-]*:\|^ if:\|^ needs:" ".github/workflows/integration-tests.yaml" | head -50

Repository: akash-network/provider

Length of output: 206

🏁 Script executed:

    # Get the lines where e2e_target is used
    echo "=== Where e2e_target is used in integration-tests.yaml ==="
    rg -n "e2e_target" ".github/workflows/integration-tests.yaml" -B 2 -A 2

Repository: akash-network/provider

Length of output: 481

🏁 Script executed:

    # Check if kind_name input exists in integration-tests.yaml
    echo "=== Check inputs section of integration-tests.yaml ==="
    sed -n '5,15p' ".github/workflows/integration-tests.yaml"
    echo ""
    echo "=== Search for any kind_name or cluster isolation in integration-tests.yaml ==="
    rg -i "kind.name|cluster.name" ".github/workflows/integration-tests.yaml" -B 3 -A 3

Repository: akash-network/provider

Length of output: 934

🏁 Script executed:

    # Check if there's any unique naming strategy or runner isolation config
    echo "=== Check for runner labels or self-hosted config ==="
    rg "runs-on:|self-hosted" ".github/workflows/" -B 2 -A 2 | head -60

Repository: akash-network/provider

Length of output: 2545

🏁 Script executed:

    # Check if github.run_id or other unique vars are used anywhere for isolation
    echo "=== Check for unique run identifiers used in Kind setup ==="
    rg "github\.(run_id|run_number|job)" ".github/workflows/integration-tests.yaml" -B 2 -A 2

Repository: akash-network/provider

Length of output: 125

🏁 Script executed:

    # Final confirmation: kind_name does not exist as input
    echo "=== All inputs in integration-tests.yaml ==="
    sed -n '5,13p' ".github/workflows/integration-tests.yaml"

Repository: akash-network/provider

Length of output: 296

Parallel batches collide on hardcoded Kind cluster name.

Both batch jobs invoke crd-e2e from the same reusable workflow in parallel, and KIND_NAME: kube is hardcoded (.github/workflows/integration-tests.yaml, line 24). On core-e2e self-hosted runners sharing a Docker daemon, concurrent jobs will conflict on the same cluster name, causing flaky test failures.

Add a kind_name input to the reusable workflow and pass unique values per batch (e.g., kube-batch1, kube-batch2) to isolate cluster namespaces:

Suggested mitigation

In .github/workflows/integration-tests.yaml, add to inputs:

    kind_name:
      description: 'Kind cluster name'
      required: false
      default: 'kube'
      type: string

Then update the crd-e2e job to use it instead of the hardcoded value, and in .github/workflows/tests.yaml, pass unique names:

    integration-tests-batch1:
      uses: ./.github/workflows/integration-tests.yaml
      with:
        e2e_target: test-e2e-integration-batch1
        kind_name: kube-batch1

    integration-tests-batch2:
      uses: ./.github/workflows/integration-tests.yaml
      with:
        e2e_target: test-e2e-integration-batch2
        kind_name: kube-batch2


olebel · 2026-04-16T13:43:55Z

At this point, the longest job (Integration tests batch) consists of two parts:

- ~12 minutes setting up the environment, of which ~8 minutes go to configuring the Kind cluster
- ~10 minutes running the actual tests

k8s configuration breakdown:

| Phase | Duration | % |
| --- | --- | --- |
| Go binary compilation (goreleaser, 4 variants) | ~3m 27s | 40% |
| Docker image building (4 images) | ~2m 40s | 31% |
| Ingress-nginx rollout wait | ~57s | 11% |
| goreleaser apt-get inside container | ~40s | 8% |
| wasmvm downloads | ~29s | 6% |
| Node image pull (`ghcr.io/akash-network/node`) | ~9s | 1% |
| CRDs, MetalLB, kustomize applies | ~7s | 1% |

Total: 8m 34s. 71% of the time is spent on Go binary compilation and Docker image building, with the remaining 29% split across cluster setup, dependency downloads, and rollout waits.

olebel · 2026-04-16T15:17:06Z

Other notes:

- Some build jobs execute as root. They produce files that can't be overwritten by the runner user, so we keep cleaning up the workspace each time, which eliminates any potential cache on self-hosted runners.
- The build job keeps downloading artifacts along the way: Go cache, Go dependencies, wasmvm libs. Each download adds 10-60 seconds whenever it occurs and contributes significantly to total build time at this point.
- Jobs build provider binaries and Docker images independently. Moving that into a parent job with caching will not save time, since the jobs run in parallel. It still doesn't feel right.
- Dozens of makefiles and their cross-usage in CI make it hard to optimize the build.

ci: run integration tests in 2 batches

22e45dd

olebel requested a review from a team as a code owner April 16, 2026 03:48

coderabbitai Bot reviewed Apr 16, 2026


olebel wants to merge 1 commit into main from ob/test
