feat: support nightly spec tests by brech1 · Pull Request #9221 · ChainSafe/lodestar

brech1 · 2026-04-15T23:55:40Z

Motivation

The ethereum/consensus-specs CI generates spec test artifacts daily via tests.yml, without requiring a release. This also works for forks and feature branches.

Description

Extends pnpm download-spec-tests with optional arguments to download from workflow artifacts:

pnpm download-spec-tests                                          # old behavior
pnpm download-spec-tests latest                                   # latest available run
pnpm download-spec-tests 2026-04-14                               # latest run from that date
pnpm download-spec-tests latest brech1/consensus-specs            # fork
pnpm download-spec-tests latest brech1/consensus-specs feat/x     # fork + branch

Since these are not releases, the artifacts must be downloaded through the GitHub API, this requires a GITHUB_TOKEN, it can be set in the local .env file.

Checks

I checked locally with an experimental branch:

https://github.com/brech1/consensus-specs/actions/runs/24481515544/

pnpm download-spec-tests latest brech1/consensus-specs test/payload-att-message

gemini-code-assist

Code Review

This pull request introduces the ability to download nightly spec tests from GitHub Actions artifacts, adding a new utility downloadNightlyTests and updating the beacon node test suite to support this via CLI arguments. The review feedback identifies a critical bug where the GitHub Artifacts API returns ZIP files while the extraction logic expects TAR format. Additionally, it is recommended to await asynchronous download operations to ensure predictable execution and to use URLSearchParams for more robust API query construction.

gemini-code-assist · 2026-04-15T23:57:51Z

+  for (const test of opts.testsToDownload) {
+    const artifact = artifacts.find((a) => a.name === `${test}.tar.gz` && !a.expired);
+    if (artifact) {
+      urlByTest[test] = artifact.archive_download_url;


The archive_download_url from the GitHub Artifacts API returns a ZIP archive containing the artifact, not the raw file itself. However, downloadGenericSpecTests (specifically at line 98 of downloadTests.ts) uses tar -xzf to extract the downloaded file. This will fail because tar (especially GNU tar used in Linux environments) cannot handle the ZIP format. You will need to unzip the archive first to retrieve the .tar.gz file or update the extraction logic to support ZIP archives.

This is incorrect. The archive: false line on the upload-artifacts job on the tests workflow submits the raw archive (already .tar.gz) to prevent double compression.

https://github.com/ethereum/consensus-specs/blob/ab1f6a9040a6a1e74e1a31b6151de99d12dc51df/.github/workflows/tests.yml#L194

gemini-code-assist · 2026-04-15T23:57:51Z

+  downloadTests(blsSpecTests, console.log).catch(onError);
+  downloadNightlyTests(opts, console.log, nightlyArg).catch(onError);


These asynchronous calls are not awaited, which means the script will continue execution and potentially finish the main block while the downloads are still in progress. While the Node.js process will stay alive until the promises settle, it is better practice to explicitly await them to ensure predictable execution and proper error handling.

await Promise.all([ downloadTests(blsSpecTests, console.log), downloadNightlyTests(opts, console.log, nightlyArg), ]).catch(onError);

Implemented

gemini-code-assist · 2026-04-15T23:57:51Z

+  const query = "status=success&per_page=1" + (branch ? `&branch=${branch}` : "") + (date ? `&created=${date}` : "");
+  const {workflow_runs} = await ghApiFetch<WorkflowRunsResponse>(
+    `/repos/${repo}/actions/workflows/tests.yml/runs?${query}`,
+    token
+  );


Using string concatenation to build query parameters can lead to incorrect URLs if variables like branch contain special characters (e.g., a forward slash in a branch name like feat/x). Using URLSearchParams ensures that all parameters are correctly encoded.

Suggested change

const query = "status=success&per_page=1" + (branch ? `&branch=${branch}` : "") + (date ? `&created=${date}` : "");

const {workflow_runs} = await ghApiFetch<WorkflowRunsResponse>(

`/repos/${repo}/actions/workflows/tests.yml/runs?${query}`,

token

);

const params = new URLSearchParams({status: "success", per_page: "1"});

if (branch) params.append("branch", branch);

if (date) params.append("created", date);

const {workflow_runs} = await ghApiFetch<WorkflowRunsResponse>(

`/repos/${repo}/actions/workflows/tests.yml/runs?${params.toString()}`,

token

);

Implemented

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5780392587

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-04-16T00:03:36Z

+      log(`Skipping ${test} (not found in run ${runId})`);
+    }
+  }
+
+  if (available.length === 0) throw new Error(`No matching artifacts found in run ${runId}`);


Fail when expected nightly artifacts are missing

This code only errors when zero artifacts are found, but it silently proceeds when a subset is missing (for example, a workflow run that did not upload mainnet.tar.gz). In that case downloadGenericSpecTests still writes version.txt for nightly-<runId>, so subsequent runs treat the incomplete dataset as fully cached and do not re-download the missing presets. That leaves spec test directories partially populated and can cause downstream spec runs to fail in non-obvious ways.

Useful? React with 👍 / 👎.

This is expected behavior

nflaig · 2026-04-23T20:19:30Z

@lodekeeper can you please review this PR and see why CI is failing, also test this workflow on your lodekeeper/lodestar fork and see if it works as expected and if we are passing the nightly spec tests

lodekeeper · 2026-04-23T21:52:56Z

Ran through this on my lodekeeper/lodestar fork — CI failure is fully explained by an outdated lockfile, and the nightly-download path works end-to-end. There is one downstream surprise from the spec master that's worth capturing as a follow-up, but it's not a PR-9221 regression.

1. CI failure on #9221 — outdated `pnpm-lock.yaml`

Every failing job dies in the same spot:

ERR_PNPM_OUTDATED_LOCKFILE  Cannot install with "frozen-lockfile" because pnpm-lock.yaml is not up to date with <ROOT>/packages/beacon-node/package.json
Failure reason: specifiers in the lockfile don't match specifiers in package.json:
* 1 dependencies were added: dotenv@^16.4.5

dotenv@^16.4.5 was added to packages/beacon-node/package.json (used by test/spec/downloadTests.ts) but pnpm-lock.yaml wasn't regenerated. Running pnpm install --lockfile-only adds exactly 3 lines. After that, pnpm build + local type checks succeed.

Staged fix branch on my fork: lodekeeper#20 (single commit b4c68811). Happy to cherry-pick onto this PR, or just pnpm install && git add pnpm-lock.yaml && git commit locally — either works.

2. Nightly-download path works end-to-end

Ran pnpm download-spec-tests latest locally against ethereum/consensus-specs on this PR branch. It:

hit /repos/ethereum/consensus-specs/actions/workflows/tests.yml/runs?status=success&per_page=1, resolved to run 24809971298,
fetched general.tar.gz (169 MB), mainnet.tar.gz (877 MB), minimal.tar.gz (515 MB) from archive_download_url,
extracted all three cleanly into spec-tests/tests/{general,mainnet,minimal}/.

On Gemini's concern about archive_download_url returning a ZIP: brech1's reply is right in practice — with archive: false on the upload-artifact step on ethereum/consensus-specs side, the raw .tar.gz comes through and tar -xzf extracts fine. Verified empirically.

3. Nightly spec tests pass — except one new pre-existing gap

Ran against the nightly-downloaded tests on this branch:

pnpm test:spec:general — 2437 / 2437 passing.
pnpm test:spec:minimal — 55355 passing, 95 skipped, 1 failed.

The single failure is Error: Unknown sanity test epoch_boundary in test/spec/presets/sanity.test.ts:28. Nightly ships minimal/gloas/sanity/epoch_boundary (and heze/sanity/epoch_boundary), which come from consensus-specs #5127 "Add epoch boundary tests for gloas" on master. Lodestar's sanity runner switches on testName and only knows slots / blocks, so the new category throws.

This is exactly the kind of drift the nightly-tests feature is designed to surface — not a regression from this PR. A tiny case "epoch_boundary": addition (presumably a thin wrapper around the existing sanityBlocks flow, since epoch-boundary cases look like a constrained variant of blocks) would close it out.

Summary

Unblock feat: support nightly spec tests #9221 with the lockfile update (or pull my fork-PR-20 commit).
Nightly download + spec-test execution works as designed on the fork.
Filing/fixing the epoch_boundary runner case is an orthogonal follow-up.

nflaig · 2026-04-23T21:57:11Z

@lodekeeper can you please open a PR against this one and fix everything you found, so we don't have to bother @brech1 with that stuff?

Upstream consensus-specs ChainSafe#5127 adds epoch_boundary sanity tests for Gloas (and post-Gloas `heze`). The case format is identical to blocks (pre / post / blocks_N / meta.blocks_count), so route it through the existing sanityBlocks runner. Surfaced by pnpm test:spec:minimal against nightly artifacts downloaded via the new nightly workflow in ChainSafe#9221. 🤖 Generated with AI assistance

lodekeeper · 2026-04-23T22:20:21Z

@nflaig — opened brech1#1 against this branch with the two fixes:

pnpm-lock.yaml regenerated for dotenv@^16.4.5 (fixes the Build ERR_PNPM_OUTDATED_LOCKFILE).
test/spec/presets/sanity.test.ts — dispatch epoch_boundary through the existing sanityBlocks runner (fixture format matches blocks; new suite from consensus-specs#5127).

After both fixes applied locally: pnpm test:spec:minimal → 56128 passed, 5 failed, 95 skipped. The 5 failures are gloas/sanity/epoch_boundary/pyspec_tests/epoch_boundary_full_parent_* (+ switch_to_compounding_across_epoch_boundary) — all hitting Block parent root ... does not match state latest block at processBlockHeader.ts:34. These are pre-existing Lodestar gaps surfaced by the new gloas test fixtures, not regressions from this PR. I'll investigate those separately once the nightly workflow is in.

nflaig · 2026-04-23T23:47:39Z

After both fixes applied locally: pnpm test:spec:minimal → 56128 passed, 5 failed, 95 skipped. The 5 failures are gloas/sanity/epoch_boundary/pyspec_tests/epoch_boundary_full_parent_* (+ switch_to_compounding_across_epoch_boundary) — all hitting Block parent root ... does not match state latest block at processBlockHeader.ts:34. These are pre-existing Lodestar gaps surfaced by the new gloas test fixtures, not regressions from this PR. I'll investigate those separately once the nightly workflow is in.

@lodekeeper what do you mean pre-existing gaps in lodestar? are you sure, maybe the tests are faulty? please investigate further

lodekeeper · 2026-04-23T23:56:14Z

@nflaig — you were right to push back. I was wrong to call this a Lodestar gap. After tracing the failures, it's a test-generator bug in the new test_epoch_boundary.py in consensus-specs#5127.

Mechanism, concretely:

All 5 failing tests call set_parent_block_full(spec, state) after yield "pre", state — i.e. between signed_block_1 = state_transition_and_sign_block(...) and block_2 = _build_....
set_parent_block_full mutates state.latest_block_hash (and state.latest_execution_payload_bid.block_hash), both of which are part of hash_tree_root(state).
The very next process_slot snapshots hash_tree_root(state) into state.latest_block_header.state_root, so the mutation flows into latest_block_header.
Python computes block_2.parent_root = hash_tree_root(state.latest_block_header) using the mutated state, and serializes block_2 that way.
The fixture only serializes pre / post / blocks_0 / blocks_1 / meta.blocks_count. There is no fixture entry to carry the between-blocks mutation, so every consumer (not just Lodestar) replaying the test will see a different state.latest_block_header.state_root after block_1 and fail processBlockHeader's parent-root check on block_2.
All 3 passing test_epoch_boundary_empty_parent_* call set_parent_block_full before yield "pre", so the mutation is captured in the pre-state — that's why they pass.

Fix belongs in consensus-specs, one of:

a) Extend the fixture format with execution_payload_envelopes_N.ssz_snappy and teach the sanity runner to deliver the envelope between blocks (bigger, but closer to on-wire semantics).
b) Restructure the affected tests so the FULL-parent path is exercised by actually running spec.process_execution_payload in the generator, i.e. remove the out-of-band set_parent_block_full helper. That's the smaller fix and stays within the existing fixture format.

I can open the consensus-specs PR — leaning (b) unless you'd rather keep the helper. brech1#1 stays scoped to the two review findings (lockfile + dispatcher); those 5 failures are not the nightly workflow's problem, and they'll flip green once the upstream test PR lands.

nflaig · 2026-04-24T07:07:51Z

    case "slots":
      return sanitySlots(fork, testName, testSuite);
    case "blocks":
+    case "epoch_boundary":


need to see, maybe these need to be moved under blocks on the spec side but this works for now

Agreed — if consensus-specs ends up categorizing these under blocks/pyspec_tests/ (either by renaming the file or by routing epoch_boundary through the existing blocks handler in the generator), the Lodestar side stays a no-op: the case "epoch_boundary": fallthrough is harmless if the directory disappears, and can be dropped later. Either way this doesn't block #9221.

nflaig

Looks great, thanks for adding this @brech1

codecov · 2026-04-24T08:38:08Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.52%. Comparing base (78c66ba) to head (a69ecdf).
⚠️ Report is 2 commits behind head on unstable.

Additional details and impacted files

@@            Coverage Diff            @@
##           unstable    #9221   +/-   ##
=========================================
  Coverage     52.52%   52.52%           
=========================================
  Files           848      848           
  Lines         61304    61304           
  Branches       4510     4510           
=========================================
  Hits          32201    32201           
  Misses        29038    29038           
  Partials         65       65

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Follow up to #9221 which adds a worfklow to run nightly spec tests

brech1 requested a review from a team as a code owner April 15, 2026 23:55

gemini-code-assist Bot reviewed Apr 15, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 16, 2026

View reviewed changes

brech1 force-pushed the feat/nightly-workflow branch from 78c5fac to 75f7a64 Compare April 16, 2026 13:14

brech1 added 2 commits April 17, 2026 01:31

feat: nightly tests

c0c0782

fix: review requests

c2f06a1

brech1 force-pushed the feat/nightly-workflow branch from 75f7a64 to c2f06a1 Compare April 17, 2026 04:31

Merge branch 'unstable' into feat/nightly-workflow

5a32bd3

lodekeeper mentioned this pull request Apr 23, 2026

chore: update pnpm-lock.yaml for dotenv devDependency (9221 CI fix) lodekeeper/lodestar#20

Closed

lodekeeper mentioned this pull request Apr 23, 2026

fix: update pnpm lockfile and dispatch epoch_boundary sanity runner brech1/lodestar#1

Closed

nflaig added 2 commits April 24, 2026 07:28

Merge branch 'unstable' into feat/nightly-workflow

a80e980

review

804c3fd

nflaig reviewed Apr 24, 2026

View reviewed changes

lint

a69ecdf

nflaig approved these changes Apr 24, 2026

View reviewed changes

nflaig merged commit d166e3b into ChainSafe:unstable Apr 24, 2026
19 checks passed

nflaig mentioned this pull request Apr 24, 2026

chore: add workflow to run nightly spec tests #9268

Merged

nflaig added a commit that referenced this pull request Apr 24, 2026

chore: add workflow to run nightly spec tests (#9268)

a140dd9

Follow up to #9221 which adds a worfklow to run nightly spec tests

brech1 deleted the feat/nightly-workflow branch April 24, 2026 19:52

		downloadTests(blsSpecTests, console.log).catch(onError);
		downloadNightlyTests(opts, console.log, nightlyArg).catch(onError);

Uh oh!

Conversation

brech1 commented Apr 15, 2026

Motivation

Description

Checks

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

brech1 Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

brech1 Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

brech1 Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

brech1 Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

nflaig commented Apr 23, 2026

Uh oh!

lodekeeper commented Apr 23, 2026

1. CI failure on #9221 — outdated pnpm-lock.yaml

2. Nightly-download path works end-to-end

3. Nightly spec tests pass — except one new pre-existing gap

Summary

Uh oh!

nflaig commented Apr 23, 2026

Uh oh!

lodekeeper commented Apr 23, 2026

Uh oh!

nflaig commented Apr 23, 2026

Uh oh!

lodekeeper commented Apr 23, 2026

Uh oh!

nflaig Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

lodekeeper Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

nflaig left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codecov Bot commented Apr 24, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

1. CI failure on #9221 — outdated `pnpm-lock.yaml`