feat(gnoland): chain hardfork mechanism#5411
Closed
moul wants to merge 14 commits into
moul
added a commit
that referenced
this pull request
Apr 7, 2026
…es (#5334)

## Summary

Adds a halt height mechanism for coordinated chain upgrades. The node stops after committing the specified block height.

### How to set it

```bash
gnoland config set halt_height 352922
```

Or edit `config.toml` directly:

```toml
halt_height = 352922
```

### How it works

1. After `finalizeCommit`, consensus checks if `height >= halt_height`
2. If so, calls `osm.Kill()` for a graceful shutdown
3. The check is at the consensus level (not ABCI), following the same pattern as `WithEarlyStart`

### Scope and future direction

This is a **temporary coordination tool** for the current chain upgrade. For the gnoland1 → gnoland-1 hard fork, validators set `halt_height` in their config, all nodes stop at the same block, then validators swap binary + config and restart.

After the upgrade, the proper mechanism will be **GovDAO-based halting** (#5368), which adds:

- On-chain `halt_height` param set via governance proposal (no manual config needed)
- `halt_min_version`: prevents old binaries from restarting after halt
- Version guard at startup so validators can't accidentally run the wrong binary

Once #5368 is merged and active, `halt_height` in config becomes a **node operator tool** (e.g., "stop my node at height X for maintenance") rather than a coordination mechanism. Coordination should happen through governance.

### No CLI flag: config only

Per @tbruyelle's suggestion, there's no `--halt-height` CLI flag. Config file is the single source of truth. This avoids the risk of validators missing the flag in duplicated command setups across their infrastructure.

### Related

- #5368: GovDAO-based halt height + version guard (Phase 2, replaces this for coordination)
- #5376: gnoland-1 chain config
- #5411: chain upgrade genesis replay

<details>
<summary>Contributors' checklist</summary>

- [x] Added new tests, or not needed, or not feasible
- [x] Provided an example (e.g. screenshot) to aid review or the PR is self-explanatory
- [x] Updated the official documentation or not needed
- [x] No breaking changes were made, or a `BREAKING CHANGE: xxx` message was included in the description
- [x] Added `benchmarks` label to the PR or not needed

</details>
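The halt check described under "How it works" can be sketched in a few lines of Go. This is only an illustration, not the code from #5334: `shouldHalt` is a hypothetical name, and treating `halt_height = 0` as "disabled" is an assumption, since the commit message only states the `height >= halt_height` comparison.

```go
package main

import "fmt"

// shouldHalt reports whether a node should stop after committing `height`.
// Hypothetical helper: the real check lives in the consensus code after
// finalizeCommit and triggers a graceful shutdown.
// Assumption: haltHeight == 0 means the feature is disabled.
func shouldHalt(height, haltHeight int64) bool {
	return haltHeight > 0 && height >= haltHeight
}

func main() {
	const haltHeight = 352922 // example value from the commit message
	for _, h := range []int64{352921, 352922, 352923} {
		fmt.Printf("height=%d halt=%t\n", h, shouldHalt(h, haltHeight))
	}
}
```

Using `>=` rather than `==` means a node that restarts past the configured height still stops at its next commit instead of running on.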
aeddi
pushed a commit
that referenced
this pull request
Apr 9, 2026
…es (#5334)
moul
added a commit
that referenced
this pull request
Apr 9, 2026
- Revert premature doc references to gnoland-1 chain ID in gas-fees.md and explore-with-gnoweb.md (hardfork hasn't happened yet)
- Remove premature "Note" callout from gnoland-networks.md
- Update migrate-from-gnoland1.sh: reflect Scenario A decision (genesis tx-replay with InitialHeight), document blockers (#5411, #5390, Jae's InitialHeight tm2 work), reference issue #5374 for tracking
- Update gnoland-1/README.md: reflect correct PR merge status, document Scenario A approach, list migration blockers explicitly
- Add ChainID field to GnoTxMetadata for tx provenance recording
- Add InitialHeight validation (non-negative) to GenesisDoc.Validate and ValidateAndComplete
- Add test cases: no chain ID override when BlockHeight=0, no override when OriginalChainID unset
- Update ADR: document per-tx vs state-level design choice, mark InitialHeight as implemented end-to-end
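The non-negative `InitialHeight` validation mentioned in this commit might look roughly like the following. It is a sketch under assumptions: `genesisDoc` is a hypothetical, trimmed-down stand-in for tm2's `GenesisDoc`, and the error messages are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// genesisDoc is a hypothetical stand-in for tm2's GenesisDoc,
// reduced to the two fields this sketch needs.
type genesisDoc struct {
	ChainID       string
	InitialHeight int64
}

// validate rejects a negative InitialHeight. Zero is accepted here because
// the commit only says "non-negative"; how zero is interpreted downstream
// is not covered by this sketch.
func (g genesisDoc) validate() error {
	if g.ChainID == "" {
		return errors.New("genesis doc must include a non-empty chain_id")
	}
	if g.InitialHeight < 0 {
		return fmt.Errorf("initial_height cannot be negative (got %d)", g.InitialHeight)
	}
	return nil
}

func main() {
	docs := []genesisDoc{
		{ChainID: "gnoland-1", InitialHeight: 352923},
		{ChainID: "gnoland-1", InitialHeight: 0},
		{ChainID: "gnoland-1", InitialHeight: -5},
	}
	for _, d := range docs {
		fmt.Printf("%+v -> %v\n", d, d.validate())
	}
}
```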
…lay and ValidateAndComplete
…is.sh

misc/hardfork/: new Go binary with three source modes:
- RPC: iterates blocks from a live/halted node, extracts txs with metadata
- local dir: reads genesis.json + txs.jsonl from a stopped node data dir
- genesis file: single .json source (no tx history)

Produces a hardfork genesis with:
- chain_id updated to new chain
- initial_height set to halt_height + 1 (both at GenesisDoc level and app_state)
- original_chain_id set for historical tx signature verification
- historical txs appended with BlockHeight/Timestamp/ChainID metadata

misc/deployments/gnoland-1/generate-genesis.sh is now a thin wrapper around hardfork genesis with gnoland-1 specific defaults (chain IDs, overlay directory).
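The field rewrites listed above (new `chain_id`, `initial_height = halt_height + 1`, `original_chain_id` carried over from the source chain) can be illustrated with a small Go sketch. The `genesis` struct and `buildForkGenesis` function are hypothetical and omit the tx history; the input checks are assumptions added for the example, not behavior confirmed by the commit.

```go
package main

import (
	"errors"
	"fmt"
)

// genesis is a hypothetical, minimal view of the fields the commit
// message says are rewritten when producing a hardfork genesis.
type genesis struct {
	ChainID         string
	InitialHeight   int64
	OriginalChainID string
}

// buildForkGenesis derives the new chain's genesis header fields from the
// source chain. The validation rules are illustrative assumptions.
func buildForkGenesis(src genesis, newChainID string, haltHeight int64) (genesis, error) {
	if newChainID == "" || newChainID == src.ChainID {
		return genesis{}, errors.New("new chain ID must be set and differ from the source chain ID")
	}
	if haltHeight < 1 {
		return genesis{}, fmt.Errorf("halt height must be >= 1, got %d", haltHeight)
	}
	return genesis{
		ChainID:         newChainID,
		InitialHeight:   haltHeight + 1, // first block of the new chain
		OriginalChainID: src.ChainID,    // used to verify historical signatures
	}, nil
}

func main() {
	out, err := buildForkGenesis(genesis{ChainID: "gnoland1"}, "gnoland-1", 352922)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```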
…o worktree-hf-framework-5411
…ay smoke-test

Adds a new 'hardfork test' subcommand that loads a hardfork genesis.json into an in-memory gnoland node and replays all transactions in-process.

Key behaviors:
- Generates a fresh single-validator identity (replaces genesis validators) so the node can produce blocks without requiring real validator keys
- SkipGenesisSigVerification enabled for genesis-mode txs
- Historical txs (block_height > 0) go through the normal ante handler using original_chain_id from genesis for signature verification
- Progress reporting every 30s for long replays
- --verbose flag logs each tx result
- --keep-running flag keeps the node alive for manual RPC inspection
- Exit code 0 on success, non-zero on failure

Also adds:
- Unit tests covering error paths and a full empty-genesis replay
- Makefile 'preview-and-test' target for quick local smoke tests
- Updated 'hardfork genesis' next-steps to reference 'hardfork test'
moul
added a commit
to moul/gno
that referenced
this pull request
Apr 13, 2026
moul
added a commit
to moul/gno
that referenced
this pull request
Apr 13, 2026
Adds misc/hf-glue/: a throwaway testbed that chains the tools from gnolang#5411 and gnolang#5376 to run a local, single-validator hardforked chain in docker, with state persisted on disk.

Flow:
    make fetch   # hardfork genesis --source rpc.gno.land -> out/genesis.json
    make init    # gnoland secrets init + rewrite validator set to our key
    make up      # docker compose up: single-validator gnoland node, RPC :26657

This exists only to find gaps in gnolang#5411/gnolang#5376 end-to-end. Do NOT merge. Fixes go back upstream.
moul
added a commit
to moul/gno
that referenced
this pull request
Apr 13, 2026
…t fixes

Works around a real gnolang#5411 bug: the gnoland RPC /genesis endpoint returns 502 on a gnoland1-scale genesis (~80MB, server closes stream mid-transfer) and /genesis_chunked is not implemented. Until that's fixed upstream, rebuild the base genesis locally via misc/deployments/gnoland1/gen-genesis.sh and feed the file to 'hardfork genesis --source <file>'.

Also fixes two init-node.sh bugs:
- gnoland secrets init --data-dir expects the secrets dir directly, not the node home dir
- fixvalidator needs gno.land/pkg/gnoland blank-imported for the GnoGenesisState amino registration
jaekwon
pushed a commit
that referenced
this pull request
May 7, 2026
## Overview

Chain hardfork mechanism for gno.land: export all state and historical transactions from the source chain, replay them during `InitChain` on the new chain, and start producing blocks at the halted height. Replaces the original single-`OriginalChainID` design from [#5411](#5411) with a more flexible multi-chain model (`PastChainIDs` allowlist + per-tx `ChainID`).

**History:**

- Original work: [#5411](#5411)
- Jae's refinements: [feat/genesis-replay-upgrade2](https://github.com/gnolang/gno/tree/feat/genesis-replay-upgrade2)
- This PR: builds on top of Jae's work, adds fixes from extensive review + end-to-end validation on the full gnoland1 chain via the [hf-glue testbed](#5486)

## What's in

### tm2 (consensus + SDK)

- **`GenesisDoc.InitialHeight`**: consensus starts block production at this height after `InitChain`; `Handshaker` sets `state.LastBlockHeight = InitialHeight - 1`.
- **`BlockchainReactor`, `state`, `store`, validation**: all updated to handle chains where `InitialHeight > 1` (empty block store, non-contiguous block save, validator set / consensus params persisted at InitialHeight, etc.)
- **`BaseApp.lastBlockHeight` tracker** (this iteration): real chain height = `multistoreVersion + initialHeightOffset`, with the offset persisted under `mainInitialHeightKey` and restored on every restart. `validateHeight` now enforces strict contiguity against real chain height; the previous "allow monotonic jump" branch (which permanently bypassed contiguity for `InitialHeight > 1` chains) is gone.
- **`BaseApp.Info` guard**: handles calls before the multistore is loaded.
- **`auth.SkipGasMeteringKey`**: context flag that lets `SetGasMeter` bypass the new VM's gas meter (used for `GasReplayMode="source"`).
- **`RequestInitChain.InitialHeight`**: new ABCI field so the app can cross-check against `GnoGenesisState.InitialHeight`. Amino round-trip test added.
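The `lastBlockHeight` tracker and strict contiguity check from the tm2 list above can be illustrated with a small Go sketch. The type and method names are hypothetical, and deriving the offset as `InitialHeight - 1` is an assumption inferred from the `Handshaker` rule (`state.LastBlockHeight = InitialHeight - 1`); the real `BaseApp` also persists the offset, which this sketch omits.

```go
package main

import "fmt"

// heightTracker is a hypothetical model of the tracker described above:
// real chain height = multistore version + initial-height offset.
type heightTracker struct {
	storeVersion int64 // number of commits made to the multistore
	offset       int64 // assumed to be InitialHeight - 1
}

func newHeightTracker(initialHeight int64) *heightTracker {
	if initialHeight < 1 {
		initialHeight = 1 // assumption: unset behaves like a chain starting at 1
	}
	return &heightTracker{offset: initialHeight - 1}
}

// lastBlockHeight returns the real height of the last committed block.
func (t *heightTracker) lastBlockHeight() int64 {
	return t.storeVersion + t.offset
}

// validateHeight enforces strict contiguity: the next block must be
// exactly lastBlockHeight + 1, so skipped heights are rejected.
func (t *heightTracker) validateHeight(reqHeight int64) error {
	if want := t.lastBlockHeight() + 1; reqHeight != want {
		return fmt.Errorf("invalid height: got %d, want %d", reqHeight, want)
	}
	return nil
}

// commit models one multistore commit.
func (t *heightTracker) commit() { t.storeVersion++ }

func main() {
	t := newHeightTracker(704053)
	fmt.Println(t.validateHeight(704053)) // first block of the forked chain
	t.commit()
	fmt.Println(t.validateHeight(704055)) // skips a height, rejected
}
```

Comparing against the real chain height rather than the raw multistore version is what lets the check stay strict after the first commit on a chain whose `InitialHeight` is greater than 1.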
### gno.land

- **`GnoGenesisState`** extensions:
  - `PastChainIDs []string`: allowlist of past chain IDs valid for signature verification
  - `InitialHeight int64`: cross-checked against `GenesisDoc.InitialHeight`
  - `GasReplayMode string`: `""`/`"strict"` (default, new VM's gas meter) or `"source"` (bypass gas meter, preserve source-chain outcomes)
- **`GnoTxMetadata`** extensions:
  - `BlockHeight int64`: original block height
  - `ChainID string`: originating chain ID
  - `Failed bool`: tx had non-zero return code on source chain (skipped during replay)
  - `SignerInfo []SignerAccountInfo`: per-signer account metadata (address, account number, pre-tx sequence) so signatures verify correctly even if earlier txs diverged
  - `GasUsed`, `GasWanted int64`: source-chain gas (populated by tx-archive, used by replay report)
- **`auth.NewAccountWithUncheckedNumber`** (this iteration, renamed from `NewAccountWithNumber`): create accounts with a specific number, bypassing the auto-increment counter. Doc comment now spells out the precondition that the caller must enforce uniqueness; the rename forces every call site to acknowledge it.
- **`validateSignerInfo` preflight** (this iteration): scans every `SignerInfo` entry across all txs at the start of `loadAppState`. Rejects the genesis if two different addresses claim the same account number, or if a `SignerInfo` claims a number reserved by a balance-init account at a different address. Defense-in-depth against a malformed genesis silently corrupting state.
- **`InitChainerConfig.StrictReplay`** (this iteration): opt-in fail-closed boot. Defaults to `false` for backwards compat. Hardfork operators set it to `true` so any non-skipped tx replay failure aborts `InitChain` instead of letting the chain boot in a corrupted state. Skipped txs (`metadata.Failed = true`) do not count.
- **Genesis-mode tx sig verify with PastChainIDs[0]**: genesis-mode txs (no metadata or `BlockHeight == 0`) use the first `PastChainIDs` entry for sig verify when a hardfork is in progress (PR #5540). The genesis-mode chain-ID branch is now gated on `metadata == nil` (this iteration) so migration txs (`metadata != nil`, `BlockHeight == 0`, `Timestamp != 0`) keep their metadata-driven `ctxFn` instead of being silently overwritten.
- **`BaseApp.InitChain` error surfacing** (this iteration): when `InitChainer` returns `ResponseInitChain.Error`, return cleanly instead of falling through to the validators-count sanity check, which would otherwise panic with a misleading `"validators count mismatch"` and mask the real cause.
- **Replay report**: per-tx categorization emitted via logger after `InitChain`: `ok` / `ok_gas_differs` / `failed` / `skipped_failed`. Exposes `Outcomes()` and `FailedCount()` for external tooling.

### Hardfork tooling (`contribs/gnogenesis/internal/fork/`)

- **`gnogenesis fork generate`**: generate a hardfork genesis from a source chain (RPC URL, local data dir, or exported tarball).
- **`gnogenesis fork test`**: local genesis replay smoke-test.
- **`--patch-realm PKGPATH=SRCDIR`** (repeatable): rewrite a genesis-mode `addpkg` tx in-place with files from `SRCDIR`. Lets you deliver realm upgrades as part of the fork (e.g. adding a new `.gno` file to an existing realm) since you cannot re-addpkg post-deploy (PR #5540).
- **`--migration-tx`**: inject a single migration tx at the end of the historical replay.
- **`bruteForceSignerSequence`**: resolve signer sequences during export by trying candidate values against the signature.

## Bugs found and fixed during review

### tm2 consensus (all fixed)

1. **Fast-sync broken with InitialHeight > 1**: `BlockPool` started at `store.Height()+1 = 1` instead of `state.LastBlockHeight+1 = InitialHeight`. Nodes trying to fast-sync would request non-existent blocks.
2. **Validator set / consensus params not saved at InitialHeight**: `saveState` only saved validators when `nextHeight == 1`. With InitialHeight > 1, `LoadValidators` failed and `LoadConsensusParams` panicked at block InitialHeight+1.
3. **`ValidateBasic` bypass via zeroed `LastBlockID`**: any block with `LastBlockID.IsZero()` could skip commit validation. Fixed: only allow skip when commit is also nil/empty.
4. **`BaseApp.validateHeight` permanent contiguity bypass**: the previous "allow monotonic jump" branch compared real block height against the multistore version. After the first commit, `actual > prevHeight` is trivially true on every subsequent block, so the contiguity check was bypassed forever (an attacker or buggy consensus engine that skipped N blocks would be silently accepted). Fixed by tracking real chain height in `lastBlockHeight` (this iteration).
5. **`BaseApp.InitChain` masking real error**: when `loadAppState` returned an error response, the validators-count sanity check fired with `"validators count mismatch"`, masking the actual cause. Fixed: return cleanly on error response (this iteration).

### gno.land (all fixed)

6. **`loadAppState` returns nil even on N tx failures**: chain booted in a corrupted state when historical-tx replay had failures. Fixed via opt-in `StrictReplay` in `InitChainerConfig` (this iteration).
7. **Migration-tx `ctxFn` overwrite**: the genesis-mode chain-ID branch fired on any `metadata.BlockHeight == 0`, stomping the metadata-driven `Timestamp` override on migration txs. Fixed: tighten predicate to `metadata == nil` and compose with any prior `ctxFn` (this iteration).
8. **`NewAccountWithNumber` had no SignerInfo collision check**: two `SignerInfo` entries with the same `AccountNum` but different addresses, or a `SignerInfo` colliding with a balance-init account, would silently zero the original account's balance. Fixed: rename to `NewAccountWithUncheckedNumber` (forcing every call site to acknowledge the precondition) plus `validateSignerInfo` preflight in `loadAppState` (this iteration).
9. **Failed-tx `ResponseDeliverTx` was empty (looked like success)**: explicit error marker so indexers can distinguish.
10. **`GnoGenesisState.InitialHeight` wasn't cross-checked against `GenesisDoc.InitialHeight`**: added `InitialHeight` to `RequestInitChain` and validate in `loadAppState`.
11. **`RequestInitChain.InitialHeight` had no amino round-trip test**: silent registration regression would only surface during a real hardfork (this iteration).

### Hardfork tooling (fixed)

12. **`applyOverlay` silent no-op**: listed scripts but didn't execute them, returned success. Fixed: returns error when scripts found but execution not implemented.
13. **JSONL serialization used `encoding/json` instead of amino**: interface types (`std.Msg`) lost on round-trip. Fixed: both writer and reader now use amino.
14. **`verifyGenesisFile` failure returned success**: tool could produce invalid genesis and exit 0. Fixed: failure aborts (opt out with `--no-verify`).
15. **Zero unit tests for `bruteForceSignerSequence`**: fixed: 10 table-driven tests.

### Docs linter (side fix for green CI)

- Skip `staging.gno.land`, `archive.org`, and add retry/timeout logic so transient remote-link failures don't block unrelated PRs.

## Still open (design / follow-up)

- **RPC retry/resume** (`contribs/gnogenesis/internal/fork/source_rpc.go`): a single transient error during tx fetch aborts everything; needs exponential backoff + checkpointing. Architectural, follow-up PR.
- **Streaming tx export**: full tx history is held in memory; will OOM on large chains. Needs streaming writer, follow-up PR.
- **`queryAccountAtHeight` silent nil**: all error paths return nil with no indication; flaky RPC → wrong sequence metadata.
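The `validateSignerInfo` preflight described in the gno.land section (and in bug 8) amounts to a uniqueness check on account numbers. The following Go sketch shows the idea only: the `signerInfo` struct, the function signature, and the `reserved` map representing balance-init accounts are all hypothetical simplifications, not the PR's actual types.

```go
package main

import "fmt"

// signerInfo is a hypothetical, reduced form of the per-signer metadata
// (address and account number) attached to replayed txs.
type signerInfo struct {
	Address    string
	AccountNum uint64
}

// validateSignerInfo rejects a genesis in which one account number is
// claimed by two different addresses. `reserved` models numbers already
// taken by balance-init accounts (an assumption about how those are known).
func validateSignerInfo(reserved map[uint64]string, txs [][]signerInfo) error {
	seen := make(map[uint64]string, len(reserved))
	for num, addr := range reserved {
		seen[num] = addr
	}
	for i, signers := range txs {
		for _, s := range signers {
			if prev, ok := seen[s.AccountNum]; ok && prev != s.Address {
				return fmt.Errorf("tx %d: account number %d claimed by both %s and %s",
					i, s.AccountNum, prev, s.Address)
			}
			seen[s.AccountNum] = s.Address
		}
	}
	return nil
}

func main() {
	reserved := map[uint64]string{0: "g1aaa"}
	txs := [][]signerInfo{
		{{Address: "g1bbb", AccountNum: 1}},
		{{Address: "g1ccc", AccountNum: 1}}, // collides with g1bbb
	}
	fmt.Println(validateSignerInfo(reserved, txs))
}
```

Running the check once before any tx is replayed keeps a malformed genesis from overwriting an existing account partway through `InitChain`.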
## Cherry-picked from [#5597](#5597) (this iteration)

Three follow-ups originally staged in the master-based hardfork series, brought back to where they belong since they modify or extend code introduced here:

- [`1babfe42a`](1babfe42a) `fix(consensus): skip phantom heights during replay when InitialHeight > 1`: the ABCI handshake replay path used to assume heights `[1, appBlockHeight+1]` always have a stored block; for chains starting at `InitialHeight > 1`, heights below `InitialHeight` never had blocks and replay errored with "block not found for height 1".
- [`5bf2fa53e`](5bf2fa53e) `fix(gnogenesis): default gas-storage params and gas_replay_mode in hardfork genesis`: `buildHardforkGenesis` now defaults the post-#5415 `vm.params` gas-storage fields from `vm.DefaultParams()` when the source has them all at zero, and sets `gas_replay_mode = "source"` when unset. Operator overrides preserved. 4 unit tests.
- [`e31268467`](e31268467) `feat(gnogenesis): add --skip-failing-genesis-txs and --skip-genesis-sig-verification flags to fork test`: `make smoketest` now matches what production validators actually run.

## End-to-end validation

The hf-glue testbed ([#5486](#5486)) runs `make fetch && make init && make up` against `rpc.gno.land` halt@704052 and produces a 192 MB hardfork genesis that replays with **0 / 2715 tx failures** and boots a live `gnoland-1` node.

## Dependencies / related PRs

- **Depends on / pairs with:** [#5533](#5533) (`contribs/tx-archive` metadata + `SignerInfo` populator) for replay-ready backups
- **Used in:** [#5486](#5486) (hf-glue testbed)
- **Also fixed here:** [#5539](#5539) (docs-linter skip staging preemptive fix, committed here too to keep CI green)

## AI disclosure

Developed with significant assistance from Claude Code for testing, review, and iterative fixes.
---------

Co-authored-by: moul <noreply@moul.io>
Co-authored-by: jaekwon <jae@tendermint.com>
assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: aeddi <antoine.e.b@gmail.com>

merging for moul
Edit: see also #5511
Summary
Adds support for replaying historical transactions during genesis for chain hard forks (gnoland1 → gnoland-1).

Key changes
- `GnoGenesisState.OriginalChainID`: specifies the chain ID used for verifying signatures of historical txs during genesis replay
- `GnoGenesisState.InitialHeight`: block height the chain should start from after genesis
- `GnoTxMetadata.BlockHeight`: original block height where the tx executed
- `GenesisDoc.InitialHeight` (tm2): when > 1, the consensus Handshaker sets `state.LastBlockHeight = InitialHeight - 1` after InitChain so the first produced block has the correct height
- When `metadata.BlockHeight > 0` and `OriginalChainID` is set, the context chain ID is overridden so signature verification uses the original chain ID

How it works
- `OriginalChainID`
- `InitialHeight` (via `GenesisDoc.InitialHeight` → consensus Handshaker → `state.LastBlockHeight`)

Migration script
`misc/deployments/gnoland-1/generate-genesis.sh`: generates gnoland-1 genesis from a running gnoland1 node using tx-archive.

Related PRs
- `--halt-height` CLI flag

Open items
- `auth` genesis state from old chain)

AI-assisted: code generated with Claude Code
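The chain ID override listed under "Key changes" reduces to a small selection rule, sketched below in Go. The function and struct names are hypothetical; only the condition (`metadata.BlockHeight > 0` and `OriginalChainID` set) comes from the description, and the nil-metadata case is an assumption added so the sketch is total.

```go
package main

import "fmt"

// txMetadata is a hypothetical, reduced form of GnoTxMetadata.
type txMetadata struct {
	BlockHeight int64
}

// sigVerifyChainID picks the chain ID used to verify a replayed tx's
// signature: the original chain ID for historical txs, otherwise the
// current chain ID. Treating nil metadata as "no override" is an assumption.
func sigVerifyChainID(currentChainID, originalChainID string, meta *txMetadata) string {
	if meta != nil && meta.BlockHeight > 0 && originalChainID != "" {
		return originalChainID
	}
	return currentChainID
}

func main() {
	historical := &txMetadata{BlockHeight: 1200}
	genesisTx := &txMetadata{BlockHeight: 0}

	fmt.Println(sigVerifyChainID("gnoland-1", "gnoland1", historical)) // historical tx
	fmt.Println(sigVerifyChainID("gnoland-1", "gnoland1", genesisTx))  // genesis-mode tx
	fmt.Println(sigVerifyChainID("gnoland-1", "", historical))         // no original chain ID
}
```

The two no-override cases mirror the test cases named in the commit history for this PR: `BlockHeight = 0` and an unset `OriginalChainID`.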