feat: hardfork-replay improvements (BaseApp InitialHeight + PastChainIDs genesis-mode + --patch-realm) #5540
Merged
moul merged 3 commits into gnolang:feat/genesis-replay-upgrade3 on Apr 17, 2026
Two separate issues hit by chains whose genesis sets InitialHeight > 1 (the hardfork-replay use case from gnolang#5511):

1. validateHeight compared req.Header.Height against the multistore version counter (which auto-increments from 0). With InitialHeight > 1 the counter lags the block height: the first block arrives at (e.g.) 101 while the store is at version 0, then the second block arrives at 102 while the store is at version 1, so the check "expected 2, got 102" panicked. Now: when the store version lags the block height, accept the jump as long as height is monotonic.

2. Info() returned the multistore version as LastBlockHeight. On restart the handshaker saw appHeight=1 (store version) but storeHeight=102 (real blocks) and tried to replay missing blocks. Now: when the persisted header records a higher block height, return that instead.

These fixes are exercised by the hardfork-replay flow but help any chain that sets InitialHeight > 1.
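As a rough illustration of the two fixes in this commit message, the Go sketch below models the height check and the height reported by Info(). It is not the BaseApp code: `checkHeight`, `reportedHeight`, and their parameters are hypothetical names, and the exact acceptance condition is an assumption inferred from the description.

```go
package main

import "fmt"

// checkHeight is a hypothetical stand-in for the validateHeight behaviour
// described above. storeVersion is the multistore version counter, lastHeight
// is the height of the last accepted block (0 if none), and reqHeight plays
// the role of req.Header.Height.
func checkHeight(storeVersion, lastHeight, reqHeight int64) error {
	expected := storeVersion + 1
	if reqHeight == expected {
		return nil // ordinary chain: version counter and block height agree
	}
	// Assumed rule for InitialHeight > 1: the store version lags the block
	// height, so accept the jump as long as the height still moves forward.
	if reqHeight > expected && reqHeight > lastHeight {
		return nil
	}
	return fmt.Errorf("invalid height: expected %d, got %d", expected, reqHeight)
}

// reportedHeight is a hypothetical stand-in for the LastBlockHeight value
// returned by Info(): prefer the persisted header height when it is higher
// than the store version.
func reportedHeight(storeVersion, persistedHeaderHeight int64) int64 {
	if persistedHeaderHeight > storeVersion {
		return persistedHeaderHeight
	}
	return storeVersion
}

func main() {
	fmt.Println(checkHeight(0, 0, 101))   // first block of a chain starting at 101
	fmt.Println(checkHeight(1, 101, 102)) // second block: store version 1, height 102
	fmt.Println(checkHeight(1, 101, 101)) // height did not advance: rejected
	fmt.Println(reportedHeight(1, 102))   // restart: report the real block height
}
```

Under these assumptions the monotonic branch accepts any forward jump once the store version lags the block height. That looseness is what the later `lastBlockHeight` tracker commit referenced on this page replaces with a strict contiguity check.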
During a hardfork replay, genesis-mode txs (metadata == nil or BlockHeight == 0) were originally signed with the source chain's chain-id — not the new one. Historical txs (BlockHeight > 0) already get PastChainIDs-based chain-id override in loadAppState; this extends the same treatment to genesis-mode txs by using the first PastChainIDs entry when a hardfork is in progress. In practice this still needs --skip-genesis-sig-verification for gnogenesis-produced addpkg txs (where msg.Creator ≠ the signing key — the pubkey-address check rejects them regardless of chain-id). But for genesis-mode txs where the signer IS the creator, this makes the signature verify against the correct chain-id without any skip flag. Tested end-to-end on the gnoland1 hardfork testbed in gnolang#5486.
…ork time

Adds a repeatable --patch-realm PKGPATH=SRCDIR flag to `hardfork genesis` that rewrites the genesis-mode addpkg tx for PKGPATH in-place, replacing its Package.Files with the *.gno + gnomod.toml files from SRCDIR. The source genesis on disk stays untouched; the patch lives only in the in-memory GnoGenesisState used to assemble the output.

Motivation: you cannot re-addpkg to the same path post-deploy (unauthorized), and you cannot add a new .gno file to an existing realm via a call, so the only way to land a code change on an existing realm is to rewrite the original addpkg tx that deployed it.

Example (tested end-to-end in the hf-glue testbed gnolang#5486):

    hardfork genesis --source /path/to/source \
      --patch-realm gno.land/r/sys/params=/src/examples/gno.land/r/sys/params \
      --chain-id gnoland-1 --output genesis.json

Combined with gnolang#5368 (which adds halt.gno to r/sys/params), the forked chain boots with the new GovDAO halt mechanism available:

    $ curl ... vm/qfile gno.land/r/sys/params
    → fee_collector.gno, gnomod.toml, halt.gno, params.gno, unlock.gno

Multiple --patch-realm flags can be combined to land several realm upgrades in one fork.
Codecov Report: ✅ All modified and coverable lines are covered by tests.
7e99bea into gnolang:feat/genesis-replay-upgrade3
126 of 127 checks passed
Member
Author
Sorry, should have committed directly in the PR.
moul added a commit that referenced this pull request on Apr 19, 2026
- Document GasReplayMode field and "source" mode
- Document GasUsed/GasWanted metadata fields
- Document auth.SkipGasMeteringKey context flag
- Document replay report with categorization
- Document RequestInitChain.InitialHeight cross-check (GnoGenesisState.InitialHeight is no longer "informational only")
- Document hardfork tooling: --patch-realm, hardfork test
- Add BaseApp.validateHeight / Info InitialHeight>1 fixes (PR #5540)
- Add genesis-mode sig verify against PastChainIDs[0] (PR #5540)
- Mark gas-tolerance and replay-report open items as resolved
- Add docs-linter stability fix note
jaekwon pushed a commit that referenced this pull request on May 7, 2026
## Overview

Chain hardfork mechanism for gno.land: export all state and historical transactions from the source chain, replay them during `InitChain` on the new chain, and start producing blocks at the halted height. Replaces the original single-`OriginalChainID` design from [#5411](#5411) with a more flexible multi-chain model (`PastChainIDs` allowlist + per-tx `ChainID`).

**History:**

- Original work: [#5411](#5411)
- Jae's refinements: [feat/genesis-replay-upgrade2](https://github.com/gnolang/gno/tree/feat/genesis-replay-upgrade2)
- This PR: builds on top of Jae's work, adds fixes from extensive review + end-to-end validation on the full gnoland1 chain via the [hf-glue testbed](#5486)

## What's in

### tm2 (consensus + SDK)

- **`GenesisDoc.InitialHeight`**: consensus starts block production at this height after `InitChain`; `Handshaker` sets `state.LastBlockHeight = InitialHeight - 1`.
- **`BlockchainReactor`, `state`, `store`, validation**: all updated to handle chains where `InitialHeight > 1` (empty block store, non-contiguous block save, validator set / consensus params persisted at InitialHeight, etc.)
- **`BaseApp.lastBlockHeight` tracker** (this iteration): real chain height = `multistoreVersion + initialHeightOffset`, with the offset persisted under `mainInitialHeightKey` and restored on every restart. `validateHeight` now enforces strict contiguity against real chain height; the previous "allow monotonic jump" branch (which permanently bypassed contiguity for `InitialHeight > 1` chains) is gone.
- **`BaseApp.Info` guard**: handle calls before the multistore is loaded.
- **`auth.SkipGasMeteringKey`**: context flag that lets `SetGasMeter` bypass the new VM's gas meter (used for `GasReplayMode="source"`).
- **`RequestInitChain.InitialHeight`**: new ABCI field so the app can cross-check against `GnoGenesisState.InitialHeight`. Amino round-trip test added.

### gno.land

- **`GnoGenesisState`** extensions:
  - `PastChainIDs []string`: allowlist of past chain IDs valid for signature verification
  - `InitialHeight int64`: cross-checked against `GenesisDoc.InitialHeight`
  - `GasReplayMode string`: `""`/`"strict"` (default, new VM's gas meter) or `"source"` (bypass gas meter, preserve source-chain outcomes)
- **`GnoTxMetadata`** extensions:
  - `BlockHeight int64`: original block height
  - `ChainID string`: originating chain ID
  - `Failed bool`: tx had non-zero return code on source chain (skipped during replay)
  - `SignerInfo []SignerAccountInfo`: per-signer account metadata (address, account number, pre-tx sequence) so signatures verify correctly even if earlier txs diverged
  - `GasUsed`, `GasWanted int64`: source-chain gas (populated by tx-archive, used by replay report)
- **`auth.NewAccountWithUncheckedNumber`** (this iteration, renamed from `NewAccountWithNumber`): create accounts with a specific number, bypassing the auto-increment counter. Doc comment now spells out the precondition that the caller must enforce uniqueness; the rename forces every call site to acknowledge it.
- **`validateSignerInfo` preflight** (this iteration): scans every `SignerInfo` entry across all txs at the start of `loadAppState`. Rejects the genesis if two different addresses claim the same account number, or if a `SignerInfo` claims a number reserved by a balance-init account at a different address. Defense-in-depth against a malformed genesis silently corrupting state.
- **`InitChainerConfig.StrictReplay`** (this iteration): opt-in fail-closed boot. Defaults to `false` for backwards compat. Hardfork operators set it to `true` so any non-skipped tx replay failure aborts `InitChain` instead of letting the chain boot in a corrupted state. Skipped txs (`metadata.Failed = true`) do not count.
- **Genesis-mode tx sig verify with PastChainIDs[0]**: genesis-mode txs (no metadata or `BlockHeight == 0`) use the first `PastChainIDs` entry for sig verify when a hardfork is in progress (PR #5540). The genesis-mode chain-ID branch is now gated on `metadata == nil` (this iteration) so migration txs (`metadata != nil`, `BlockHeight == 0`, `Timestamp != 0`) keep their metadata-driven `ctxFn` instead of being silently overwritten.
- **`BaseApp.InitChain` error surfacing** (this iteration): when `InitChainer` returns `ResponseInitChain.Error`, return cleanly instead of falling through to the validators-count sanity check, which would otherwise panic with a misleading `"validators count mismatch"` and mask the real cause.
- **Replay report**: per-tx categorization emitted via logger after `InitChain`: `ok` / `ok_gas_differs` / `failed` / `skipped_failed`. Exposes `Outcomes()` and `FailedCount()` for external tooling.

### Hardfork tooling (`contribs/gnogenesis/internal/fork/`)

- **`gnogenesis fork generate`**: generate a hardfork genesis from a source chain (RPC URL, local data dir, or exported tarball).
- **`gnogenesis fork test`**: local genesis replay smoke-test.
- **`--patch-realm PKGPATH=SRCDIR`** (repeatable): rewrite a genesis-mode `addpkg` tx in-place with files from `SRCDIR`. Lets you deliver realm upgrades as part of the fork (e.g. adding a new `.gno` file to an existing realm) since you cannot re-addpkg post-deploy (PR #5540).
- **`--migration-tx`**: inject a single migration tx at the end of the historical replay.
- **`bruteForceSignerSequence`**: resolve signer sequences during export by trying candidate values against the signature.

## Bugs found and fixed during review

### tm2 consensus (all fixed)

1. **Fast-sync broken with InitialHeight > 1**: `BlockPool` started at `store.Height()+1 = 1` instead of `state.LastBlockHeight+1 = InitialHeight`. Nodes trying to fast-sync would request non-existent blocks.
2. **Validator set / consensus params not saved at InitialHeight**: `saveState` only saved validators when `nextHeight == 1`. With InitialHeight > 1, `LoadValidators` failed and `LoadConsensusParams` panicked at block InitialHeight+1.
3. **`ValidateBasic` bypass via zeroed `LastBlockID`**: any block with `LastBlockID.IsZero()` could skip commit validation. Fixed: only allow skip when commit is also nil/empty.
4. **`BaseApp.validateHeight` permanent contiguity bypass**: the previous "allow monotonic jump" branch compared real block height against the multistore version. After the first commit, `actual > prevHeight` is trivially true on every subsequent block, so the contiguity check was bypassed forever (an attacker or buggy consensus engine that skipped N blocks would be silently accepted). Fixed by tracking real chain height in `lastBlockHeight` (this iteration).
5. **`BaseApp.InitChain` masking real error**: when `loadAppState` returned an error response, the validators-count sanity check fired with `"validators count mismatch"`, masking the actual cause. Fixed: return cleanly on error response (this iteration).

### gno.land (all fixed)

6. **`loadAppState` returns nil even on N tx failures**: chain booted in a corrupted state when historical-tx replay had failures. Fixed via opt-in `StrictReplay` in `InitChainerConfig` (this iteration).
7. **Migration-tx `ctxFn` overwrite**: the genesis-mode chain-ID branch fired on any `metadata.BlockHeight == 0`, stomping the metadata-driven `Timestamp` override on migration txs. Fixed: tighten predicate to `metadata == nil` and compose with any prior `ctxFn` (this iteration).
8. **`NewAccountWithNumber` had no SignerInfo collision check**: two `SignerInfo` entries with the same `AccountNum` but different addresses, or a `SignerInfo` colliding with a balance-init account, would silently zero the original account's balance. Fixed: rename to `NewAccountWithUncheckedNumber` (forcing every call site to acknowledge the precondition) plus `validateSignerInfo` preflight in `loadAppState` (this iteration).
9. **Failed-tx `ResponseDeliverTx` was empty (looked like success)**: explicit error marker so indexers can distinguish.
10. **`GnoGenesisState.InitialHeight` wasn't cross-checked against `GenesisDoc.InitialHeight`**: added `InitialHeight` to `RequestInitChain` and validate in `loadAppState`.
11. **`RequestInitChain.InitialHeight` had no amino round-trip test**: silent registration regression would only surface during a real hardfork (this iteration).

### Hardfork tooling (fixed)

12. **`applyOverlay` silent no-op**: listed scripts but didn't execute them, returned success. Fixed: returns error when scripts found but execution not implemented.
13. **JSONL serialization used `encoding/json` instead of amino**: interface types (`std.Msg`) lost on round-trip. Fixed: both writer and reader now use amino.
14. **`verifyGenesisFile` failure returned success**: tool could produce invalid genesis and exit 0. Fixed: failure aborts (opt out with `--no-verify`).
15. **Zero unit tests for `bruteForceSignerSequence`**: fixed with 10 table-driven tests.

### Docs linter (side fix for green CI)

- Skip `staging.gno.land`, `archive.org`, and add retry/timeout logic so transient remote-link failures don't block unrelated PRs.

## Still open (design / follow-up)

- **RPC retry/resume** (`contribs/gnogenesis/internal/fork/source_rpc.go`): a single transient error during tx fetch aborts everything; needs exponential backoff + checkpointing. Architectural, follow-up PR.
- **Streaming tx export**: full tx history is held in memory; will OOM on large chains. Needs streaming writer, follow-up PR.
- **`queryAccountAtHeight` silent nil**: all error paths return nil with no indication; flaky RPC → wrong sequence metadata.

## Cherry-picked from [#5597](#5597) (this iteration)

Three follow-ups originally staged in the master-based hardfork series, brought back to where they belong since they modify or extend code introduced here:

- [`1babfe42a`](1babfe42a) `fix(consensus): skip phantom heights during replay when InitialHeight > 1`: ABCI handshake replay path used to assume heights `[1, appBlockHeight+1]` always have a stored block; for chains starting at `InitialHeight > 1`, heights below `InitialHeight` never had blocks and replay errored with "block not found for height 1".
- [`5bf2fa53e`](5bf2fa53e) `fix(gnogenesis): default gas-storage params and gas_replay_mode in hardfork genesis`: `buildHardforkGenesis` now defaults the post-#5415 `vm.params` gas-storage fields from `vm.DefaultParams()` when the source has them all at zero, and sets `gas_replay_mode = "source"` when unset. Operator overrides preserved. 4 unit tests.
- [`e31268467`](e31268467) `feat(gnogenesis): add --skip-failing-genesis-txs and --skip-genesis-sig-verification flags to fork test`: `make smoketest` now matches what production validators actually run.

## End-to-end validation

The hf-glue testbed ([#5486](#5486)) runs `make fetch && make init && make up` against `rpc.gno.land` halt@704052 and produces a 192 MB hardfork genesis that replays with **0 / 2715 tx failures** and boots a live `gnoland-1` node.

## Dependencies / related PRs

- **Depends on / pairs with:** [#5533](#5533) (`contribs/tx-archive` metadata + `SignerInfo` populator) for replay-ready backups
- **Used in:** [#5486](#5486) (hf-glue testbed)
- **Also fixed here:** [#5539](#5539) (docs-linter skip staging preemptive fix, committed here too to keep CI green)

## AI disclosure

Developed with significant assistance from Claude Code for testing, review, and iterative fixes.
---------

Co-authored-by: moul <noreply@moul.io>
Co-authored-by: jaekwon <jae@tendermint.com>
assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: aeddi <antoine.e.b@gmail.com>

merging for moul
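The `BaseApp.lastBlockHeight` tracker described in the commit message above (real chain height = `multistoreVersion + initialHeightOffset`, strict contiguity) can be modelled roughly as follows. This is a minimal sketch, not the tm2 code: the `heightTracker` type and its methods are hypothetical, persistence under `mainInitialHeightKey` is left out, and taking the offset to be `InitialHeight - 1` is an assumption.

```go
package main

import "fmt"

// heightTracker is a hypothetical model of the tracker described above.
type heightTracker struct {
	storeVersion        int64 // multistore version, +1 per commit
	initialHeightOffset int64 // assumed to be InitialHeight - 1
}

// newHeightTracker builds a tracker for a chain starting at initialHeight.
// Values below 1 are treated as 1 (an ordinary chain).
func newHeightTracker(initialHeight int64) *heightTracker {
	if initialHeight < 1 {
		initialHeight = 1
	}
	return &heightTracker{initialHeightOffset: initialHeight - 1}
}

// lastBlockHeight returns the real chain height of the last committed block.
func (h *heightTracker) lastBlockHeight() int64 {
	return h.storeVersion + h.initialHeightOffset
}

// validateHeight enforces strict contiguity against the real chain height.
func (h *heightTracker) validateHeight(reqHeight int64) error {
	if want := h.lastBlockHeight() + 1; reqHeight != want {
		return fmt.Errorf("invalid height: expected %d, got %d", want, reqHeight)
	}
	return nil
}

// commit models one multistore commit.
func (h *heightTracker) commit() { h.storeVersion++ }

func main() {
	ht := newHeightTracker(101)
	fmt.Println(ht.validateHeight(101)) // first block lands exactly at InitialHeight
	ht.commit()
	fmt.Println(ht.validateHeight(103)) // skipped block: rejected
	fmt.Println(ht.validateHeight(102)) // contiguous: accepted
}
```

Because the offset is fixed at construction, every later block is checked against `lastBlockHeight() + 1`, so a skipped height is rejected rather than accepted as a forward jump.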
Three tightly-related improvements that fell out of end-to-end-testing the hardfork-replay mechanism on the full gnoland1 chain (see #5486). Each is a standalone commit; happy to split into separate PRs if reviewers prefer.
1. `fix(tm2/sdk): BaseApp validateHeight + Info handle InitialHeight > 1`
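Two separate issues hit chains whose genesis sets `InitialHeight > 1`:

- `validateHeight` compared `req.Header.Height` against the multistore version counter, which lags the block height when `InitialHeight > 1`, so the check panicked (e.g. "expected 2, got 102"). Now: when the store version lags the block height, the jump is accepted as long as height is monotonic.
- `Info()` returned the multistore version as `LastBlockHeight`, so on restart the handshaker tried to replay missing blocks. Now: when the persisted header records a higher block height, that is returned instead.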
2. `feat(gnoland): genesis-mode txs use PastChainIDs[0] for sig verify`
During a hardfork replay, genesis-mode txs (metadata == nil or BlockHeight == 0) were originally signed with the source chain's chain-id. Historical txs (BlockHeight > 0) already get `PastChainIDs`-based chain-id override in `loadAppState`; this extends the same treatment to genesis-mode txs by using the first `PastChainIDs` entry when a hardfork is in progress.
In practice this still needs `--skip-genesis-sig-verification` for gnogenesis-produced addpkg txs (where `msg.Creator ≠ signing key` — the pubkey-address check rejects those regardless of chain-id). But for genesis-mode txs where the signer IS the creator, this makes the signature verify against the correct chain-id without any skip flag.
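The chain-id selection for genesis-mode txs can be sketched as below. This is an illustration only: `txMetadata` and `genesisModeChainID` are hypothetical names rather than gno.land's API, "a hardfork is in progress" is assumed to mean that `PastChainIDs` is non-empty, and the existing override for historical txs is not modelled.

```go
package main

import "fmt"

// txMetadata is a hypothetical, trimmed-down stand-in for the tx metadata
// discussed above; only the field needed for the decision is kept.
type txMetadata struct {
	BlockHeight int64
}

// genesisModeChainID returns the chain-id override to use when verifying a
// genesis-mode tx signature, and whether an override applies at all.
func genesisModeChainID(meta *txMetadata, pastChainIDs []string) (string, bool) {
	if len(pastChainIDs) == 0 {
		return "", false // assumption: no hardfork in progress, keep the current chain-id
	}
	if meta != nil && meta.BlockHeight > 0 {
		return "", false // historical tx: handled by the existing override, not sketched here
	}
	// Genesis-mode tx (metadata == nil or BlockHeight == 0): it was signed
	// on the source chain, so verify against the first past chain-id.
	return pastChainIDs[0], true
}

func main() {
	past := []string{"source-chain"} // hypothetical source chain-id

	id, ok := genesisModeChainID(nil, past)
	fmt.Println(id, ok)

	id, ok = genesisModeChainID(&txMetadata{BlockHeight: 42}, past)
	fmt.Println(id, ok)
}
```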
3. `feat(hardfork): --patch-realm flag`
Repeatable `--patch-realm PKGPATH=SRCDIR` flag on `hardfork genesis`. Rewrites the genesis-mode addpkg tx for `PKGPATH` in-place, replacing `Package.Files` with the `*.gno` + `gnomod.toml` files from `SRCDIR`. Source genesis on disk stays untouched — patch lives only in the in-memory `GnoGenesisState` used for the output.
Motivation: you cannot re-addpkg to the same path post-deploy (unauthorized), and you cannot add a new `.gno` file to an existing realm via a call, so the only way to land a code change on an existing realm during a hardfork is to rewrite the addpkg tx that originally deployed it.
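To make the flag's input handling concrete, here is a minimal sketch of parsing one `PKGPATH=SRCDIR` value and collecting the `*.gno` + `gnomod.toml` files that would replace `Package.Files`. The helper names `parsePatchRealm` and `collectRealmFiles` are hypothetical and the real tool may differ, for example in how it treats subdirectories or test files.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// parsePatchRealm splits one --patch-realm value of the form PKGPATH=SRCDIR.
func parsePatchRealm(arg string) (string, string, error) {
	pkgPath, srcDir, ok := strings.Cut(arg, "=")
	if !ok || pkgPath == "" || srcDir == "" {
		return "", "", fmt.Errorf("invalid --patch-realm value %q, want PKGPATH=SRCDIR", arg)
	}
	return pkgPath, srcDir, nil
}

// collectRealmFiles reads the *.gno files and gnomod.toml found directly in
// srcDir and returns them as a name -> body map. Subdirectories are ignored.
func collectRealmFiles(srcDir string) (map[string]string, error) {
	entries, err := os.ReadDir(srcDir)
	if err != nil {
		return nil, err
	}
	files := make(map[string]string)
	for _, e := range entries {
		name := e.Name()
		if e.IsDir() || !(strings.HasSuffix(name, ".gno") || name == "gnomod.toml") {
			continue
		}
		body, err := os.ReadFile(filepath.Join(srcDir, name))
		if err != nil {
			return nil, err
		}
		files[name] = string(body)
	}
	return files, nil
}

func main() {
	pkgPath, srcDir, err := parsePatchRealm("gno.land/r/sys/params=/src/examples/gno.land/r/sys/params")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("patching", pkgPath, "from", srcDir)

	files, err := collectRealmFiles(srcDir)
	if err != nil {
		fmt.Println("cannot read source dir:", err)
		return
	}
	fmt.Println("files to embed:", len(files))
}
```

Since the flag is repeatable, a caller would run this once per value and apply each result to the matching addpkg tx in the in-memory genesis state.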
Combined with #5368 (which adds `halt.gno` to `r/sys/params`), the hf-glue testbed boots a fork of gnoland1 where `r/sys/params` ships the new `NewSetHaltRequest` code out of the box:
```
$ curl ... vm/qfile gno.land/r/sys/params
→ fee_collector.gno, gnomod.toml, halt.gno, params.gno, unlock.gno
```
End-to-end validation
All three land together in #5486 (hf-glue testbed). Running `make fetch && make init && make up` against `rpc.gno.land` / halt @ 704052 produces a 192 MB hardfork genesis that replays with 0 / 2715 tx failures and boots a live `gnoland-1` node with `r/sys/params` carrying the patched source.
Dependencies:
AI disclosure
Developed with assistance from Claude Code.