
docs(test13): hardfork RC series, master-based (review-only against master) #5597

Closed
aeddi wants to merge 174 commits into gnolang:master from aeddi:chain/test13-rc5-master


@aeddi aeddi commented Apr 26, 2026

Not meant to be merged. This PR is opened against master purely so reviewers can use GitHub's compare view — it gives a single click-through diff for every rc-master branch in the series and an easy "Files changed" tab. The actual work lands by cherry-picking from the rc branches into their respective upstream PRs (or via a coordinated hardfork from master directly). Don't press the green button.

A stacked branch series that carries the gnoland1 → test-13 hardfork work, rebased onto master instead of chain/gnoland1. Each rc-master is a strict superset of the previous one — branch it off, add a focused delta, keep the older rcs intact so bisecting and reviewing never lose history. Open this PR against the highest rc-master (chain/test13-rc5-master) unless you're reviewing one specific layer in isolation.

This stack is the master-based companion to PR #5589 (the chain/gnoland1-based variant). Both produce a bootable test-13 genesis from the same gnoland1 history; this one inherits everything master has merged since chain/gnoland1 was cut — notably the gas-storage refactor (#5415), the gas-model recalibration (#5291), type-persistence dedup (#5544), bptree (#5475), account sessions (#5307), and PR #5485 (valset-via-params v3) which has now landed on master — at the cost of one extra layer of launch-readiness fixes that the older base didn't need.

A design/rationale writeup for what's specific to the master-basing lives in the ADR: gno.land/adr/pr5597_test13_hardfork_launch_master_based.md. The full launch playbook — migration sequence 01 → 07, replay posture, intentional divergences, launch verification — is mostly identical to #5589's ADR; the new one only documents what's different.

Status (post #5485 integration): validated end-to-end — make smoketest clean (0 failures, was 44 before the storage-deposit fix), 2-node cluster boots and advances past the genesis block in lockstep with peers connected, make assert-migrations 9 of 9 ok, make verify-reproducibility byte-identical SHA across two clean builds, keyless validator mode (gnokms-style: VALIDATOR_ADDRESS + VALIDATOR_PUBKEY) produces a bootable genesis without a priv_validator_key.json on disk. See ADR §"Status" for the full validation matrix.

Branch graph

Each row links to the branch tree and shows the cumulative diff against the immediate predecessor (what this rc-master adds).

master

chain/test13-base-master (diff vs master) — hf-glue testbed (PR #5486) merged with master + the two test13-specific commits that aren't already on master

chain/test13-rc1-master (diff vs chain/test13-base-master) — tier-1 audits (migrations, tx export, state, balances)

chain/test13-rc2-master (diff vs chain/test13-rc1-master) — tier-2 resilience (reproducibility, nil-mpkg, validator ops, halt)

chain/test13-rc3-master (diff vs chain/test13-rc2-master) — tier-3 audits (realm imports, gas modes, repro-doc fix)

chain/test13-rc4-master (diff vs chain/test13-rc3-master) — test13-specific deploy plumbing for r/sys/validators/v3 (the realm itself now lives on master via merged #5485)

chain/test13-rc5-master (diff vs chain/test13-rc4-master) — master-specific launch-readiness fixes + #5485-integration adapters + this stack's ADR

What changed vs PR #5589's rc series

chain/test13-base-master

Delta vs master: branches from master, merges in the PR #5486 hf-glue testbed branch (rather than re-importing it as a squashed diff like #5589's chain/test13-base did), and adds the two test13-specific commits that weren't already either on master or on #5486:

  • a7e897e58 feat(hf-glue): add chunked tx-archive fetch script for flaky RPCs
  • 4ddbcb0c4 feat(deployments/gnoland-1): govDAO T1 rotation migration + repair valset-reset

Plus a single conflict-resolution merge commit (4f92182bd) for the merge of master with #5486 (three files, all overlapping additions kept from both sides: tm2/pkg/sdk/auth/keeper.go, tm2/pkg/sdk/auth/keeper_test.go, tm2/pkg/sdk/baseapp.go), and a merge for the post-#5485 master (9b19bf353), with four conflicts in gno.land/pkg/gnoland/{app.go,app_test.go,mock_test.go,node_params.go} resolved as union merges of the hardfork-replay machinery and the new params-keeper-driven valset architecture.

The rest of the diff (~200 commits) is what's on master since chain/gnoland1 was cut and what's on #5486 since its base.

chain/test13-rc1-master

Delta vs chain/test13-base-master: same four tier-1 audit/assertion tools as #5589's chain/test13-rc4. None of the replay failures panic the chain (absorbed by --skip-failing-genesis-txs), but silent divergence could ruin a launch. These close that hole:

Delta vs master (cumulative)

  • 22a42fc7a feat(hf-glue): add assert-migrations script verifying post-replay state — one positive check per migration step's intended effect
  • fc2c4a030 feat(hf-glue): add verify-txs-jsonl integrity check vs source-chain RPC — asserts total_txs + spot-checks random heights
  • 5d82ddea2 feat(hf-glue): add state-diff tool comparing replay vs source-chain realms — renders each realm on both sides + diffs
  • 395830bba feat(hf-glue): add audit-balances diffing per-signer ugnot source vs replay — surfaces the accounts drained by post-mainnet storage-deposit semantics
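
The per-signer comparison that audit-balances performs can be pictured as a map diff. The following is only a sketch under assumptions: the real tool's input format is not shown here, and `diffBalances` and the placeholder addresses are hypothetical names used for illustration.

```go
package main

import "fmt"

// diffBalances returns replay-minus-source deltas for every address whose
// balance differs between the two snapshots (amounts in ugnot).
func diffBalances(source, replay map[string]int64) map[string]int64 {
	deltas := make(map[string]int64)
	for addr, src := range source {
		if d := replay[addr] - src; d != 0 {
			deltas[addr] = d
		}
	}
	// Addresses that only exist on the replay side.
	for addr, rep := range replay {
		if _, ok := source[addr]; !ok && rep != 0 {
			deltas[addr] = rep
		}
	}
	return deltas
}

func main() {
	// Placeholder addresses and amounts, for illustration only.
	source := map[string]int64{"g1alice": 1_000_000, "g1bob": 500_000}
	replay := map[string]int64{"g1alice": 1_000_000, "g1bob": 120_000}
	fmt.Println(diffBalances(source, replay))
}
```

A negative delta marks an account that ends up with less on the replay side, which is the case the storage-deposit semantics are expected to produce.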

chain/test13-rc2-master

Delta vs chain/test13-rc1-master: tier-2 resilience work from #5589's chain/test13-rc5. Proves the genesis rebuilds deterministically, teaches gno to survive mid-write process kills without crash-looping, adds every remaining validator-ops primitive beyond add, and patches one missing corner in the rc1 assertion script.

Delta vs master (cumulative)

  • 6b0783b35 feat(hf-glue): add verify-reproducibility building genesis twice, asserting SHA match — local proxy for the cross-machine attestation validators will run at launch
  • 24e7d7f94 fix(gnovm): survive partial mempackage writes on restart — defensive nil-skip in PreprocessAllFilesAndSaveBlockNodes so a half-persisted store boots instead of crash-looping (rc5-master adds the producer-side body-first ordering that complements this)
  • dc68c9bdd feat(deployments/test13.gno.land): add rm/change-power/batch govDAO scripts — full valset-ops surface against v3
  • 4641a41a8 fix(hf-glue): accept empty string as "no pending update" in assert-migrations
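
The reproducibility check in 6b0783b35 comes down to hashing two independently built genesis files and comparing the digests. A minimal sketch of that comparison, assuming SHA-256 (the description only says "SHA") and in-memory inputs; `digest` and `sameBuild` are hypothetical helper names, not part of the script:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest returns the hex-encoded SHA-256 of a build artifact.
// SHA-256 is an assumption; the source only says "SHA".
func digest(artifact []byte) string {
	sum := sha256.Sum256(artifact)
	return hex.EncodeToString(sum[:])
}

// sameBuild reports whether two builds produced byte-identical output,
// judged by comparing their digests.
func sameBuild(first, second []byte) bool {
	return digest(first) == digest(second)
}

func main() {
	// Illustrative payloads standing in for two genesis.json builds.
	a := []byte(`{"chain_id":"test-13"}`)
	b := []byte(`{"chain_id":"test-13"}`)
	fmt.Println(sameBuild(a, b))
}
```

Comparing digests rather than raw bytes mirrors what a cross-machine attestation would exchange: each validator only needs to publish a short hash.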

chain/test13-rc3-master

Delta vs chain/test13-rc2-master: tier-3 audits and a documentation fix for a debugging pitfall, plus the cherry-pick of #5589's launch ADR for shared context.

Delta vs master (cumulative)

  • a968c59e4 feat(hf-glue): add audit-realm-imports flagging dangling imports post-fork — scans every addpkg tx (historical + genesis-mode) for imports that no longer resolve against the current stdlib + examples tree
  • 4ce8766fa feat(hf-glue): add compare-gas-modes A/B-testing strict vs source replay
  • 999a9fcdf docs(hf-glue): clarify verify-reproducibility assumes a clean OUT dir
  • a2dccbe2a docs(test13): add gnoland1 hardfork launch ADR (pr5589_test13_hardfork_launch.md)

chain/test13-rc4-master

Delta vs chain/test13-rc3-master: test13-specific deploy + migration plumbing for r/sys/validators/v3. The realm itself now lives on master via merged #5485, so what remains here is what mainnet doesn't have yet — the addpkg of v3 wrapped in a sysnames-permission window, plus the operator-facing govDAO script.

When this layer was first proposed (earlier revisions of this PR) it carried the cherry-pick of #5485 itself plus a vm:p:valset_realm_path migration step plus an eager-eval bug fix; all three dropped out cleanly when #5485 landed on master — the realm is upstream, its NewProposalRequest already evaluates eagerly, and the configurable realm-path field was replaced with a hardcoded constant in r/sys/params/valset.gno (so the migration step that wrote the param became a no-op against a removed field).

Delta vs master (cumulative)

  • 1444441f7 feat(deployments/gnoland-1): addpkg r/sys/validators/v3 in post-fork migration — wires migration step 06 against the upstream-merged v3 source under examples/gno.land/r/sys/validators/v3
  • ff79b0f64 fix(deployments/gnoland-1): wrap v3 addpkg with sysnames-check disable/restore — gnoland1's r/sys/names.enabled = true at halt rejects any addpkg under the sys namespace; migration steps 05/07 govDAO-flip vm:p:sysnames_pkgpath to "" and back around step 06 to permit one-shot deployment without permanently weakening the namespace check
  • 985295c1d feat(deployments/test13.gno.land): add add-validator.sh targeting r/sys/validators/v3 — operator script to grow the post-fork valset via govDAO
  • 9c81af5ae fix(test13.gno.land/add-validator.sh): use renamed v3 NewProposalRequest API — adapts to master's valr.NewProposalRequest(fn, title, description) signature (the cherry-pick used the older NewValsetChangeExecutor + dao.NewProposalRequest two-call pattern)
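
The disable/restore wrap around the v3 addpkg follows a save, clear, act, restore pattern. The sketch below collapses the three migration transactions into one function purely for illustration; on chain they are separate govDAO-driven steps. The `params` type, `withCleared`, and the stored placeholder value are assumptions; only the key name `vm:p:sysnames_pkgpath` comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// params is a toy stand-in for the on-chain params store.
type params map[string]string

// withCleared empties the given keys, runs deploy, then restores the saved
// values whether or not deploy succeeded.
func withCleared(p params, keys []string, deploy func() error) error {
	saved := make(map[string]string, len(keys))
	for _, k := range keys {
		saved[k] = p[k]
		p[k] = "" // disable the check (the role of step 05)
	}
	err := deploy() // the addpkg itself (the role of step 06)
	for _, k := range keys {
		p[k] = saved[k] // restore the check (the role of step 07)
	}
	return err
}

func main() {
	const key = "vm:p:sysnames_pkgpath"
	// The stored value is a placeholder, not the real on-chain value.
	p := params{key: "placeholder/names/realm"}
	err := withCleared(p, []string{key}, func() error {
		if p[key] != "" {
			return errors.New("namespace check still enabled")
		}
		return nil
	})
	fmt.Println(err, p[key])
}
```

Restoring unconditionally is the point of the pattern: a failed deploy must not leave the namespace check disabled.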

chain/test13-rc5-master

Delta vs chain/test13-rc4-master: master-specific launch-readiness fixes + the #5485-integration adapter commits surfaced by the master rebase. The chain/gnoland1 base predates the gas-storage refactor (#5415) and the partial-mpkg write-amplification pattern that rc5-master closes — none of these are needed in #5589's stack but they are required for the master stack to boot end-to-end without manual JSON patching. See ADR §1 for the full context.

Delta vs master (cumulative)

Genesis assembly fixes:

  • d4a163644 fix(gnogenesis): default gas-storage params and gas_replay_mode in hardfork genesis - buildHardforkGenesis populates the seven post-#5415 vm.params fields from vm.DefaultParams() when the source has them all at zero (the pre-refactor signature) and sets gas_replay_mode = "source" when unset. Operator overrides preserved. 4 unit tests.
  • 311b3971c feat(gnogenesis): add --skip-failing-genesis-txs and --skip-genesis-sig-verification flags to fork test - make smoketest now matches what production validators actually run
  • ffe5adee9 fix(gnogenesis): top up patched-addpkg creator balance for storage deposit - patchGenesisModeAddPkg rewrites a genesis-mode addpkg in place; when the new files are larger than the original (master's r/sys/params grew by valset.gno from #5485), the original creator's balance (sized for the smaller original) falls short of realm_bytes × StoragePrice and the deploy fails with "insufficient coins", silently cascading to every realm that depends on r/sys/params. The fix tops up the creator's genesis balance to a conservative upper bound.
  • c3c1800d1 fix(gnogenesis): read storage_price from genesis params + add top-up regression tests - replaces the previous commit's hardcoded 100 ugnot/byte with a lookup of vm.Params.StoragePrice, with the same fallback for gnoland1's pre-#5415 schema. 6 sub-tests covering all the edge cases.
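
The shortfall behind the "insufficient coins" failure is plain arithmetic: the deposit is realm_bytes × StoragePrice, and the creator's balance must cover it. A minimal sketch with hypothetical numbers follows. It computes the exact shortfall, whereas the actual fix tops up to a conservative upper bound; `requiredDeposit` and `topUp` are illustrative names, and gas fees are not modelled.

```go
package main

import "fmt"

// requiredDeposit is the storage deposit for a package:
// size in bytes times the price per byte (both in ugnot terms).
func requiredDeposit(realmBytes, pricePerByte int64) int64 {
	return realmBytes * pricePerByte
}

// topUp returns how many ugnot must be added to balance so that it covers
// the deposit, or 0 if the balance is already sufficient.
func topUp(balance, realmBytes, pricePerByte int64) int64 {
	shortfall := requiredDeposit(realmBytes, pricePerByte) - balance
	if shortfall < 0 {
		return 0
	}
	return shortfall
}

func main() {
	// Hypothetical: a balance sized for a 1,000-byte original, a patched
	// package of 1,200 bytes, and the 100 ugnot/byte price cited above.
	fmt.Println(topUp(100_000, 1_200, 100))
}
```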

Migration / replay fixes:

  • 6c4fbdb89 fix(deployments/gnoland-1): harden 01_reset_valset against v2 state drift — proposal callback consults live valr.IsValidator(addr) at execution time and only emits removals for validators actually present; one missing entry no longer panics the whole batch
  • 1bab5ca59 fix(gnovm/store): body-first AddMemPackage ordering + skip-don't-panic in IterMemPackage — write order is iavlStore body → baseStore index → counter, so a SIGKILL between any two writes is recoverable; IterMemPackage yields nil on inconsistency rather than panicking, complementing the consumer-side skip from rc2-master. Side effect: replay walltime drops from ~12 min to ~36 s. 3 unit tests.
  • 272dc5733 fix(deployments/gnoland-1): use gno-builtin address type in 01_reset_valset (no std import) — matches the pattern used by the sibling 04_withdraw_* template; importing "std" for std.ParseAddress doesn't type-check in genesis-mode MsgRun context
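
The body-first ordering from 1bab5ca59 can be illustrated with a toy store. This is a sketch under assumptions: `pkgStore` and its fields are hypothetical stand-ins for the iavlStore body, the baseStore index, and the counter, and the iterator here drops inconsistent entries where the real IterMemPackage is described as yielding nil.

```go
package main

import "fmt"

// pkgStore is a toy model of the three writes an add performs.
type pkgStore struct {
	bodies  map[string]string // stands in for the iavlStore body
	index   []string          // stands in for the baseStore index
	counter int
}

func newPkgStore() *pkgStore {
	return &pkgStore{bodies: make(map[string]string)}
}

// add writes body, then index, then counter. completedWrites simulates a
// process kill: only that many of the three writes take effect.
func (s *pkgStore) add(path, body string, completedWrites int) {
	if completedWrites >= 1 {
		s.bodies[path] = body
	}
	if completedWrites >= 2 {
		s.index = append(s.index, path)
	}
	if completedWrites >= 3 {
		s.counter++
	}
}

// iterate returns the bodies of indexed packages and skips any index entry
// that has no body instead of panicking.
func (s *pkgStore) iterate() []string {
	var out []string
	for _, path := range s.index {
		body, ok := s.bodies[path]
		if !ok {
			continue
		}
		out = append(out, body)
	}
	return out
}

func main() {
	s := newPkgStore()
	s.add("gno.land/p/demo/a", "package a", 3)
	s.add("gno.land/p/demo/b", "package b", 1) // killed after the body write
	fmt.Println(len(s.iterate()), s.counter)
}
```

Because the body lands first, an interrupted add can leave an orphaned body but never an index entry that points at nothing; the skip in `iterate` covers stores that were written before the ordering changed.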

#5485-integration adapters (master-rebase fallout):

  • 69a474762 fix(test13): adapt scripts/ADR to renamed v3 API and removed migration 08 — updates rm-validator.sh, change-power.sh, batch-change.sh, assert-migrations.sh, and the rc5-master ADR to reference valr.NewProposalRequest(fn, title, desc) and node:valset:dirty (the old v3 cherry-pick used NewValsetChangeExecutor + vm:gno.land/r/sys/validators/v3:new_updates_available)
  • 37a93e295 fix(test13): adapt build.sh/init-node.sh/Makefile to master gnokey/gnoland CLI changes — master flipped gnokey maketx --broadcast default to true, requires --insecure-password-stdin even for the unencrypted ephemeral key, and gnoland config init no longer overwrites without -force. Without these, make migrate and make init both abort.
  • cd5b167fc fix(hf-glue): genesis target alias + state-diff normaliser for master stack traces — adds genesis: migrate alias for verify-reproducibility.sh, and 14 normaliser rules covering master's /gnoroot/, /usr/local/go/src/, /root/.cache/go-build/ paths plus :line numbers in panic traces (otherwise state-diff reports cosmetic divergences as semantic ones)
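
The state-diff normaliser idea can be sketched as an ordered list of regexp rewrites applied to both sides before diffing. The four rules below are illustrative guesses built from the path prefixes named above; they are not the commit's actual 14 rules, and the placeholder tokens are invented for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// rules is an illustrative subset; the real rule set is not listed in the
// description above.
var rules = []struct {
	re   *regexp.Regexp
	repl string
}{
	{regexp.MustCompile(`/gnoroot/`), "<gnoroot>/"},
	{regexp.MustCompile(`/usr/local/go/src/`), "<goroot>/"},
	{regexp.MustCompile(`/root/\.cache/go-build/`), "<gocache>/"},
	// Replace ":<digits>" after a .go or .gno file name.
	{regexp.MustCompile(`\.(go|gno):\d+`), ".${1}:<line>"},
}

// normalise applies every rule in order so that traces differing only in
// build paths or line numbers compare equal.
func normalise(s string) string {
	for _, r := range rules {
		s = r.re.ReplaceAllString(s, r.repl)
	}
	return s
}

func main() {
	fmt.Println(normalise("/gnoroot/gnovm/pkg/x.go:42"))
}
```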

Source-chain compatibility (live test against gno.land):

Operator surface:

  • 81ad16295 fix(deployments/gnoland-1): also flip syscla_pkgpath in 05/07 disable/restore wrap — at higher halt heights, manfred's CLA signature for the then-current r/sys/cla hash may no longer be valid, so the v3 addpkg by manfred fails on checkCLASignature. Migration step 05 now also clears vm:p:syscla_pkgpath; step 07 restores it
  • 5cefecea5 feat(hf-glue): keyless validator mode via VALIDATOR_ADDRESS / VALIDATOR_PUBKEY for remote-signer setups — production validators run gnokms-backed Secp256k1 keys (no priv_validator_key.json on disk); keyless mode lets make migrate build the migrations.jsonl (valset reset + T1 rotation) using --address and --pubkey flags directly, with cross-check that the pubkey derives to the supplied address

Documentation:


How to review

  • If you want the conceptual launch playbook (migration sequence, replay posture, divergences), read #5589's ADR. Most of it applies to this stack unchanged; the master-specific deltas (migration 01 → 07 instead of 01 → 08, smoketest baseline, etc.) are in the master-based ADR.
  • For the launch fixes themselves, rc5-master is the only delta that doesn't have a chain/gnoland1-stack analog. The other rcs are direct rebases.
  • For the v3 valset deploy plumbing, rc4-master is one self-contained layer (#5589 splits it across rc2 + rc3, and the realm cherry-pick that was here in earlier revisions is now upstream).
  • For the audit/tooling surface that would gate a production launch, see rc1-master + rc2-master + rc3-master: same content as #5589's rc4 + rc5 + rc6.

moul and others added 30 commits March 16, 2026 21:20
Signed-off-by: moul <94029+moul@users.noreply.github.com>
Co-authored-by: moul <94029+moul@users.noreply.github.com>
Co-authored-by: aeddi <antoine.e.b@gmail.com>
Co-authored-by: Antoine Eddi <5222525+aeddi@users.noreply.github.com>
Co-authored-by: Morgan <git@howl.moe>
Co-authored-by: Morgan Bazalgette <morgan@morganbaz.com>
Enables GovDAO to propose a coordinated chain halt at a specific block
height without requiring every operator to pass a CLI flag. This is the
governance-driven counterpart to the --halt-height CLI flag.

Changes:
- Add `NewSetHaltHeightRequest(height int64)` to `r/sys/params` realm,
  allowing GovDAO to vote on halting the chain at a target block.
- Add `nodeParamsKeeper` to validate `node:p:halt_height` params.
- Register the "node" module in the params keeper so halt_height can
  be set via governance proposals.
- Extend `EndBlocker` to read `node:p:halt_height` from the params
  store and call `osm.Kill()` when the halt height is reached.

Usage:
  // Create and submit a GovDAO proposal to halt at block 100000
  pr := params.NewSetHaltHeightRequest(100_000)
  id := dao.MustCreateProposal(cross, pr)

  // After approval and execution, all nodes will halt at block 100000

Generated with [Claude Code](https://claude.com/claude-code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
Extends the GovDAO halt proposal with a mandatory minimum binary version
field. When set, nodes refuse to restart unless their version satisfies
the requirement, preventing an old binary from accidentally resuming a
chain that was halted for an upgrade.

- `NewSetHaltRequest(height, minVersion)` sets both `node:p:halt_height`
  and `node:p:halt_min_version` atomically in one GovDAO proposal.
- `checkNodeStartupParams` runs at node startup (after state is loaded)
  and compares `version.Version` against the stored `halt_min_version`.
- `meetsMinVersion` / `parseGnolandVersion` handle the "chain/gnolandX.Y"
  version format used for gno.land chain releases, with a string-equality
  fallback for other formats.

Example: setting minVersion="chain/gnoland1.1" will allow 1.1 and newer
to start, but reject 1.0 ("develop" also rejected unless it matches).
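
A rough sketch of the comparison described above, assuming the "chain/gnolandX.Y" format with integer X and Y and a string-equality fallback for anything else. `parseChainVersion` and `meetsMin` are hypothetical stand-ins, not the commit's `parseGnolandVersion` / `meetsMinVersion`, and treating an empty minimum as "no requirement" is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseChainVersion extracts X and Y from "chain/gnolandX.Y".
// ok is false for any other format.
func parseChainVersion(v string) (int, int, bool) {
	rest, found := strings.CutPrefix(v, "chain/gnoland")
	if !found {
		return 0, 0, false
	}
	majStr, minStr, found := strings.Cut(rest, ".")
	if !found {
		return 0, 0, false
	}
	major, err := strconv.Atoi(majStr)
	if err != nil {
		return 0, 0, false
	}
	minor, err := strconv.Atoi(minStr)
	if err != nil {
		return 0, 0, false
	}
	return major, minor, true
}

// meetsMin reports whether current satisfies minimum. When either side is
// not in the chain/gnolandX.Y format it falls back to string equality.
func meetsMin(current, minimum string) bool {
	if minimum == "" {
		return true // assumption: an unset minimum imposes no requirement
	}
	cMaj, cMin, cOK := parseChainVersion(current)
	mMaj, mMin, mOK := parseChainVersion(minimum)
	if !cOK || !mOK {
		return current == minimum
	}
	if cMaj != mMaj {
		return cMaj > mMaj
	}
	return cMin >= mMin
}

func main() {
	for _, v := range []string{"chain/gnoland1.0", "chain/gnoland1.1", "develop"} {
		fmt.Println(v, meetsMin(v, "chain/gnoland1.1"))
	}
}
```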

Generated with [Claude Code](https://claude.com/claude-code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
…g#5400)

TypeCheckMemPackage only writes a package to permCache when it is
reached as a dependency via ImportFrom (canPerm=true). The root package
of each call is never self-stored. This left 22 "leaf" stdlibs (packages
not imported by any other stdlib, e.g. time, regexp, math/rand) absent
from vm.typeCheckCache on every node startup.

On a production cold-start node (LoadStdlib, CacheStdlibLoad=false), the
cache was entirely empty — every stdlib import in a user tx required a
GetMemPackage store read (8 gas/byte). On a restarted node (Initialize),
the 22 leaf stdlibs were still missing. This caused non-deterministic
gas consumption: nodes that had restarted disagreed with genesis-fresh
nodes on tx gas, triggering a consensus halt on gnoland1 at block
352922.

Fix: capture the *types.Package return value from each
TypeCheckMemPackage call in the init loop and store it directly into
opts.Cache. Applied to all three initialization paths: Initialize,
LoadStdlibCached, and LoadStdlib.

The LoadStdlib change additionally routes the cache to vm.typeCheckCache
directly (instead of the per-tx context clone) so the results survive
beyond the initialization transaction.

Verified by:
- TestTypeCheckCacheContainsAllStdlibs: asserts all InitOrder() stdlibs
are present in vm.typeCheckCache after both cold and warm
initialization.
- TestAddPkgGasWithTypeCheckCache: asserts identical gas for a
strconv-importing addpkg regardless of typeCheckCache state (was 7M cold
vs 2.1M warm before).
- addpkg_stdlib_typecheckcache.txtar: deploys a time-importing package
with gas_wanted=2700000; succeeds at ~2.3M with fix, OOGs at ~3.2M
without.

This is a hotfix, hence it is on chain/gnoland1 as the base. I fear this
may cause different gas results in the chain, so we still need to figure
out:

1. A migration strategy for the existing nodes (to re-run block 352922)
2. The impact this has on validators joining the network afterwards. I
feel like this PR changes the gas values of all of the transactions,
including genesis transactions, so we need to understand whether nodes
would still validate transactions correctly with lower gas values, or
whether those transactions are no longer valid, which would require a
chain restart that re-runs the transactions.
…nolang#5409)

- Add `gnoland version` subcommand mirroring `gno version` and `gnokey
version`
- Add `BuildVersion`/`build_version` field to `ResultStatus` (RPC
/status endpoint), populated from `tm2/pkg/version.Version`
- Inject version via ldflags in Dockerfile, computed from git at build
time; all build stages now read from a shared build_version file written
in setup-gnocore

goreleaser already has injection of the version, so no changes needed
there.

---------

Co-authored-by: moul <94029+moul@users.noreply.github.com>
…ng#5410)

## Summary

Adds `contribs/gnobr` — a block rollback tool for gnoland validators. It
trims the blockstore to a target height, patches the app hash in
state.db, and wipes app state so gnoland replays all blocks locally on
restart. No network access or special binary patches needed.

### Usage

```bash
# Build from the gno repo
cd contribs/gnobr && go build -o gnobr .

# Stop your node, then run:
gnobr --data-dir gnoland-data --drop-after 352921 \
  --app-hash 14BD8BB9FAD9869B86F1BFFD1A16DD3A02C3534323F6E15121025BE5DFDC9C51

# Restart your node — it replays blocks 1..352921 locally from its own blockstore.
```

### What it does

1. **Trims blockstore.db** — removes all blocks after the target height
2. **Patches state.db** — updates the AppHash to the correct value (via
`--app-hash`) so the Handshaker doesn't panic on mismatch
3. **Wipes gnolang.db** — forces the app to replay from genesis
4. **Wipes WAL** — removes stale write-ahead log
5. **Resets priv_validator_state.json** — prevents double-signing

On restart, gnoland's Handshaker sees `appHeight=0, storeHeight=N,
stateHeight=N`, runs InitChain, then replays all N blocks from the local
blockstore. Zero network access needed.

### Flags

| Flag | Description |
|---|---|
| `--data-dir` | Path to gnoland data directory (default: `gnoland-data`) |
| `--drop-after` | Keep blocks up to this height, drop everything after |
| `--app-hash` | Hex-encoded app hash to write into state.db |
| `--dry-run` | Show what would be done without modifying anything |

### Why

During the gnoland1 chain halt at height 352922, validators committed a
block with a divergent app hash. The `chain/gnoland1.1` tag fixes the
root cause, but validators who committed the bad block can't just update
the binary — state.db contains the wrong app hash, causing a panic on
replay. This tool patches it cleanly.

### Tested

Successfully tested on gnoland1 (val1.moul.p2p.team):
- Restored from backup, ran gnobr, restarted with clean
`chain/gnoland1.1` binary (no patches)
- Node replayed all 352921 blocks locally, reached correct app hash
`14BD8BB9...`

<details>
<summary>Contributors' checklist</summary>

- [x] Added new tests, or not needed, or not feasible
- [x] Provided an example (e.g. screenshot) to aid review or the PR is
self-explanatory
- [x] Updated the official documentation or not needed
- [x] No breaking changes were made, or a `BREAKING CHANGE: xxx` message
was included in the description
- [x] Added `benchmarks` label to the PR or not needed
</details>
Aligns with gnolang#5334's approach: GovDAO EndBlocker now sets the halt height
on BaseApp, which panics in BeginBlock of the next block. This is
deterministic (no async signals) and ensures the halted block is fully
committed.
…es (gnolang#5334)

## Summary

Adds a halt height mechanism for coordinated chain upgrades. The node
stops after committing the specified block height.

### How to set it

```bash
gnoland config set halt_height 352922
```

Or edit `config.toml` directly:
```toml
halt_height = 352922
```

### How it works

1. After `finalizeCommit`, consensus checks if `height >= halt_height`
2. If so, calls `osm.Kill()` for a graceful shutdown
3. The check is at the consensus level (not ABCI), following the same
pattern as `WithEarlyStart`
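
The check in step 1 can be sketched as a single predicate evaluated after each commit. This is an illustration only: `shouldHalt` is a hypothetical name, and treating a non-positive `halt_height` as "disabled" is an assumption not stated above.

```go
package main

import "fmt"

// shouldHalt reports whether the node should stop after committing
// committedHeight.
func shouldHalt(committedHeight, haltHeight int64) bool {
	// Assumption: haltHeight <= 0 means no halt is configured.
	return haltHeight > 0 && committedHeight >= haltHeight
}

func main() {
	const haltHeight = 352922
	for h := int64(352920); h <= 352923; h++ {
		fmt.Println(h, shouldHalt(h, haltHeight))
	}
}
```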

### Scope and future direction

This is a **temporary coordination tool** for the current chain upgrade.
For the gnoland1 → gnoland-1 hard fork, validators set `halt_height` in
their config, all nodes stop at the same block, then validators swap
binary + config and restart.

After the upgrade, the proper mechanism will be **GovDAO-based halting**
(gnolang#5368), which adds:
- On-chain `halt_height` param set via governance proposal (no manual
config needed)
- `halt_min_version` — prevents old binaries from restarting after halt
- Version guard at startup so validators can't accidentally run the
wrong binary

Once gnolang#5368 is merged and active, `halt_height` in config becomes a
**node operator tool** (e.g., "stop my node at height X for
maintenance") rather than a coordination mechanism. Coordination should
happen through governance.

### No CLI flag — config only

Per @tbruyelle's suggestion, there's no `--halt-height` CLI flag. Config
file is the single source of truth. This avoids the risk of validators
missing the flag in duplicated command setups across their
infrastructure.

### Related

- gnolang#5368 — GovDAO-based halt height + version guard (Phase 2, replaces
this for coordination)
- gnolang#5376 — gnoland-1 chain config
- gnolang#5411 — chain upgrade genesis replay

…ght config

Addresses tbruyelle's review feedback:
1. Panic if new binary runs before the chain has halted at halt_height
2. Add skip_upgrade_height config field to bypass the check when the
   validator has already migrated state
Prepares the repository for the gnoland1 → gnoland-1 hard fork:

- Add misc/deployments/gnoland-1/ with:
  - migrate-from-gnoland1.sh: placeholder with a detailed TODO covering
    halt verification, state export, migration transforms (r/sys/params,
    r/gnops/valopers, namereg, gas params), genesis assembly, verification,
    and restart coordination. Exits with an error until implemented.
  - config.toml: copy of gnoland1 config with meter_name=gnoland-1 and
    peer/seed addresses reset (to be filled post-fork).
  - govdao-scripts/: copies of gnoland1 scripts with CHAIN_ID=gnoland-1.
  - README.md: upgrade workflow, what changed, and ⚠️ migration TODO warning.

- Update docs:
  - docs/resources/gnoland-networks.md: Betanet chain ID gnoland1 → gnoland-1
  - docs/resources/gas-fees.md: update --chainid example
  - docs/users/explore-with-gnoweb.md: update Betanet chain ID reference

The migration script is the critical missing piece — the hard fork cannot
happen until it is written and dry-run on test12.

Generated with [Claude Code](https://claude.com/claude-code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
- Revert premature doc references to gnoland-1 chain ID in gas-fees.md
  and explore-with-gnoweb.md (hardfork hasn't happened yet)
- Remove premature "Note" callout from gnoland-networks.md
- Update migrate-from-gnoland1.sh: reflect Scenario A decision (genesis
  tx-replay with InitialHeight), document blockers (gnolang#5411, gnolang#5390,
  Jae's InitialHeight tm2 work), reference issue gnolang#5374 for tracking
- Update gnoland-1/README.md: reflect correct PR merge status, document
  Scenario A approach, list migration blockers explicitly
- Add ChainID field to GnoTxMetadata for tx provenance recording
- Add InitialHeight validation (non-negative) to GenesisDoc.Validate and ValidateAndComplete
- Add test cases: no chain ID override when BlockHeight=0, no override when OriginalChainID unset
- Update ADR: document per-tx vs state-level design choice, mark InitialHeight as implemented end-to-end
…d-1 README

PR gnolang#5373 (valoper fee script) was closed without merging. The valoper
registration fee was already set to 0 via a GovDAO transaction on gnoland1,
so no code change is needed — the state is preserved in genesis replay.
- Fix comment headers: 'gnoland1' → 'gnoland-1' in add-validator.sh and rm-validator.sh
- Fix stale REMOTE default comment: 127.0.0.1:26657 → betanet endpoint
aeddi added 23 commits April 30, 2026 22:02
…aster

# Conflicts:
#	gno.land/pkg/gnoland/app.go
#	gno.land/pkg/gnoland/app_test.go
#	gno.land/pkg/gnoland/mock_test.go
#	gno.land/pkg/gnoland/node_params.go
…e/restore

Direct addpkg of r/sys/validators/v3 fails at the namespace-permission
check because gnoland1 has r/sys/names.enabled=true at halt height and no
address matches the 'sys' namespace. Splits the v3 deploy into three txs
(govDAO proposals that empty/restore vm:p:sysnames_pkgpath around the
addpkg), so the realm ends up on-chain without permanently weakening
namespace authz.
@aeddi force-pushed the chain/test13-rc5-master branch from 1154d7d to 8a37f4e on April 30, 2026 23:04
jaekwon pushed a commit that referenced this pull request May 7, 2026
## Overview

Chain hardfork mechanism for gno.land: export all state and historical
transactions from the source chain, replay them during `InitChain` on
the new chain, and start producing blocks at the halted height. Replaces
the original single-`OriginalChainID` design from
[#5411](#5411) with a more flexible
multi-chain model (`PastChainIDs` allowlist + per-tx `ChainID`).

**History:**
- Original work: [#5411](#5411)
- Jae's refinements:
[feat/genesis-replay-upgrade2](https://github.com/gnolang/gno/tree/feat/genesis-replay-upgrade2)
- This PR: builds on top of Jae's work, adds fixes from extensive review
+ end-to-end validation on the full gnoland1 chain via the [hf-glue
testbed](#5486)

## What's in

### tm2 (consensus + SDK)
- **`GenesisDoc.InitialHeight`**: consensus starts block production at
this height after `InitChain`; `Handshaker` sets
`state.LastBlockHeight = InitialHeight - 1`.
- **`BlockchainReactor`, `state`, `store`, validation**: all updated to
handle chains where `InitialHeight > 1` (empty block store,
non-contiguous block save, validator set / consensus params persisted at
InitialHeight, etc.)
- **`BaseApp.lastBlockHeight` tracker** (this iteration): real chain
height = `multistoreVersion + initialHeightOffset`, with the offset
persisted under `mainInitialHeightKey` and restored on every restart.
`validateHeight` now enforces strict contiguity against real chain
height; the previous "allow monotonic jump" branch (which permanently
bypassed contiguity for `InitialHeight > 1` chains) is gone.
- **`BaseApp.Info` guard**: handle calls before the multistore is loaded.
- **`auth.SkipGasMeteringKey`**: context flag that lets `SetGasMeter`
bypass the new VM's gas meter (used for `GasReplayMode="source"`).
- **`RequestInitChain.InitialHeight`**: new ABCI field so the app can
cross-check against `GnoGenesisState.InitialHeight`. Amino round-trip
test added.
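
The height arithmetic from the `BaseApp.lastBlockHeight` item can be sketched as follows. This is not the BaseApp code: `heightTracker` is a hypothetical type, and the sketch assumes the offset equals `InitialHeight - 1` with the multistore at version 0 before the first block, which is consistent with `state.LastBlockHeight = InitialHeight - 1` but not stated explicitly above.

```go
package main

import "fmt"

// heightTracker models real chain height = multistoreVersion + offset.
type heightTracker struct {
	initialHeightOffset int64 // assumed to be InitialHeight - 1
}

// lastBlockHeight maps a multistore version to the real chain height.
func (t heightTracker) lastBlockHeight(multistoreVersion int64) int64 {
	return multistoreVersion + t.initialHeightOffset
}

// validateHeight enforces strict contiguity: the requested block must be
// exactly one above the last real chain height.
func (t heightTracker) validateHeight(multistoreVersion, requested int64) error {
	want := t.lastBlockHeight(multistoreVersion) + 1
	if requested != want {
		return fmt.Errorf("invalid height %d: expected %d", requested, want)
	}
	return nil
}

func main() {
	const initialHeight = 352922 // hypothetical InitialHeight
	t := heightTracker{initialHeightOffset: initialHeight - 1}
	fmt.Println(t.lastBlockHeight(0))
	fmt.Println(t.validateHeight(0, initialHeight))
	fmt.Println(t.validateHeight(0, initialHeight+3))
}
```

With the offset applied on every call there is no need for a special "first block may jump" case, which is the branch the item above says was removed.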

### gno.land
- **`GnoGenesisState`** extensions:
  - `PastChainIDs []string`: allowlist of past chain IDs valid for
    signature verification
  - `InitialHeight int64`: cross-checked against `GenesisDoc.InitialHeight`
  - `GasReplayMode string`: `""`/`"strict"` (default, new VM's gas meter)
    or `"source"` (bypass gas meter, preserve source-chain outcomes)
- **`GnoTxMetadata`** extensions:
  - `BlockHeight int64`: original block height
  - `ChainID string`: originating chain ID
  - `Failed bool`: tx had non-zero return code on source chain (skipped
    during replay)
  - `SignerInfo []SignerAccountInfo`: per-signer account metadata (address,
    account number, pre-tx sequence) so signatures verify correctly even if
    earlier txs diverged
  - `GasUsed`, `GasWanted int64`: source-chain gas (populated by
    tx-archive, used by replay report)
- **`auth.NewAccountWithUncheckedNumber`** (this iteration, renamed from
`NewAccountWithNumber`): create accounts with a specific number,
bypassing the auto-increment counter. Doc comment now spells out the
precondition that the caller must enforce uniqueness; the rename forces
every call site to acknowledge it.
- **`validateSignerInfo` preflight** (this iteration): scans every
`SignerInfo` entry across all txs at the start of `loadAppState`.
Rejects the genesis if two different addresses claim the same account
number, or if a `SignerInfo` claims a number reserved by a balance-init
account at a different address. Defense-in-depth against a malformed
genesis silently corrupting state.
- **`InitChainerConfig.StrictReplay`** (this iteration): opt-in
fail-closed boot. Defaults to `false` for backwards compat. Hardfork
operators set it to `true` so any non-skipped tx replay failure aborts
`InitChain` instead of letting the chain boot in a corrupted state.
Skipped txs (`metadata.Failed = true`) do not count.
- **Genesis-mode tx sig verify with `PastChainIDs[0]`**: genesis-mode txs
(no metadata or `BlockHeight == 0`) use the first `PastChainIDs` entry
for signature verification when a hardfork is in progress (PR #5540). The
genesis-mode chain-ID branch is now gated on `metadata == nil` (this
iteration) so migration txs (`metadata != nil`, `BlockHeight == 0`,
`Timestamp != 0`) keep their metadata-driven `ctxFn` instead of being
silently overwritten.
- **`BaseApp.InitChain` error surfacing** (this iteration): when
`InitChainer` returns `ResponseInitChain.Error`, return cleanly instead
of falling through to the validators-count sanity check, which would
otherwise panic with a misleading `"validators count mismatch"` and mask
the real cause.
- **Replay report**: per-tx categorization (`ok` / `ok_gas_differs` /
`failed` / `skipped_failed`) emitted via the logger after `InitChain`.
Exposes `Outcomes()` and `FailedCount()` for external tooling.
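
The `validateSignerInfo` preflight described above amounts to a uniqueness check over account numbers. A minimal sketch, assuming a pared-down `signerInfo` type and a `reserved` map for balance-init accounts (both hypothetical, as is the function name):

```go
package main

import "fmt"

// signerInfo mirrors only the two fields this check needs; the real
// SignerAccountInfo also carries the pre-tx sequence.
type signerInfo struct {
	Address    string
	AccountNum uint64
}

// checkSignerInfos is a hypothetical preflight. reserved maps account numbers
// already taken by balance-init accounts to their addresses; txs holds the
// signer metadata of every tx in genesis order.
func checkSignerInfos(reserved map[uint64]string, txs [][]signerInfo) error {
	owners := make(map[uint64]string, len(reserved))
	for num, addr := range reserved {
		owners[num] = addr
	}
	for i, signers := range txs {
		for _, s := range signers {
			if owner, ok := owners[s.AccountNum]; ok && owner != s.Address {
				return fmt.Errorf("tx %d: account number %d claimed by both %s and %s",
					i, s.AccountNum, owner, s.Address)
			}
			owners[s.AccountNum] = s.Address
		}
	}
	return nil
}

func main() {
	reserved := map[uint64]string{0: "g1alice"}
	good := [][]signerInfo{{{"g1alice", 0}}, {{"g1bob", 1}}, {{"g1bob", 1}}}
	bad := [][]signerInfo{{{"g1bob", 1}}, {{"g1carol", 1}}}
	fmt.Println(checkSignerInfos(reserved, good))
	fmt.Println(checkSignerInfos(reserved, bad))
}
```

Running the scan once before any tx is replayed means a malformed genesis is rejected before any account is written.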

### Hardfork tooling (`contribs/gnogenesis/internal/fork/`)
- **`gnogenesis fork generate`**: generates a hardfork genesis from a
source chain (RPC URL, local data dir, or exported tarball).
- **`gnogenesis fork test`**: local genesis replay smoke test.
- **`--patch-realm PKGPATH=SRCDIR`** (repeatable): rewrites a genesis-mode
`addpkg` tx in place with files from `SRCDIR`. Lets you deliver realm
upgrades as part of the fork (e.g. adding a new `.gno` file to an
existing realm), since you cannot re-run `addpkg` post-deploy (PR #5540).
- **`--migration-tx`**: injects a single migration tx at the end of the
historical replay.
- **`bruteForceSignerSequence`**: resolves signer sequences during export
by trying candidate values against the signature.

## Bugs found and fixed during review

### tm2 consensus (all fixed)
1. **Fast-sync broken with `InitialHeight > 1`**: `BlockPool` started at
`store.Height()+1 = 1` instead of `state.LastBlockHeight+1 =
InitialHeight`. Nodes trying to fast-sync would request non-existent
blocks.
2. **Validator set / consensus params not saved at `InitialHeight`**:
`saveState` only saved validators when `nextHeight == 1`. With
`InitialHeight > 1`, `LoadValidators` failed and `LoadConsensusParams`
panicked at block `InitialHeight+1`.
3. **`ValidateBasic` bypass via zeroed `LastBlockID`**: any block with
`LastBlockID.IsZero()` could skip commit validation. Fixed: only allow
the skip when the commit is also nil/empty.
4. **`BaseApp.validateHeight` permanent contiguity bypass**: the previous
"allow monotonic jump" branch compared real block height against the
multistore version. After the first commit, `actual > prevHeight` is
trivially true on every subsequent block, so the contiguity check was
bypassed forever (an attacker or buggy consensus engine that skipped N
blocks would be silently accepted). Fixed by tracking real chain height
in `lastBlockHeight` (this iteration).
5. **`BaseApp.InitChain` masking real error**: when `loadAppState`
returned an error response, the validators-count sanity check fired with
`"validators count mismatch"`, masking the actual cause. Fixed: return
cleanly on error response (this iteration).

### gno.land (all fixed)
6. **`loadAppState` returns nil even on N tx failures**: the chain booted in
a corrupted state when historical-tx replay had failures. Fixed via
opt-in `StrictReplay` in `InitChainerConfig` (this iteration).
7. **Migration-tx `ctxFn` overwrite**: the genesis-mode chain-ID branch
fired on any `metadata.BlockHeight == 0`, stomping the metadata-driven
`Timestamp` override on migration txs. Fixed: tighten predicate to
`metadata == nil` and compose with any prior `ctxFn` (this iteration).
8. **`NewAccountWithNumber` had no `SignerInfo` collision check**: two
`SignerInfo` entries with the same `AccountNum` but different addresses,
or a `SignerInfo` colliding with a balance-init account, would silently
zero the original account's balance. Fixed: rename to
`NewAccountWithUncheckedNumber` (forcing every call site to acknowledge
the precondition) plus `validateSignerInfo` preflight in `loadAppState`
(this iteration).
9. **Failed-tx `ResponseDeliverTx` was empty (looked like success)**:
fixed by adding an explicit error marker so indexers can distinguish
failures from successes.
10. **`GnoGenesisState.InitialHeight` wasn't cross-checked against
`GenesisDoc.InitialHeight`**: added `InitialHeight` to `RequestInitChain`
and validate it in `loadAppState`.
11. **`RequestInitChain.InitialHeight` had no amino round-trip test**: a
silent registration regression would only surface during a real hardfork.
Test added (this iteration).

### Hardfork tooling (fixed)
12. **`applyOverlay` silent no-op**: listed scripts but didn't execute
them, and returned success. Fixed: returns an error when scripts are
found but execution is not implemented.
13. **JSONL serialization used `encoding/json` instead of amino**:
interface types (`std.Msg`) were lost on round-trip. Fixed: both writer
and reader now use amino.
14. **`verifyGenesisFile` failure returned success**: the tool could
produce an invalid genesis and exit 0. Fixed: failure aborts (opt out
with `--no-verify`).
15. **Zero unit tests for `bruteForceSignerSequence`**: fixed with 10
table-driven tests.
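
Item 13 is easy to reproduce with the standard library alone. The sketch below uses made-up `Msg`, `MsgSend`, and `Tx` types (not the real `std` package) to show why plain `encoding/json` cannot round-trip an interface-typed field, followed by a type-tagged envelope that illustrates the general idea of carrying the concrete type. The envelope layout is an assumption for illustration and is not amino's wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Msg, MsgSend, and Tx are hypothetical stand-ins for the real tx types.
type Msg interface{ Route() string }

type MsgSend struct {
	To     string `json:"to"`
	Amount int64  `json:"amount"`
}

func (MsgSend) Route() string { return "bank" }

type Tx struct {
	Msgs []Msg `json:"msgs"`
}

// roundTripPlain encodes and decodes with plain encoding/json. Decoding
// fails because the decoder cannot pick a concrete type for Msg.
func roundTripPlain(tx Tx) (Tx, error) {
	bz, err := json.Marshal(tx)
	if err != nil {
		return Tx{}, err
	}
	var out Tx
	err = json.Unmarshal(bz, &out)
	return out, err
}

// envelope carries an explicit type tag next to the payload.
type envelope struct {
	Type  string          `json:"type"`
	Value json.RawMessage `json:"value"`
}

func encodeMsg(m Msg) ([]byte, error) {
	switch v := m.(type) {
	case MsgSend:
		val, err := json.Marshal(v)
		if err != nil {
			return nil, err
		}
		return json.Marshal(envelope{Type: "MsgSend", Value: val})
	}
	return nil, fmt.Errorf("unregistered message type %T", m)
}

func decodeMsg(bz []byte) (Msg, error) {
	var env envelope
	if err := json.Unmarshal(bz, &env); err != nil {
		return nil, err
	}
	switch env.Type {
	case "MsgSend":
		var m MsgSend
		if err := json.Unmarshal(env.Value, &m); err != nil {
			return nil, err
		}
		return m, nil
	}
	return nil, fmt.Errorf("unknown message type %q", env.Type)
}

func main() {
	_, err := roundTripPlain(Tx{Msgs: []Msg{MsgSend{To: "g1bob", Amount: 5}}})
	fmt.Println("plain round-trip error:", err)

	bz, _ := encodeMsg(MsgSend{To: "g1bob", Amount: 5})
	m, err := decodeMsg(bz)
	fmt.Println(m, err)
}
```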

### Docs linter (side fix for green CI)
- Skip `staging.gno.land` and `archive.org`, and add retry/timeout logic so
transient remote-link failures don't block unrelated PRs.

## Still open (design / follow-up)

- **RPC retry/resume**
(`contribs/gnogenesis/internal/fork/source_rpc.go`): a single transient
error during tx fetch aborts everything; needs exponential backoff +
checkpointing. Architectural, follow-up PR.
- **Streaming tx export**: full tx history is held in memory; will OOM on
large chains. Needs a streaming writer, follow-up PR.
- **`queryAccountAtHeight` silent nil**: all error paths return nil with
no indication of failure, so a flaky RPC yields wrong sequence metadata.


## Cherry-picked from [#5597](#5597) (this iteration)

Three follow-ups originally staged in the master-based hardfork series,
brought back to where they belong since they modify or extend code
introduced here:

- [`1babfe42a`](1babfe42a)
`fix(consensus): skip phantom heights during replay when InitialHeight >
1` — ABCI handshake replay path used to assume heights `[1,
appBlockHeight+1]` always have a stored block; for chains starting at
`InitialHeight > 1`, heights below `InitialHeight` never had blocks and
replay errored with "block not found for height 1".
- [`5bf2fa53e`](5bf2fa53e)
`fix(gnogenesis): default gas-storage params and gas_replay_mode in
hardfork genesis` — `buildHardforkGenesis` now defaults the post-#5415
`vm.params` gas-storage fields from `vm.DefaultParams()` when the source
has them all at zero, and sets `gas_replay_mode = "source"` when unset.
Operator overrides preserved. 4 unit tests.
- [`e31268467`](e31268467)
`feat(gnogenesis): add --skip-failing-genesis-txs and
--skip-genesis-sig-verification flags to fork test` — `make smoketest`
now matches what production validators actually run.
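
The phantom-height fix in `1babfe42a` boils down to clamping the start of the replay range. A minimal sketch, assuming a toy map-backed `blockStore` and a hypothetical `replay` helper (the real handshake code is more involved):

```go
package main

import "fmt"

// blockStore is a toy store keyed by height, standing in for the tm2 block
// store.
type blockStore map[int64]string

func (s blockStore) load(h int64) (string, error) {
	b, ok := s[h]
	if !ok {
		return "", fmt.Errorf("block not found for height %d", h)
	}
	return b, nil
}

// replay loads every block in [first, last]. Heights below initialHeight
// never had a block, so the range is clamped before loading anything.
func replay(s blockStore, first, last, initialHeight int64) ([]string, error) {
	if first < initialHeight {
		first = initialHeight
	}
	var blocks []string
	for h := first; h <= last; h++ {
		b, err := s.load(h)
		if err != nil {
			return nil, err
		}
		blocks = append(blocks, b)
	}
	return blocks, nil
}

func main() {
	store := blockStore{100: "b100", 101: "b101"} // chain with InitialHeight = 100

	// Pretending the chain started at 1 reproduces the old failure.
	_, err := replay(store, 1, 101, 1)
	fmt.Println(err)

	// Clamping to InitialHeight skips the phantom heights.
	blocks, err := replay(store, 1, 101, 100)
	fmt.Println(blocks, err)
}
```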

## End-to-end validation

The hf-glue testbed ([#5486](#5486))
runs `make fetch && make init && make up` against `rpc.gno.land`
halt@704052 and produces a 192 MB hardfork genesis that replays with **0
/ 2715 tx failures** and boots a live `gnoland-1` node.

## Dependencies / related PRs

- **Depends on / pairs with:**
[#5533](#5533) (`contribs/tx-archive`
metadata + `SignerInfo` populator) for replay-ready backups
- **Used in:** [#5486](#5486)
(hf-glue testbed)
- **Also fixed here:** [#5539](#5539)
(docs-linter skip staging preemptive fix, committed here too to keep CI
green)

## AI disclosure

Developed with significant assistance from Claude Code for testing,
review, and iterative fixes.

---------

Co-authored-by: moul <noreply@moul.io>
Co-authored-by: jaekwon <jae@tendermint.com>
assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: aeddi <antoine.e.b@gmail.com>

merging for moul
@aeddi aeddi closed this May 11, 2026

Labels

a/everyone Affects every team 📖 documentation Improvements or additions to documentation 🤝 contribs 🐳 devops don't merge Please don't merge this functionality temporarily 📦 🌐 tendermint v2 Issues or PRs tm2 related 📦 ⛰️ gno.land Issues or PRs gno.land package related 📦 🤖 gnovm Issues or PRs gnovm related 🧾 package/realm Tag used for new Realms or Packages.

Projects

Status: Done

7 participants