feat(node): add GovDAO-based chain halt via r/sys/params#5368
Conversation
Enables GovDAO to propose a coordinated chain halt at a specific block height without requiring every operator to pass a CLI flag. This is the governance-driven counterpart to the --halt-height CLI flag. Changes: - Add `NewSetHaltHeightRequest(height int64)` to `r/sys/params` realm, allowing GovDAO to vote on halting the chain at a target block. - Add `nodeParamsKeeper` to validate `node:p:halt_height` params. - Register the "node" module in the params keeper so halt_height can be set via governance proposals. - Extend `EndBlocker` to read `node:p:halt_height` from the params store and call `osm.Kill()` when the halt height is reached. Usage: // Create and submit a GovDAO proposal to halt at block 100000 pr := params.NewSetHaltHeightRequest(100_000) id := dao.MustCreateProposal(cross, pr) // After approval and execution, all nodes will halt at block 100000 Generated with [Claude Code](https://claude.com/claude-code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering>
🛠 PR Checks SummaryAll Automated Checks passed. ✅ Manual Checks (for Reviewers):
Read More🤖 This bot helps streamline PR reviews by verifying automated checks and providing guidance for contributors and reviewers. ✅ Automated Checks (for Contributors):No automated checks match this pull request. ☑️ Contributor Actions:
☑️ Reviewer Actions:
📚 Resources:Debug
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Extends the GovDAO halt proposal with a mandatory minimum binary version
field. When set, nodes refuse to restart unless their version satisfies
the requirement, preventing an old binary from accidentally resuming a
chain that was halted for an upgrade.
- `NewSetHaltRequest(height, minVersion)` sets both `node:p:halt_height`
and `node:p:halt_min_version` atomically in one GovDAO proposal.
- `checkNodeStartupParams` runs at node startup (after state is loaded)
and compares `version.Version` against the stored `halt_min_version`.
- `meetsMinVersion` / `parseGnolandVersion` handle the "chain/gnolandX.Y"
version format used for gno.land chain releases, with a string-equality
fallback for other formats.
Example: setting minVersion="chain/gnoland1.1" will allow 1.1 and newer
to start, but reject 1.0 ("develop" also rejected unless it matches).
Generated with [Claude Code](https://claude.com/claude-code)
via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
|
Things to consider:
|
…es (#5334) ## Summary Adds a halt height mechanism for coordinated chain upgrades. The node stops after committing the specified block height. ### How to set it ```bash gnoland config set halt_height 352922 ``` Or edit `config.toml` directly: ```toml halt_height = 352922 ``` ### How it works 1. After `finalizeCommit`, consensus checks if `height >= halt_height` 2. If so, calls `osm.Kill()` for a graceful shutdown 3. The check is at the consensus level (not ABCI), following the same pattern as `WithEarlyStart` ### Scope and future direction This is a **temporary coordination tool** for the current chain upgrade. For the gnoland1 → gnoland-1 hard fork, validators set `halt_height` in their config, all nodes stop at the same block, then validators swap binary + config and restart. After the upgrade, the proper mechanism will be **GovDAO-based halting** (#5368), which adds: - On-chain `halt_height` param set via governance proposal (no manual config needed) - `halt_min_version` — prevents old binaries from restarting after halt - Version guard at startup so validators can't accidentally run the wrong binary Once #5368 is merged and active, `halt_height` in config becomes a **node operator tool** (e.g., "stop my node at height X for maintenance") rather than a coordination mechanism. Coordination should happen through governance. ### No CLI flag — config only Per @tbruyelle's suggestion, there's no `--halt-height` CLI flag. Config file is the single source of truth. This avoids the risk of validators missing the flag in duplicated command setups across their infrastructure. ### Related - #5368 — GovDAO-based halt height + version guard (Phase 2, replaces this for coordination) - #5376 — gnoland-1 chain config - #5411 — chain upgrade genesis replay <details> <summary>Contributors' checklist</summary> - [x] Added new tests, or not needed, or not feasible - [x] Provided an example (e.g. screenshot) to aid review or the PR is self-explanatory - [x] Updated the official documentation or not needed - [x] No breaking changes were made, or a `BREAKING CHANGE: xxx` message was included in the description - [x] Added `benchmarks` label to the PR or not needed </details>
Aligns with #5334's approach: GovDAO EndBlocker now sets the halt height on BaseApp, which panics in BeginBlock of the next block. This is deterministic (no async signals) and ensures the halted block is fully committed.
…es (#5334) ## Summary Adds a halt height mechanism for coordinated chain upgrades. The node stops after committing the specified block height. ### How to set it ```bash gnoland config set halt_height 352922 ``` Or edit `config.toml` directly: ```toml halt_height = 352922 ``` ### How it works 1. After `finalizeCommit`, consensus checks if `height >= halt_height` 2. If so, calls `osm.Kill()` for a graceful shutdown 3. The check is at the consensus level (not ABCI), following the same pattern as `WithEarlyStart` ### Scope and future direction This is a **temporary coordination tool** for the current chain upgrade. For the gnoland1 → gnoland-1 hard fork, validators set `halt_height` in their config, all nodes stop at the same block, then validators swap binary + config and restart. After the upgrade, the proper mechanism will be **GovDAO-based halting** (#5368), which adds: - On-chain `halt_height` param set via governance proposal (no manual config needed) - `halt_min_version` — prevents old binaries from restarting after halt - Version guard at startup so validators can't accidentally run the wrong binary Once #5368 is merged and active, `halt_height` in config becomes a **node operator tool** (e.g., "stop my node at height X for maintenance") rather than a coordination mechanism. Coordination should happen through governance. ### No CLI flag — config only Per @tbruyelle's suggestion, there's no `--halt-height` CLI flag. Config file is the single source of truth. This avoids the risk of validators missing the flag in duplicated command setups across their infrastructure. ### Related - #5368 — GovDAO-based halt height + version guard (Phase 2, replaces this for coordination) - #5376 — gnoland-1 chain config - #5411 — chain upgrade genesis replay <details> <summary>Contributors' checklist</summary> - [x] Added new tests, or not needed, or not feasible - [x] Provided an example (e.g. screenshot) to aid review or the PR is self-explanatory - [x] Updated the official documentation or not needed - [x] No breaking changes were made, or a `BREAKING CHANGE: xxx` message was included in the description - [x] Added `benchmarks` label to the PR or not needed </details>
|
Addressed points 1 and 3 from your feedback: 1. Panic if new binary is used before the halt height — 3. 2. Migrate function linked to Also aligned the GovDAO halt mechanism with #5334: replaced |
…es (#5334) ## Summary Adds a halt height mechanism for coordinated chain upgrades. The node stops after committing the specified block height. ### How to set it ```bash gnoland config set halt_height 352922 ``` Or edit `config.toml` directly: ```toml halt_height = 352922 ``` ### How it works 1. After `finalizeCommit`, consensus checks if `height >= halt_height` 2. If so, calls `osm.Kill()` for a graceful shutdown 3. The check is at the consensus level (not ABCI), following the same pattern as `WithEarlyStart` ### Scope and future direction This is a **temporary coordination tool** for the current chain upgrade. For the gnoland1 → gnoland-1 hard fork, validators set `halt_height` in their config, all nodes stop at the same block, then validators swap binary + config and restart. After the upgrade, the proper mechanism will be **GovDAO-based halting** (#5368), which adds: - On-chain `halt_height` param set via governance proposal (no manual config needed) - `halt_min_version` — prevents old binaries from restarting after halt - Version guard at startup so validators can't accidentally run the wrong binary Once #5368 is merged and active, `halt_height` in config becomes a **node operator tool** (e.g., "stop my node at height X for maintenance") rather than a coordination mechanism. Coordination should happen through governance. ### No CLI flag — config only Per @tbruyelle's suggestion, there's no `--halt-height` CLI flag. Config file is the single source of truth. This avoids the risk of validators missing the flag in duplicated command setups across their infrastructure. ### Related - #5368 — GovDAO-based halt height + version guard (Phase 2, replaces this for coordination) - #5376 — gnoland-1 chain config - #5411 — chain upgrade genesis replay <details> <summary>Contributors' checklist</summary> - [x] Added new tests, or not needed, or not feasible - [x] Provided an example (e.g. screenshot) to aid review or the PR is self-explanatory - [x] Updated the official documentation or not needed - [x] No breaking changes were made, or a `BREAKING CHANGE: xxx` message was included in the description - [x] Added `benchmarks` label to the PR or not needed </details>
tbruyelle
left a comment
There was a problem hiding this comment.
I've made a couple of tests with a node running on my machine, and there's some fixes to do.
1. Infinite loop after upgrade : the param halt_height is not cleared, so the chain continues to halt with the new binary.
2. Panic if new binary is used before the halt height : doesnt work, the condition is not correct, && binaryVersion != minVersion condition should be removed.
3. Cancel proposal didn't prevent halt : when a second proposal with halt_heigth=0 is submitted, it is supposed to cancel the halt, but it doesn't.
As improvement I would also add this:
4. Halt happened at halt_height + 1 instead of halt_height : I dont remember exactly why we decided in #5334 that height at halt_height should be fully committed (and so halt is actually happening at halt_height+1) but finally I found this confusing. In the SDK, halt_height is where the halt happens, not halt_height+1.
5. No validation that halt_height is in the future : WillSetParams allows old height to be stored, which will halt the chain immediately. We probably should prevent this.
I have all the fixes on my machine so it should be faster if you let me push them on your branch. Tell me what you prefer.
I wrote a test procedure so you can test on your side next time https://gist.github.com/tbruyelle/ce92e7d16ffb6af144f901e2203d2030
I think this kind of document is useful, we have a similar one for AtomOne, I can push it in this repo if you will, just tell me where it should be located.
|
Thank you, will work on it. Oh yes, please open a PR with a similar doc, looks like very useful. |
All issues found by Thomas during manual testing are now fixed: **1. Infinite loop after upgrade** Change EndBlocker condition from `>=` to `==`: only fires at the exact halt height, not on every subsequent block. After restart the chain is at halt_height+1 so the condition never re-fires — no more infinite loop. **2. New binary guard before halt height** Remove the erroneous `&& binaryVersion != minVersion` from `checkNodeStartupParams`. The intent is to block *any* binary that already satisfies the minimum version from running before the halt; the extra condition was accidentally allowing the exact minimum version through. **3. Cancel proposal now works** `halt_height=0` (cancel sentinel) combined with the `==` condition means `req.Height == 0` is never true, so cancellation is immediately effective. **4. Deterministic single-fire halt** Using `==` instead of `>=` ensures the halt triggers exactly once at the scheduled height. BeginBlock of the next block panics, leaving the chain cleanly stopped with halt_height as the last committed block. **5. Validate halt_height is in the future** `WillSetParam` now rejects a `halt_height` that is <= the current block height (except the cancel sentinel 0). Uses `safeBlockHeight()` to handle genesis / test contexts where the block header may not be set. New tests: - `TestCheckNodeStartupParams`: old binary guard, new binary guard, skip_upgrade_height bypass - `TestEndBlockerHalt`: exact-match halt, no re-halt after height, cancel sentinel, before-halt no-op - `TestNodeParamsKeeperWillSetParam`: past/present height rejection
|
[bot] Addressed all 5 issues from your review (commit 842e805): 1. Infinite loop after upgrade — Fixed by changing EndBlocker from 2. New binary guard doesn't work — Removed the erroneous 3. Cancel proposal doesn't prevent halt — Fixed as a side-effect of the 4. Halt at halt_height+1 — I kept the current semantics deliberately: 5. No validation that halt_height is in the future — Also added comprehensive tests: |
- TestNewAppWithOptions_ErrNoLogger / ErrNoEventSwitch: cover nil-logger and nil-event-switch validation paths in AppOptions.validate(). - TestEndBlocker/extract_updates_error: cover the path where the VM returns a regex-matching response with an unparseable address, causing extractUpdatesFromResponse to return an error and EndBlocker to return an empty response (lines 524-528 in app.go). - TestExtractUpdatesFromResponse: directly test all reachable error paths (empty response, no regex match, invalid bech32 address, invalid pubkey). EndBlocker coverage: 87.9% → 93.9% extractUpdatesFromResponse coverage: 83.3% → 94.4% node_params.go remains at 100%. The two remaining uncovered lines (u.Power < 0 check and strconv.ParseInt error in extractUpdatesFromResponse) are unreachable via normal flow: the regex pattern (\d+) only captures non-negative digits, so parsed powers are always ≥ 0 and ParseInt always succeeds.
tbruyelle
left a comment
There was a problem hiding this comment.
Very clean fix with req.Height == haltHeight instead of >= !
Crafted for testing gnolang#5368, I think this doc can useful in the future when upgrades will contain migration code.
Crafted for testing #5368, I think this doc can useful in the future for testing upgrades with migration code.
…ork time Adds a repeatable --patch-realm PKGPATH=SRCDIR flag to \`hardfork genesis\` that rewrites the genesis-mode addpkg tx for PKGPATH in-place, replacing its Package.Files with the *.gno + gnomod.toml files from SRCDIR. The source genesis on disk stays untouched — the patch lives only in the in-memory GnoGenesisState used to assemble the output. This is the "swap the source" mechanism for delivering realm upgrades as part of a hardfork: you cannot re-addpkg to the same path post- deploy (unauthorized), and you cannot add a new .gno file to an existing realm via a call, so the only way to land a code change is to rewrite the original addpkg that deployed the realm. hf-glue wires this up by default: PATCH_REALMS="gno.land/r/sys/params= \$REPO/examples/gno.land/r/sys/params". Combined with the merged PR gnolang#5368 (which adds halt.gno to that realm's examples), the forked chain boots with the new GovDAO halt-height mechanism available — verifying that the patch + merge round-trips: $ curl ... vm/qfile gno.land/r/sys/params → fee_collector.gno, gnomod.toml, halt.gno, params.gno, unlock.gno $ curl ... vm/qfile gno.land/r/sys/params/halt.gno → package params ... NewSetHaltRequest ... (PR gnolang#5368 code) Also merges PR gnolang#5368 (feat/govdao-halt-height) into this branch so the examples/ side has halt.gno to feed into the patch.
…ork time Adds a repeatable --patch-realm PKGPATH=SRCDIR flag to \`hardfork genesis\` that rewrites the genesis-mode addpkg tx for PKGPATH in-place, replacing its Package.Files with the *.gno + gnomod.toml files from SRCDIR. The source genesis on disk stays untouched — the patch lives only in the in-memory GnoGenesisState used to assemble the output. Motivation: you cannot re-addpkg to the same path post-deploy (unauthorized), and you cannot add a new .gno file to an existing realm via a call, so the only way to land a code change on an existing realm is to rewrite the original addpkg tx that deployed it. Example (tested end-to-end in the hf-glue testbed gnolang#5486): hardfork genesis --source /path/to/source \\ --patch-realm gno.land/r/sys/params=/src/examples/gno.land/r/sys/params \\ --chain-id gnoland-1 --output genesis.json Combined with gnolang#5368 (which adds halt.gno to r/sys/params), the forked chain boots with the new GovDAO halt mechanism available: $ curl ... vm/qfile gno.land/r/sys/params → fee_collector.gno, gnomod.toml, halt.gno, params.gno, unlock.gno Multiple --patch-realm flags can be combined to land several realm upgrades in one fork.
…IDs genesis-mode + --patch-realm) (#5540) Three tightly-related improvements that fell out of end-to-end-testing the hardfork-replay mechanism on the full gnoland1 chain (see [#5486](#5486)). Each is a standalone commit; happy to split into separate PRs if reviewers prefer. ## 1. \`fix(tm2/sdk): BaseApp validateHeight + Info handle InitialHeight > 1\` Two separate issues hit chains whose genesis sets \`InitialHeight > 1\`: - \`validateHeight\` compared \`req.Header.Height\` against the multistore version counter (which auto-increments from 0). With \`InitialHeight > 1\` the store version lags block height — first block arrives at e.g. 101 while store is at version 0, second at 102 while store is at 1 — so \`expected 2, got 102\` panicked. Fix: when the store version lags, accept the jump as long as height is monotonic. - \`Info()\` returned the multistore version as \`LastBlockHeight\`. On restart the handshaker saw \`appHeight=1\` but \`storeHeight=102\` and tried to replay from height 2. Fix: when the persisted header records a higher block height, return that instead. ## 2. \`feat(gnoland): genesis-mode txs use PastChainIDs[0] for sig verify\` During a hardfork replay, genesis-mode txs (metadata == nil or BlockHeight == 0) were originally signed with the source chain's chain-id. Historical txs (BlockHeight > 0) already get \`PastChainIDs\`-based chain-id override in \`loadAppState\`; this extends the same treatment to genesis-mode txs by using the first \`PastChainIDs\` entry when a hardfork is in progress. In practice this still needs \`--skip-genesis-sig-verification\` for gnogenesis-produced addpkg txs (where \`msg.Creator ≠ signing key\` — the pubkey-address check rejects those regardless of chain-id). But for genesis-mode txs where the signer IS the creator, this makes the signature verify against the correct chain-id without any skip flag. ## 3. \`feat(hardfork): --patch-realm flag\` Repeatable \`--patch-realm PKGPATH=SRCDIR\` flag on \`hardfork genesis\`. Rewrites the genesis-mode addpkg tx for \`PKGPATH\` in-place, replacing \`Package.Files\` with the \`*.gno\` + \`gnomod.toml\` files from \`SRCDIR\`. Source genesis on disk stays untouched — patch lives only in the in-memory \`GnoGenesisState\` used for the output. Motivation: you cannot re-addpkg to the same path post-deploy (unauthorized), and you cannot add a new \`.gno\` file to an existing realm via a call, so the only way to land a code change on an existing realm during a hardfork is to rewrite the addpkg tx that originally deployed it. Combined with [#5368](#5368) (which adds \`halt.gno\` to \`r/sys/params\`), the hf-glue testbed boots a fork of gnoland1 where \`r/sys/params\` ships the new \`NewSetHaltRequest\` code out of the box: \`\`\` $ curl ... vm/qfile gno.land/r/sys/params → fee_collector.gno, gnomod.toml, halt.gno, params.gno, unlock.gno \`\`\` ## End-to-end validation All three land together in [#5486](#5486) (hf-glue testbed). Running \`make fetch && make init && make up\` against \`rpc.gno.land\` / halt @ 704052 produces a 192 MB hardfork genesis that replays with **0 / 2715 tx failures** and boots a live \`gnoland-1\` node with \`r/sys/params\` carrying the patched source. Dependencies: - Stacks on #5511 (this PR's base): needs \`PastChainIDs\`, \`SignerInfo\`, \`InitialHeight\` genesis-doc field - Pairs with [#5533](#5533) (\`contribs/tx-archive\` metadata + SignerInfo populator) for the replay-ready backups ## AI disclosure Developed with assistance from Claude Code.
Summary
Adds a governance-driven mechanism for coordinated chain halts, complementing the
halt_heightconfig field (PR #5334, merged).New GovDAO proposal:
r/sys/params.NewSetHaltRequestThis atomically sets two params in one proposal:
node:p:halt_height(int64): Block to halt at. EndBlocker propagates this toBaseApp.SetHaltHeight(), which panics inBeginBlockof the next block — ensuring the halted block is fully committed.node:p:halt_min_version(string): Minimum binary version required to restart. If set, nodes refuse to start unlessbinary_version >= min_version, preventing old binaries from accidentally resuming a chain halted for an upgrade.Upgrade workflow
Phase 1 (PR #5334, merged —
halt_heightconfig field): Operators sethalt_heightinconfig.tomlviagnoland config set halt_height Nto coordinate a clean stop at a specific block.Phase 2 (this PR — GovDAO halt): GovDAO votes on
NewSetHaltRequest(height, minVersion). Once approved:halt_heightautomatically (no config change required)halt_min_versioncan resumeImplementation details
Gno side (
examples/gno.land/r/sys/params/halt.gno):NewSetHaltRequest(height int64, minVersion string)— creates GovDAO proposalheight=0, minVersion=""to cancel a scheduled haltGo side:
nodeParamsKeeper(new) — validatesnode:p:halt_heightandnode:p:halt_min_versionparamscheckNodeStartupParams(new) — called at startup after state load; comparesversion.Versionagainsthalt_min_versionmeetsMinVersion/parseGnolandVersion— handleschain/gnolandX.Yversion format; falls back to exact match for other formatsEndBlockerextended — readsnode:p:halt_heightand callsBaseApp.SetHaltHeight()(deterministic panic in next BeginBlock, no async signals)Tests
TestNodeParamsKeeperWillSetParam: validates all param types and error casesTestMeetsMinVersion: 12 test cases covering version comparison logicTestParseGnolandVersion: 7 test cases for version string parsingDependencies
Built on top of PR #5334 (merged). Uses the same
BaseApp.SetHaltHeight()+BeginBlockpanic mechanism for deterministic halts.