
Migrate hash sequence key to bigint #4264

Draft
ss-es wants to merge 2 commits into main from ss/increase-sequence-limit

Conversation

Contributor

@ss-es ss-es commented May 6, 2026

We hit the sequence limit for the hash table on a decaf node, and it's likely we'll hit the same limit on mainnet/other decaf nodes in the next few months. This PR switches the postgres id to a bigint, which should resolve the issue.

No sqlite migration is needed since that is already a u64.
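
For context, one way to see how close a Postgres sequence is to its ceiling is the `pg_sequences` view (available since PostgreSQL 10). This is an illustrative query, using the `hash_id_seq` name mentioned in the reviews below:

```sql
-- Shows the current position and the ceiling of the hash id sequence.
SELECT last_value, max_value
FROM pg_sequences
WHERE sequencename = 'hash_id_seq';
```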

Contributor

claude Bot commented May 6, 2026

claude Bot finished @ss-es's task in 2m 2s


Code Review: Migrate hash sequence key to bigint

  • Read existing review comments
  • Review migration SQL
  • Review Rust code changes
  • Post review feedback

This is a clean, well-scoped migration. The changes are consistent across all layers (SQL migration, Rust types, SQL cast expressions) and cover all four tables that reference hash(id).

Migration (V1501)

The migration correctly:

  1. Widens the sequence with ALTER SEQUENCE hash_id_seq AS BIGINT (this also updates MAXVALUE to 2^63-1)
  2. Alters the primary key column hash.id to BIGINT
  3. Alters all four foreign-key hash_id columns across fee_merkle_tree, block_merkle_tree, reward_merkle_tree, and reward_merkle_tree_v2

I verified these are the only tables referencing hash(id) (defined in V14, V504, V1003). Nothing is missed.
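
For reference, a sketch of the statements the description above implies (the actual V1501 file is not reproduced in this thread, so treat this as an illustration rather than the exact migration):

```sql
-- Widen the sequence and every column that stores a hash id.
ALTER SEQUENCE hash_id_seq AS BIGINT;
ALTER TABLE hash ALTER COLUMN id TYPE BIGINT;
ALTER TABLE fee_merkle_tree       ALTER COLUMN hash_id TYPE BIGINT;
ALTER TABLE block_merkle_tree     ALTER COLUMN hash_id TYPE BIGINT;
ALTER TABLE reward_merkle_tree    ALTER COLUMN hash_id TYPE BIGINT;
ALTER TABLE reward_merkle_tree_v2 ALTER COLUMN hash_id TYPE BIGINT;
```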

One consideration for large tables: ALTER COLUMN ... TYPE BIGINT on a column whose existing type is INTEGER forces a full table rewrite in PostgreSQL, and it holds an ACCESS EXCLUSIVE lock for the duration. For a heavily populated decaf/mainnet node this could take a while and block reads. If downtime during migration is acceptable, this is fine. If not, consider running the migration during a maintenance window or documenting the expected downtime. This is purely operational; the SQL itself is correct.

Rust Changes

All i32 → i64 changes are consistent:

  • Node.hash_id field type
  • batch_insert_hashes return type and internal mappings
  • Children deserialization (Vec<i64> + updated error messages)
  • HashMap<i64, Vec<u8>> for hash lookups
  • UNNEST($3::bigint[]) in the bulk insert SQL
  • Test schema in sql.rs updated to BIGINT
  • Both embedded-db (SQLite) and Postgres code paths updated

The children column is stored as JSONB (since V302), so the JSON deserialization to Vec<i64> is backward-compatible — existing i32-range values deserialize fine into i64.
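
A minimal standalone check of that claim, using only serde_json (this is not the repository's actual code):

```rust
// Values written while ids fit in i32 still deserialize into the
// widened Vec<i64> without any data migration.
fn main() -> Result<(), serde_json::Error> {
    let stored = serde_json::json!([1, 42, 2147483647]); // pre-migration range
    let children: Vec<i64> = serde_json::from_value(stored)?;
    assert_eq!(children, vec![1, 42, 2_147_483_647]);
    Ok(())
}
```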

Summary

Looks good to me. The PR is minimal and focused — it changes exactly what needs to change and nothing more. No issues found.

Contributor

gemini-code-assist Bot left a comment


Code Review

This pull request updates the hash_id column and its references from 32-bit integers to 64-bit integers (BIGINT/i64) across the database schema and the Rust codebase to prevent sequence exhaustion. This includes database migrations, struct definitions, and serialization logic. A concern was raised regarding the database migration strategy; performing a direct ALTER COLUMN TYPE on a large table can cause significant downtime due to table rewrites and exclusive locks, so a multi-step migration approach should be considered for production environments.

I am having trouble creating individual review comments, so my feedback is included below.

crates/espresso/node/api/migrations/postgres/V1501__increase_hash_id_to_bigint.sql (5)

Severity: high

Performing ALTER COLUMN TYPE on a primary key column in a table that has reached the SERIAL limit (approximately 2 billion rows) is a very heavy operation in PostgreSQL. It will require a full table rewrite and will hold an AccessExclusiveLock on the hash table for the duration of the operation, which could take several hours depending on the hardware and table size.

If this migration needs to be performed on a live production system with minimal downtime, consider a multi-step approach:

  1. Add a new BIGINT column.
  2. Backfill the new column in batches.
  3. Create a trigger to keep the new column in sync.
  4. Swap the columns during a short maintenance window.

However, if the node is already failing due to the sequence exhaustion, this direct approach might be acceptable despite the downtime.
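
A rough sketch of that multi-step approach is shown below. The column, trigger, and constraint names (id_new, hash_id_sync, hash_pkey) are assumptions, and the four *_merkle_tree foreign keys that reference hash(id) would need the same add/backfill/swap treatment, which is omitted here:

```sql
-- 1. Add a nullable BIGINT column (no table rewrite needed).
ALTER TABLE hash ADD COLUMN id_new BIGINT;

-- 2. Keep new and updated rows in sync with a trigger.
CREATE FUNCTION hash_id_sync() RETURNS trigger AS $$
BEGIN
    NEW.id_new := NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER hash_id_sync_trigger
    BEFORE INSERT OR UPDATE ON hash
    FOR EACH ROW EXECUTE FUNCTION hash_id_sync();

-- 3. Backfill existing rows in batches (repeat over id ranges).
UPDATE hash SET id_new = id WHERE id >= 1 AND id < 1000000;

-- 4. Short maintenance window: swap the columns and rebuild the key.
--    (Foreign keys referencing hash(id) must be dropped before this
--    block and re-created against the new column afterwards.)
BEGIN;
DROP TRIGGER hash_id_sync_trigger ON hash;
DROP FUNCTION hash_id_sync();
ALTER SEQUENCE hash_id_seq AS BIGINT OWNED BY NONE;
ALTER TABLE hash DROP CONSTRAINT hash_pkey;
ALTER TABLE hash DROP COLUMN id;
ALTER TABLE hash RENAME COLUMN id_new TO id;
ALTER TABLE hash ALTER COLUMN id SET DEFAULT nextval('hash_id_seq');
ALTER SEQUENCE hash_id_seq OWNED BY hash.id;
ALTER TABLE hash ADD PRIMARY KEY (id);
COMMIT;
```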

@ss-es ss-es marked this pull request as draft May 7, 2026 15:15
Contributor Author

ss-es commented May 7, 2026

This seems to be extremely slow to run -- do not merge
