refactor(bolt): compile Jolt verifier through typed plans #1523

Open: quangvdao wants to merge 171 commits into a16z:jolt-v2/equivalence from quangvdao:quang/bolt-stack

quangvdao (Contributor) commented on May 14, 2026

Stacks on top of Markos' jolt-v2/equivalence stack, currently based at 1faea61a (perf(equivalence): tighten Bolt prover gate). This collapses Quang's side of the stack into one PR: the earlier quang/bolt-stack work and the follow-up quang/bolt-verifier-program-refactor branch are combined here.

This PR advances the Bolt-generated Jolt verifier from stage-shaped generated Rust toward a typed verifier-program pipeline. It preserves the current full-field Jolt semantics and proof serialization, but moves verifier facts into compiler-owned typed plan data and a reusable runtime instead of late string matching and stage-local helper code.

Diff shape at the current head: 96 files changed, +30,835 / -13,674.

Stack shape

Before this update:

Markos stack:      jolt-v2/generated-roles -> jolt-v2/equivalence
Quang PR:          quang/bolt-stack
Quang follow-up:   quang/bolt-verifier-program-refactor

After the collapse:

Markos stack:      jolt-v2/generated-roles -> jolt-v2/equivalence
Single Quang PR:   quang/bolt-stack

Major changes

1. Typed generated verifier surface

  • Replaces stringly verifier surfaces with typed plan records for relation dispatch, field expressions, scalar/point operands, sumcheck claims, batch operands, opening claims, opening batches, point sources, eval families, and relation outputs.
  • Keeps strings primarily as diagnostic/serialization names; execution contracts now use typed rows, typed relation IDs, and typed value references.
  • Adds cleanup gates in crates/bolt/tests/verifier_cleanup.rs so converted surfaces stay converted: relation string sites, batch operand strings, claim input opening strings, point-concat input strings, field-expression operand constants, stage-local macros, indexed-eval prefix APIs, and handwritten expected-output helpers are all guarded.
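
The difference between string dispatch and typed relation IDs can be sketched as follows. This is a minimal illustration, not code from this PR: `RelationId`, `ClaimPlan`, `degree_bound`, the variant names, and the degree values are all hypothetical.

```rust
// Hypothetical names and values throughout; not the actual Bolt plan types.

/// Typed relation identifier: the compiler emits a variant, not a name to match on.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RelationId {
    RegistersReadWrite,
    RamValCheck,
}

impl RelationId {
    /// Strings survive only as diagnostic/serialization names.
    pub fn diagnostic_name(self) -> &'static str {
        match self {
            RelationId::RegistersReadWrite => "registers_read_write",
            RelationId::RamValCheck => "ram_val_check",
        }
    }
}

/// One typed plan row: execution reads `relation`, never a string.
#[derive(Clone, Copy, Debug)]
pub struct ClaimPlan {
    pub relation: RelationId,
    pub num_rounds: usize,
}

/// Exhaustive match: adding a variant without handling it is a compile error,
/// whereas an unknown relation string would only fail at run time.
pub fn degree_bound(plan: &ClaimPlan) -> usize {
    match plan.relation {
        RelationId::RegistersReadWrite => 3, // illustrative value
        RelationId::RamValCheck => 2,        // illustrative value
    }
}
```

The design point is that the type checker, rather than a cleanup gate alone, rejects an unhandled relation.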

2. bolt-verifier-runtime extraction

  • Adds crates/bolt-verifier-runtime as the shared generic verifier runtime crate.
  • Deletes generated crates/jolt-verifier/src/stages/common.rs and the old verifier_common.rs.template path.
  • Moves generic mechanics into the runtime: plan structs, ValueStore, transcript helpers, sumcheck driver plumbing, opening equality checks, point/scalar/vector/eval-family storage, relation-output execution scaffolding, and structural plan errors.
  • Keeps Jolt-specific verifier math in crates/jolt-verifier/src/stages/jolt_relations.rs and generated from verifier_jolt_relations.rs.template.
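
The split between a generic runtime and protocol-owned relation types might look roughly like the sketch below. `ProtocolRelation`, `ValueStore`, and `RelationOutputPlan` are names that appear in this PR, but every field, method, and signature shown here is an assumption made for illustration; `DemoRelation`, `ScalarRef`, `PlanError`, and `check_output` are hypothetical, and `u64` stands in for a field element.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// All shapes below are hypothetical; the real trait, store, and plan rows may differ.

/// Implemented by the protocol layer so the generic runtime never names a Jolt relation.
pub trait ProtocolRelation: Copy {
    fn diagnostic_name(&self) -> &'static str;
}

/// Stand-in for a protocol-side relation enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DemoRelation {
    ReadWriteCheck,
}

impl ProtocolRelation for DemoRelation {
    fn diagnostic_name(&self) -> &'static str {
        match *self {
            DemoRelation::ReadWriteCheck => "read_write_check",
        }
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ScalarRef(pub u32);

#[derive(Debug, PartialEq, Eq)]
pub enum PlanError {
    MissingScalar(ScalarRef),
    DuplicateScalar(ScalarRef),
}

/// Toy value store keyed by typed references instead of names.
#[derive(Default)]
pub struct ValueStore {
    scalars: HashMap<ScalarRef, u64>,
}

impl ValueStore {
    /// Rejects a second write to the same reference without overwriting it.
    pub fn insert_scalar(&mut self, r: ScalarRef, v: u64) -> Result<(), PlanError> {
        match self.scalars.entry(r) {
            Entry::Occupied(_) => Err(PlanError::DuplicateScalar(r)),
            Entry::Vacant(slot) => {
                slot.insert(v);
                Ok(())
            }
        }
    }

    pub fn scalar(&self, r: ScalarRef) -> Result<u64, PlanError> {
        self.scalars.get(&r).copied().ok_or(PlanError::MissingScalar(r))
    }
}

/// A plan row generic over the protocol's relation type.
pub struct RelationOutputPlan<R: ProtocolRelation> {
    pub relation: R,
    pub expected_output: ScalarRef,
}

/// Generic check: the relation is used only for its diagnostic name.
pub fn check_output<R: ProtocolRelation>(
    plan: &RelationOutputPlan<R>,
    store: &ValueStore,
    claimed: u64,
) -> Result<(), String> {
    let expected = store
        .scalar(plan.expected_output)
        .map_err(|e| format!("{}: {:?}", plan.relation.diagnostic_name(), e))?;
    if expected == claimed {
        Ok(())
    } else {
        Err(format!("{}: output mismatch", plan.relation.diagnostic_name()))
    }
}
```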

3. Top-level typed verifier program

  • Adds a top-level verifier-program model in crates/jolt-verifier/src/verifier.rs: proof slots, checkpoints, targets, evaluation policy, typed program steps, and execution artifacts.
  • Routes verifier entrypoints through typed program execution while preserving the public proof layout.
  • Treats numbered stages as proof-layout and diagnostic scopes, not as the only model for verifier control flow.
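
As a rough picture of "typed program steps" executed against proof slots and checkpoints, consider the sketch below. The step vocabulary, the `execute` function, and the toy transcript update are all hypothetical; the real model in verifier.rs also covers targets, evaluation policy, and execution artifacts, which are omitted here.

```rust
// Hypothetical step vocabulary; not the model defined in verifier.rs.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProgramStep {
    /// Absorb the value in proof slot `slot` into the transcript state.
    AbsorbSlot { slot: usize },
    /// Record the current transcript state under a checkpoint id.
    Checkpoint { id: u32 },
}

/// Runs the steps in order and returns the recorded (checkpoint id, state) pairs.
/// The state update is a toy stand-in for a real transcript, not a hash.
pub fn execute(steps: &[ProgramStep], proof_slots: &[u64]) -> Result<Vec<(u32, u64)>, String> {
    let mut state: u64 = 0;
    let mut checkpoints = Vec::new();
    for step in steps {
        match *step {
            ProgramStep::AbsorbSlot { slot } => {
                let value = proof_slots
                    .get(slot)
                    .ok_or_else(|| format!("missing proof slot {}", slot))?;
                state = state.wrapping_mul(31).wrapping_add(*value);
            }
            ProgramStep::Checkpoint { id } => checkpoints.push((id, state)),
        }
    }
    Ok(checkpoints)
}
```

Under this shape a numbered stage is simply a run of steps between two checkpoints, which is one way to read "stages as proof-layout and diagnostic scopes".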

4. CPU-to-Rust verifier planning boundary (S2.75)

  • Adds compiler-side verifier planning modules such as rust_target_plan.rs, verifier_plan.rs, verifier_values.rs, verifier_sumcheck_rows.rs, verifier_opening_rows.rs, verifier_program_rows.rs, and related row/value modules.
  • Centralizes Stage 2-7 planning through shared planning functions for program steps, transcript flow, verifier sumchecks, value and relation outputs, opening flow, relation-local inputs, and target validation.
  • Moves verifier-mode sumcheck-flow, batch/driver consistency, relation-output validation, value-source conflict checks, and opening-flow checks onto VerifierStagePlan instead of repeated stage-local string sets and maps.
  • Routes Stage 2 through the shared planning boundary as well; it now uses the shared CPU row types and shared verifier-plan validation path.

5. Output claims, relation outputs, and typed value sources

  • Lifts Stage 3, Stage 4, Stage 5, Stage 6, and Stage 7 output-claim math into generated plan data.
  • Introduces typed scalar, point, field-vector, eval-family, local-scalar, local-input, and relation-output references.
  • Renames and reshapes the generated Rust surface around RelationOutputPlan, relation_outputs, and STAGE*_RELATION_OUTPUTS instead of older output-claim naming.
  • Names terminal expected_output scalar values separately from input sumcheck claim_value data.
  • Removes stale expected_stage67_* helper paths from generated/runtime code and gates handwritten_expected_output_functions: 0.
  • Starts lowering concrete relation-output values, including Stage 5/6 read-RAF paths, Stage 6 bytecode output terms, hamming and increment output scalars, and shared relation-local inputs.

6. Eval-family cutover (S4)

  • Adds explicit piop.sumcheck_eval_family, compute.sumcheck_eval_family, and cpu.sumcheck_eval_family rows.
  • Parses Stage 5 instruction read-RAF and Stage 6 bytecode read-RAF family membership from typed MLIR/CPU rows rather than oracle prefixes, raw eval ordering, or generated symbol spelling.
  • Emits NamedEvalFamilyPlan rows, seeds them into ValueStore as field-vector values, and reuses those vectors in relation-output plans.
  • Removes the old eval-prefix reconstruction path from generated verifier/runtime execution and gates against reintroducing indexed_evals_by_prefix* APIs.
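
The eval-family cutover replaces "find every eval whose name starts with a prefix" with membership that the compiler writes down explicitly. A minimal sketch follows, assuming a row that lists member positions by index; the fields of `NamedEvalFamilyPlan` and the `gather_family` helper are hypothetical, and `u64` stands in for a field element.

```rust
// Hypothetical row shape; the generated NamedEvalFamilyPlan may carry different fields.

/// One eval family: membership is an explicit index list emitted by the compiler.
#[derive(Clone, Copy, Debug)]
pub struct NamedEvalFamilyPlan {
    /// Diagnostic name only; never parsed to discover membership.
    pub name: &'static str,
    /// Positions of the family's members in the sumcheck eval list.
    pub eval_indices: &'static [usize],
}

/// Gathers the family's evals, in plan order, into one field vector.
pub fn gather_family(plan: &NamedEvalFamilyPlan, evals: &[u64]) -> Result<Vec<u64>, String> {
    plan.eval_indices
        .iter()
        .map(|&i| {
            evals
                .get(i)
                .copied()
                .ok_or_else(|| format!("{}: eval index {} out of range", plan.name, i))
        })
        .collect()
}
```

The resulting vector is what would be seeded into the value store once and then reused by relation-output plans.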

7. Stage 5/6/7 read-RAF and bytecode planning seams

  • Adds stage5_instruction_read_raf_plan.rs and stage6_bytecode_read_raf_plan.rs as explicit compiler-side planning seams.
  • Moves bytecode read-RAF rows and output terms into typed plan constants rather than hand-authoring the generated Rust block directly in the Stage 6 emitter.
  • Keeps the bytecode evaluator quarantined on the Jolt side because the row encoding is protocol-specific.
  • Shares Stage 6/7 token helpers in plan_tokens.rs and replaces kernel ABI match blocks with table-driven ABI checks.

8. Shared polynomial helpers

  • Adds verifier-facing helper coverage in jolt-poly for indexed equality, less-than, identity, and related structured polynomial evaluation paths.
  • Reuses LtPolynomial and indexed equality helpers from the verifier runtime instead of open-coded local formulas.
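
For readers unfamiliar with these structured polynomials, the standard multilinear equality and less-than evaluations can be written out directly. The sketch below is not the jolt-poly API: it uses a toy prime modulus, plain `u64` values, and a most-significant-first variable order, all of which are assumptions; the real `LtPolynomial` may use a different field type and ordering.

```rust
/// Toy prime modulus (2^61 - 1); the real verifier uses its own field type.
const P: u64 = (1u64 << 61) - 1;

fn add(a: u64, b: u64) -> u64 {
    ((a as u128 + b as u128) % P as u128) as u64
}

fn sub(a: u64, b: u64) -> u64 {
    add(a, P - b % P)
}

fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Per-variable equality term: x*y + (1 - x)*(1 - y).
fn eq_term(x: u64, y: u64) -> u64 {
    add(mul(x, y), mul(sub(1, x), sub(1, y)))
}

/// eq(x, y) = product over i of eq_term(x_i, y_i).
pub fn eq_eval(x: &[u64], y: &[u64]) -> u64 {
    assert_eq!(x.len(), y.len());
    x.iter().zip(y).fold(1, |acc, (&a, &b)| mul(acc, eq_term(a, b)))
}

/// LT(x, y) = sum over i of (1 - x_i) * y_i * product over j < i of eq_term(x_j, y_j),
/// with index 0 as the most significant variable (an assumed convention).
pub fn lt_eval(x: &[u64], y: &[u64]) -> u64 {
    assert_eq!(x.len(), y.len());
    let mut prefix_eq = 1; // equality of all more significant positions so far
    let mut acc = 0;
    for (&a, &b) in x.iter().zip(y) {
        acc = add(acc, mul(prefix_eq, mul(sub(1, a), b)));
        prefix_eq = mul(prefix_eq, eq_term(a, b));
    }
    acc
}
```

On Boolean inputs `eq_eval` is 1 exactly when the two bit strings match, and `lt_eval` is 1 exactly when `x` is smaller than `y` as an integer; sharing one such helper avoids each stage re-deriving the formula.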

9. Prover and equivalence adapters

  • Keeps prover/verifier proof serialization aligned while the verifier execution shape changes.
  • Updates jolt-prover stages, especially Stage 6/8 plumbing, to stay aligned with the generated role outputs and opening-flow shape.
  • Updates jolt-equivalence plan adapters, tamper tests, generated-stage adapters, and oracle wiring for typed verifier plans.
  • Enables the Stage 3+ Bolt parity/tamper gate by removing the stale ignore from bolt_stage3_batched_real_muldiv_self_parity.

10. Refactor plan docs

  • Rewrites crates/bolt/GOAL.md around the typed verifier-program objective, audit tiers, non-regression contracts, current S2.75-S5 status, and perf/readability/LOC gates.
  • Adds crates/bolt/VERIFIER_PROGRAM_REFACTOR_PLAN.md as the main S2-S6 verifier-program plan.
  • Adds crates/bolt/PROVER_PROGRAM_REFACTOR_PLAN.md to mirror the longer-term prover direction while keeping verifier work as the priority.

Latest recorded cleanup metrics

Latest recorded verifier_cleanup pass from the worklog after the Stage 2 verifier-plan cutover:

generated_surface_loc: 6055
tier_b_jolt_verifier_core_loc: 696
total_loc: 6751
relation_string_sites: 0
relation_indexed_eval_prefix_sites: 0
handwritten_expected_output_functions: 0

The earlier baseline was roughly 21.5k LOC of generated jolt-verifier, with Stage 6/7 alone at roughly 13.2k LOC. The current total of 6,751 LOC is roughly a third of that baseline and, more importantly, the remaining generated surface is mostly typed declarative data plus thin wrappers.
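
A cleanup gate of this kind can be approximated by a test that measures the generated source and fails on regression. The sketch below is an assumption about the general technique, not the contents of verifier_cleanup.rs: `CleanupMetrics`, `measure`, and `gate` are hypothetical, LOC is counted here as non-blank lines, and the ceiling value is taken from an earlier commit message in this PR and may since have changed.

```rust
/// Ceiling quoted in an earlier commit message on this PR; may be out of date.
pub const GENERATED_VERIFIER_TARGET_LOC: usize = 6_100;

#[derive(Debug, PartialEq, Eq)]
pub struct CleanupMetrics {
    pub loc: usize,
    pub indexed_eval_prefix_sites: usize,
}

/// Counts non-blank lines and occurrences of the retired prefix API name.
pub fn measure(source: &str) -> CleanupMetrics {
    CleanupMetrics {
        loc: source.lines().filter(|l| !l.trim().is_empty()).count(),
        indexed_eval_prefix_sites: source.matches("indexed_evals_by_prefix").count(),
    }
}

/// Fails if the surface grew past the ceiling or a retired API reappeared.
pub fn gate(metrics: &CleanupMetrics) -> Result<(), String> {
    if metrics.loc > GENERATED_VERIFIER_TARGET_LOC {
        return Err(format!(
            "generated surface is {} LOC, ceiling is {}",
            metrics.loc, GENERATED_VERIFIER_TARGET_LOC
        ));
    }
    if metrics.indexed_eval_prefix_sites != 0 {
        return Err(format!(
            "{} indexed-eval prefix site(s) reintroduced",
            metrics.indexed_eval_prefix_sites
        ));
    }
    Ok(())
}
```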

What this PR does not finish

This PR is still not the final verifier compiler architecture. The remaining risks are narrower and called out in VERIFIER_PROGRAM_REFACTOR_PLAN.md / GOAL.md:

  • Stage 2 now uses the shared verifier-plan boundary, but its relation-output formula table is still handwritten Rust planning data. The next completion audit should decide whether that is acceptable for this PR or must be made more declarative before closure.
  • Prover-mode Stage 2 validation remains stage-local for prover-only input-opening and kernel-binding details that are not part of the verifier runtime plan.
  • Eval-family membership is now explicit plan data for the Stage 5/6 consumers here; future family-shaped relations should follow the same typed-row rule instead of rebuilding prefix logic.
  • The bytecode row encoder remains Jolt-specific Tier B logic. S6 can make it typed table data later, but this PR intentionally stops short of generalizing bytecode semantics into the generic runtime.
  • The final S2.75-S5 completion audit still needs to separate true remaining architecture gaps from completion gates: semantic/tamper behavior, host/host+zk e2e, and perf stability.

Verification

Recorded local verification across the final slices includes:

cargo fmt --check
cargo check -p bolt -q
cargo check -p bolt -p jolt-verifier -p jolt-prover -p jolt-equivalence -q
cargo clippy -p bolt -p jolt-verifier -p jolt-prover -p jolt-equivalence -q --all-targets -- -D warnings
cargo nextest run -p bolt --test commitment_ir --cargo-quiet stage2_rust_targets_extract_and_compile stage3_rust_targets_extract_and_compile stage4_rust_targets_extract_and_compile stage5_rust_targets_extract_and_compile stage6_rust_targets_extract_and_compile stage7_rust_targets_extract_and_compile
cargo nextest run -p bolt --lib --cargo-quiet relation_output verifier_plan
cargo nextest run -p bolt --test verifier_cleanup --no-capture
cargo nextest run -p jolt-equivalence --test bolt_commitment --cargo-quiet
cargo clippy -p jolt-equivalence -q --test bolt_commitment -- -D warnings
cargo nextest run -p jolt-core muldiv --cargo-quiet --features host
cargo nextest run -p jolt-core muldiv --cargo-quiet --features host,zk
git diff --check

The latest three-sample 2^20 SHA2-chain perf oracle also passed:

JOLT_BOLT_PERF_SAMPLES=3 cargo nextest run -p jolt-equivalence --test bolt_perf --release --cargo-quiet --run-ignored only --no-capture bolt_sha2_chain_2_20_core_vs_bolt_perf_oracle

Latest sampled summary:

verify_ms ratio: 1.089x
prove_ms mean ratio: 1.168x, 95% CI [0.876x, 1.460x]
proof_bytes ratio: 1.363x
peak_rss_mb ratio: 1.339x

The CI pipeline, not these local runs, should still be treated as the source of truth for the full matrix.

Review notes

  • This is a full-cutover refactor, not a compatibility layer: old generated helper paths were deleted where their typed replacements landed.
  • bolt-verifier-runtime should stay boring and generic. New protocol facts should be represented as MLIR/codegen planning output, not runtime helpers that infer Jolt meaning from names.
  • Some generated declarative data grew while handwritten Rust shrank. That is intentional only when the generated data is typed, auditable, and compiler-owned.
  • The highest-value next step is the completion audit for S2.75-S5, especially Stage 2's handwritten formula table, semantic/tamper coverage, host/zk e2e, and perf margin.

AI-authored PR description update posted by Cursor assistant (model: GPT-5) on behalf of the user (Quang Dao) with approval.

github-actions (Contributor) commented

Warning

This PR has more than 500 changed lines and does not include a spec.

Large features and architectural changes benefit from a spec-driven workflow.
See CONTRIBUTING.md for details on how to create a spec.

If this PR is a bug fix, refactor, or doesn't warrant a spec, feel free to ignore this message.

github-actions bot added the no-spec (PR has no spec file) label on May 14, 2026
quangvdao changed the title from "refactor(bolt): type the Jolt verifier surface and split runtime into audit tiers" to "refactor(bolt): type verifier surface and plan program model" on May 14, 2026
quangvdao and others added 26 commits May 14, 2026 21:47
Split the monolithic ~1.9k-LOC stages/common.rs into two files along an
explicit audit boundary:

  Tier A (Bolt verifier runtime):    stages/common.rs           1,265 LOC
    generic, protocol-agnostic plan structs, ValueStore,
    sumcheck driver loop, opening-equality interpreter,
    transcript helpers

  Tier B (audited Jolt verifier core): stages/jolt_relations.rs   638 LOC
    hand-written Jolt-specific verifier math: Stage 6/7
    evaluators, normalize_*_point, bytecode_gamma_powers,
    Stage67Bytecode* glue, polynomial-evaluation primitives

Tier B is the audit surface for Jolt-specific relation math; growth here
is now reviewed as a protocol-math decision rather than emitter LOC creep.
Tier A holds the generic Bolt scaffolding and is the long-term shrink
target as more helpers move into typed plan data driven from MLIR.

Wired through the artifact pipeline (verifier_runtime_modules now lists
both `common` and `jolt_relations`) and updated stage4/5/6 emitters to
split their import sites between super::common::{...} and
super::jolt_relations::{...}.

Per-tier hard LOC ceilings are enforced by verifier_cleanup.rs:
  BOLT_RUNTIME_BASELINE_LOC_CEILING       = 1_400  (current 1,265)
  JOLT_VERIFIER_CORE_BASELINE_LOC_CEILING =   700  (current 638)
  GENERATED_VERIFIER_TARGET_LOC           = 6_100  (current 6,002,
    bumped from 6,000 to absorb the import-split overhead)

GOAL.md gains an "Audit Tiers" section describing the A/B/C split and
records the post-split per-tier baseline. The pre-split "shared verifier
runtime" framing is retired.

cargo nextest run -p bolt --test verifier_cleanup --test commitment_ir
cargo clippy -p bolt --all-targets -- -D warnings
cargo fmt --check
all green.

Co-authored-by: Cursor <cursoragent@cursor.com>
Add `crates/bolt/AUDIT_TIER_FOLLOWUPS.md` as the implementation plan for
the post-S1 verifier-cleanup track. Five sequenced slices:

  S2: Promote Tier A to a real `bolt-verifier-runtime` crate. Stops
      emitting it as a per-protocol template. Largest leverage, smallest
      semantics change. Removes ~1,265 LOC of "generated" code that was
      never per-protocol.

  S3: New compute-dialect ops for polynomial primitives
      (`poly_mle`, `poly_eq_indexed`, `poly_identity_eval`,
      `poly_lt_eval`, `poly_operand_eval`), point reorderings
      (`point_reverse`, `point_split`, `point_prefix`, `point_suffix`),
      and gamma-power vectors (`field_pow_vector`). Removes ~140 LOC of
      hand-written field-math from Tier B.

  S4: Typed indexed-eval addressing via `compute::sumcheck_eval_family`.
      Eliminates the last big string-dispatch site
      (`indexed_evals_by_prefix*`). No proof-format change required.

  S5: Lift `expected_stage67_*` evaluators to `compute::relation` typed
      plan data. Acknowledged Tier C growth (~200 LOC) traded for Tier B
      shrink (~200 LOC) on declarative-data-vs-hand-written-Rust grounds.
      Highest-risk slice; explicit pause before commit.

  S6: Bytecode-row encoding as typed plan data. Marked optional; recommend
      skip until a second protocol with the same shape exists.

Each slice has concrete plumbing notes, blocker analysis, acceptance
criteria, and rollback considerations. Cross-cutting sections cover
coordination with Markos' equivalence track, the trust-boundary
trajectory, MLIR dialect growth, performance, and `zk` feature
compatibility.

Adds a one-line cross-reference from `GOAL.md` so the plan is
discoverable from the canonical goal doc.

Co-authored-by: Cursor <cursoragent@cursor.com>
Rename the audit-tier follow-up plan to reflect the verifier-program refactor scope. Add non-regression contracts for readability, LOC, performance, semantic/tamper behavior, and fallback-free cutover.

Posted by Cursor assistant (model: GPT-5) on behalf of the user (Quang Dao) with approval.
Move the generated verifier common runtime into a standalone bolt-verifier-runtime workspace crate and depend on it from the generated verifier surface.

Genericize relation-bearing runtime plans over ProtocolRelation, move JoltRelationKind into the Jolt verifier layer, and make Stage 8 own its temporary source-stage enum.

Delete the generated stages/common.rs module and old verifier_common template, then update emitters, goldens, equivalence adapters, and cleanup gates for the full cutover.
Move Stage 3 verifier output-claim formulas from emitted helper functions into runtime-interpreted typed output-claim plans. Preserve proof serialization and update equivalence adapters plus generated artifacts.
Add first-class sumcheck output value and claim ops across Bolt IR, schema validation, Stage 3 lowering, kernel resolution, and CPU lowering. Stage 3 now emits protocol-owned output-claim plans, and the Rust emitter reads those ops instead of building verifier formulas by hand.

Replace clone-based runtime output-claim evaluation with a scratch scalar overlay, and keep verifier-only output formula closures out of prover generated artifacts.
Add a reusable compiler-side field formula builder for protocol-owned verifier value plans, and use it to express Stage 3 output-claim formulas as declarative formula data while preserving emitted MLIR symbols and operation order.

Replace the output-claim scratch overlay with a named map-backed scratch store so verifier runtime evaluation no longer depends on cloned value stores as formula plans grow.
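
The idea of a map-backed scratch store layered over a shared value store, instead of cloning the store per formula, can be sketched as below. `ScratchStore` and its methods are hypothetical, values are keyed by name only for brevity, and `u64` stands in for a field element.

```rust
use std::collections::HashMap;

/// Hypothetical scratch layer: reads fall through to the borrowed base store,
/// writes land only in the local map, so the base is never cloned or mutated.
pub struct ScratchStore<'a> {
    base: &'a HashMap<String, u64>,
    local: HashMap<String, u64>,
}

impl<'a> ScratchStore<'a> {
    pub fn new(base: &'a HashMap<String, u64>) -> Self {
        ScratchStore { base, local: HashMap::new() }
    }

    /// Stores an intermediate value, shadowing any base entry of the same name.
    pub fn set(&mut self, name: &str, value: u64) {
        self.local.insert(name.to_string(), value);
    }

    /// Looks in the scratch layer first, then in the base store.
    pub fn get(&self, name: &str) -> Option<u64> {
        match self.local.get(name) {
            Some(v) => Some(*v),
            None => self.base.get(name).copied(),
        }
    }
}
```

The cost of evaluating one formula then scales with the number of intermediates it writes rather than with the size of the whole store.
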
Rename the field formula helper types so they describe formula steps and operators rather than binary arity. This avoids implying binary-field semantics in the Stage 3 formula authoring path.
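
A formula expressed as declarative steps and operators, rather than emitted helper functions, might be interpreted along these lines. The types and the slot-numbering convention are hypothetical, and arithmetic is done modulo a caller-supplied toy modulus instead of the real field.

```rust
// Hypothetical formula vocabulary; not the builder types in the Bolt compiler.

#[derive(Clone, Copy, Debug)]
pub enum FormulaOp {
    Add,
    Sub,
    Mul,
}

/// One step reads two earlier slots and appends its result as a new slot.
#[derive(Clone, Copy, Debug)]
pub struct FormulaStep {
    pub op: FormulaOp,
    pub lhs: usize,
    pub rhs: usize,
}

/// Slots 0..inputs.len() hold the inputs; step i writes slot inputs.len() + i.
/// Returns the last slot, or None on an undefined slot, a zero modulus,
/// or an empty formula with no inputs.
pub fn eval_formula(inputs: &[u64], steps: &[FormulaStep], modulus: u64) -> Option<u64> {
    if modulus == 0 {
        return None;
    }
    let m = modulus as u128;
    let mut slots: Vec<u128> = inputs.iter().map(|&v| v as u128 % m).collect();
    for step in steps {
        let a = *slots.get(step.lhs)?;
        let b = *slots.get(step.rhs)?;
        let out = match step.op {
            FormulaOp::Add => (a + b) % m,
            FormulaOp::Sub => (a + m - b) % m,
            FormulaOp::Mul => (a * b) % m,
        };
        slots.push(out);
    }
    slots.last().map(|&v| v as u64)
}
```
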
Introduce typed sumcheck output point plans with explicit segment, length, and order semantics, plus LT output value evaluation for Stage 4 RAM val checks.

Move Stage 4 register read/write and RAM val expected output claims into protocol-owned verifier plan data, cut the new shape through IRDL, schema, Rust emission, runtime, and equivalence adapters, and keep proof serialization unchanged.
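
"Explicit segment, length, and order semantics" for an output point can be illustrated with a small resolver over the sumcheck challenge vector. The names below are hypothetical and the real plan type likely carries more information; `u64` stands in for a field element.

```rust
// Hypothetical point-plan shape, for illustration only.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PointOrder {
    /// Keep challenges in the order they were bound.
    AsBound,
    /// Reverse the segment.
    Reversed,
}

#[derive(Clone, Copy, Debug)]
pub struct PointSegmentPlan {
    pub start: usize,
    pub len: usize,
    pub order: PointOrder,
}

/// Extracts the planned segment and applies the planned order.
pub fn resolve_point(plan: &PointSegmentPlan, challenges: &[u64]) -> Result<Vec<u64>, String> {
    let end = plan
        .start
        .checked_add(plan.len)
        .ok_or_else(|| "segment end overflows usize".to_string())?;
    let segment = challenges.get(plan.start..end).ok_or_else(|| {
        format!("segment {}..{} exceeds {} challenges", plan.start, end, challenges.len())
    })?;
    let mut point = segment.to_vec();
    if plan.order == PointOrder::Reversed {
        point.reverse();
    }
    Ok(point)
}
```

Making the order a field of the plan means a mismatch surfaces as a plan difference that can be reviewed, not as an unexplained reversal inside stage code.
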
Emit rustfmt-compatible verifier exports and remove the unused serde dependency from the generated verifier crate so CI fmt and machete agree with the generated artifact source of truth.
Rename sumcheck output values into structured polynomial eval plans across IR, schema, emitters, runtime, generated artifacts, and equivalence adapters. Treat LT as the same verifier-side structured polynomial vocabulary as Eq and EqPlusOne, with explicit x/y point staging and polynomial_evals output claims.
quangvdao added 28 commits May 16, 2026 07:59
Route relation-output local scalar lists through VerifierScalarValuePlan rows while preserving the generated verifier runtime string-slice surface. Update Stage 2, Stage 5, Stage 6, validation, and generated-plan adapters to consume the typed relation-output contract.
Keep relation-output local scalar rows behind constructor and accessor methods, and narrow scalar-value row types back to crate-local visibility.
Represent resolved relation-output expected values as typed scalar references and route emission, adapters, and tests through the accessor contract.
Add a typed scalar value set for verifier-stage plans and validate relation-output refs and local scalar plans against it instead of a flattened scalar source set.
Represent resolved relation-output eval, product, and function family scalar operands as VerifierScalarValueRef values and validate them through the scalar value set.
Represent relation-output product-family eval families with typed field-vector refs and validate them against indexed eval-family plan rows.
Remove the derived field-vector source-set view and validate verifier field-vector scalar expressions directly against typed field-vector values.
Represent verifier scalar-expression operands as typed scalar, point, or field-vector refs and validate verifier stages against those typed plan rows.
Represent verifier field-expression operands as typed scalar refs and validate verifier stages against typed field-expression plan rows.
Represent verifier point values as typed plan data and validate verifier scalar-expression point operands against point value refs.
Validate relation-output structured polynomial points through verifier point value refs and remove the unused point source adapter.
Classify Stage 2 verifier point symbols with VerifierPointValueSet and delete stale point source conflict helpers.
Validate Stage 2 verifier scalar expressions through typed verifier scalar and point value refs, and remove stale scalar source conflict helpers.
Carry relation-output plan rows with JoltVerifierRelationKind instead of raw strings and move repeated Stage 3-7 relation-output validation onto VerifierStagePlan.

Keep relation symbols as diagnostics/adapter data only; generated verifier execution continues to consume typed relation plans.
Move verifier-mode Stage 3-7 sumcheck batch and driver consistency checks onto VerifierStagePlan.

Stage-local verifier validation now keeps only role-shape checks, while prover kernel validation remains stage-owned.
Move verifier-mode Stage 3-7 opening-flow validation onto VerifierStagePlan.

Prover input-opening checks stay stage-local because verifier runtime claim rows do not carry those fields.
Record current SHA2-chain perf oracle results for the verifier-program refactor completion gate.

The 2^20 oracle is green on current tip but remains a fragile-margin risk.
quangvdao changed the title from "refactor(bolt): introduce typed verifier program pipeline" to "refactor(bolt): compile Jolt verifier through typed plans" on May 17, 2026
Remove Jolt-specific point-order variants from the generic verifier runtime and route relation-local normalization through Jolt stage code. Make sumcheck eval observation require explicit eval names, cut Stage 2 verifier expression emission over to VerifierStagePlan helpers, and add cleanup gates for the runtime boundary and docs.