Efficient bytecode commitment with dory precommitted geometry by RadNi · Pull Request #1344 · a16z/jolt

RadNi · 2026-03-17T05:15:59Z

Summary

This PR adds committed-program support for bytecode and program-image openings throughout the zkVM flow. This allows prover to efficiently prove bytecode / program-image polynomials openings instead of verifier directly evaluating them which significantly reduces the recursion cycles.

The main changes are:

add committed preprocessing and commitment plumbing for bytecode chunks and the program image
add bytecode and program-image claim reductions so these openings can be reduced into the shared Stage 8 Dory opening flow
introduce precommitted Dory geometry and total-var sizing across prover and verifier
split stage 6 into 6a and 6b to separate address and cycle phases
update SDK preprocessing helpers, examples, and docs for committed-bytecode flows
add bytecode committed variant of preprocessing to sdk and updated all the examples to support it

0xAndoroid · 2026-03-24T20:18:01Z

+            if debug_bytecode_reduction_enabled() {
+                tracing::info!(
+                    "BytecodeClaimReduction cache cycle len={} bound_value={} bound_eq={} scale={} cycle_claim={} synced_cycle_claim={}",
+                    self.value_poly.len(),
+                    self.value_poly.get_bound_coeff(0),
+                    self.eq_poly.get_bound_coeff(0),
+                    self.scale,
+                    c_mid,
+                    synced_cycle_claim,
+                );
+            }


Seems like there's still a lot of debugging leftovers in the PR

0xAndoroid · 2026-03-24T20:19:19Z

+    let bytecode_chunk = std::env::args()
+        .skip_while(|arg| arg != "--committed-bytecode")
+        .nth(1)
+        .map(|arg| arg.parse().unwrap());
+    #[cfg(any(feature = "nostd", feature = "std"))]


I think we shouldn't include this in every example. I'd prefer to have one new example with committed bytecode in fib.

0xAndoroid · 2026-03-24T20:21:19Z

Also, it's worth noting about #1260 effort that's being done right now. Given that #1260 is a rewrite of the codebase, rather than refactor, it might be better to adapt these changes into there once 1260 is merged.

github-actions · 2026-04-08T22:53:28Z

Warning

This PR has more than 500 changed lines and does not include a spec.

Large features and architectural changes benefit from a spec-driven workflow.
See CONTRIBUTING.md for details on how to create a spec.

If this PR is a bug fix, refactor, or doesn't warrant a spec, feel free to ignore this message.

Fill in the spec PR field for a16z#1490 and link related PR references. Also update the status to in review and name bytecode-commitment PR a16z#1344 precisely in non-goals and future integration notes.

…1490) * docs: add bytecode expansion crate spec * docs: clarify bytecode parity cutover testing * docs: add formal verification readiness guidance * docs(spec): refine preprocessing scope Lock the crate split around jolt-program-preprocess as materialized program preprocessing. Clarify that PCS setup, commitment derivation, prover/verifier preprocessing keys, and bytecode commitment integration remain out of scope for this PR. Capture refactor/crates and bytecode commitment branches as future integration context. * docs(spec): add PR references Fill in the spec PR field for #1490 and link related PR references. Also update the status to in review and name bytecode-commitment PR #1344 precisely in non-goals and future integration notes. * docs(spec): polish implementation guidance Lock the final JoltProgramPreprocessing name throughout the spec. Clarify that InstrAssembler should borrow the expansion allocator mutably and document the dependency-light fallback for compute_min_ram_K. * docs(spec): reuse jolt-riscv for instructions Refactor the crate split spec to strengthen the existing jolt-riscv crate instead of introducing jolt-riscv-instructions. Update dependency direction, workspace layout, file lists, verification commands, and implementation checklist so the PR adds three new crates while moving static instruction data below tracer into jolt-riscv. * docs(spec): fuse program crate design * docs(spec): lock implementation constraints * docs(spec): clarify instruction boundary * docs(spec): move program row to jolt-program * docs(spec): use normalized instruction row * refactor: introduce normalized instruction row * docs(spec): add execution backend boundary * refactor(trace): isolate tracer backend boundary Add the jolt-program execution contract and move tracer-specific adapters into tracer. Keep jolt-trace tracer-free, with default backend selection specified for the SDK layer. * feat(program): add bytecode expansion crate Move RV64 program image decoding, bytecode expansion, and program preprocessing into jolt-program/jolt-riscv boundaries. Wire jolt-core through normalized preprocessing rows and keep tracer as the execution backend adapter. Add InlineExpansionProvider as a deliberate pivot from the approved spec so registered Jolt inline source opcodes can still flow through jolt-program expansion without making jolt-program depend on tracer or inline advice internals. * refactor(program): expose host execution backend hook Add a backend-neutral tracing hook on jolt-core host Program by building a jolt_program::ExecutableProgram and executing it through ExecutionBackend. Keep existing tracer-returning host conveniences intact while documenting the SDK default-backend cutover as the next integration step. * feat(program): finish expansion boundary spec Complete the remaining program-boundary work by routing generated SDK trace APIs through ExecutionBackend with TracerBackend selected in jolt-sdk, making analysis summaries use neutral trace rows, and returning final memory through the tracer backend adapter. Add jolt-eval bytecode expansion invariants and a decode_expand benchmark, then mark the spec acceptance checklist and implementation checklist complete with the documented prover-witness boundary caveat. * fix bytecode inline expansion alignment * refactor bytecode expansion modules * refactor(tracer): remove legacy expansion bodies Route tracer-side expanded execution through the jolt-program bytecode expansion bridge and remove the stale concrete per-instruction expansion implementations. Keep custom inline expansion as the tracer-owned provider path, and fill division advice by scanning generated virtual advice rows instead of relying on old concrete indexes. * fix(ci): update expansion call sites * refactor: trim bytecode expansion cleanup * fix(ci): avoid tracer test unused import * docs(expand): add extraction-native spec Document a follow-up production rewrite of jolt-program bytecode expansion for Hax/Aeneas friendliness, anchored to PR #1490 at a3448e6. The spec compares the current recursive assembler shape against an owned-state, shallow-lowering design with explicit work-stack recursion, bitset allocation, bounded buffers, and metadata stamping. * docs(expand): refine extraction grammar spec * docs(expand): align spec with MLIR lowering Frame bytecode expansion as a compiler lowering pipeline with explicit source, target, legality, and resource materialization phases. Document the near-term Rust-first boundary while keeping the grammar MLIR-ready for future dialect work. * docs(expand): define compiler-native pass scope * fix(expand): reject zero CSR rows Reject CSRRW/CSRRS rows with CSR address 0 instead of emitting a default normalized row that can be skipped by bytecode PC mapping. Remove the circular tracer expansion bridge test now that tracer delegates to jolt-program expansion, and update both bytecode expansion specs to document the CSR-zero correction and the non-circular testing strategy. * docs(bytecode): refine expansion follow-up spec * fix(program): address expansion review nits Tighten bytecode PC mapper validation, make the expanded-bytecode flag test assert parity, and apply small API cleanups from review. Also ignore Lake build artifacts when embedding zklean templates so workspace clippy is not affected by local .lake directories. * refactor(program): split expansion modules by instruction Move jolt-program expansion implementations out of broad category files and into per-instruction modules, keeping shared lowering helpers in local shared modules. No expansion behavior changes intended. * docs(program): clarify executable image stages * refactor(program): rename built program artifacts * refactor(tracer): adapt RAM access at boundary * refactor(riscv): fold trace traits into catalog * refactor(tracer): use riscv side effect metadata * refactor(program): address latest expansion review comments

Document the committed bytecode and program-image proving-system changes for PR a16z#1344, including preprocessing, precommitted scheduling, Stage 6a/6b, Stage 7, Stage 8, and BlindFold constraints. Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions · 2026-05-19T12:43:37Z

Claude code review session started: https://claude.ai/code/session_019JhWPTLBkqwnAEo3foHP5P

moodlezoup

A few items from the review of the new precommitted machinery — mostly tech-debt cleanups, plus one defense-in-depth gap on preprocessing deserialization. No soundness issues found.

One follow-up that didn't fit inline: per the spec, the module header in transpilable_verifier.rs (lines 25–40, unchanged in this PR) still describes "Stages 1-6" / "Stage 6: CycleVariables phase", but verify_stage6a() and verify_stage6b() are the actual entry points now. The spec explicitly flags this: "Module comments in transpilable_verifier.rs should also be updated to describe Stage 6a and Stage 6b instead of the old monolithic Stage 6."

Generated by Claude Code

moodlezoup · 2026-05-19T13:06:33Z

+    fn compute_message_unscaled(&self, previous_claim_unscaled: F) -> UniPoly<F> {
+        let half = self.value_poly.len() / 2;
+        let evals: [F; TWO_PHASE_DEGREE_BOUND] = (0..half)
+            .into_par_iter()
+            .map(|j| {
+                let value_evals = self
+                    .value_poly
+                    .sumcheck_evals_array::<TWO_PHASE_DEGREE_BOUND>(j, BindingOrder::LowToHigh);
+                let eq_evals = self
+                    .eq_poly
+                    .sumcheck_evals_array::<TWO_PHASE_DEGREE_BOUND>(j, BindingOrder::LowToHigh);
+                let mut out = [F::zero(); TWO_PHASE_DEGREE_BOUND];
+                for i in 0..TWO_PHASE_DEGREE_BOUND {
+                    out[i] = value_evals[i] * eq_evals[i];
+                }
+                out
+            })
+            .reduce(
+                || [F::zero(); TWO_PHASE_DEGREE_BOUND],
+                |mut acc, arr| {
+                    acc.iter_mut().zip(arr.iter()).for_each(|(a, b)| *a += *b);
+                    acc
+                },
+            );
+        UniPoly::from_evals_and_hint(previous_claim_unscaled, &evals)
+    }
+
+    fn cycle_intermediate_claim(&self) -> F {
+        let len = self.value_poly.len();
+        debug_assert_eq!(len, self.eq_poly.len());
+        let mut sum = F::zero();
+        for i in 0..len {
+            sum += self.value_poly.get_bound_coeff(i) * self.eq_poly.get_bound_coeff(i);
+        }
+        sum * self.scale
+    }
+
+    fn final_claim_if_ready(&self) -> Option<F> {
+        if self.value_poly.len() == 1 {
+            Some(self.value_poly.get_bound_coeff(0))
+        } else {
+            None
+        }
+    }
+}
+
+impl<F: JoltField, T: Transcript> SumcheckInstanceProver<F, T> for BytecodeClaimReductionProver<F> {
+    fn get_params(&self) -> &dyn SumcheckInstanceParams<F> {
+        &self.params
+    }
+
+    fn round_offset(&self, max_num_rounds: usize) -> usize {
+        self.params.round_offset(max_num_rounds)
+    }
+
+    fn compute_message(&mut self, round: usize, previous_claim: F) -> UniPoly<F> {
+        let is_active_round = if self.params.is_cycle_phase() {
+            self.params.is_cycle_phase_round(round)
+        } else {
+            self.params.is_address_phase_round(round)
+        };
+        if !is_active_round {
+            return UniPoly::from_coeff(vec![previous_claim * F::from_u64(2).inverse().unwrap()]);
+        }
+
+        let trailing_cap = if self.params.is_cycle_phase() {
+            self.params.cycle_alignment_rounds()
+        } else {
+            self.params.address_alignment_rounds()
+        };
+        let num_trailing_variables =
+            trailing_cap.saturating_sub(self.params.num_rounds_for_current_phase());
+        let scaling_factor = self.scale * F::one().mul_pow_2(num_trailing_variables);
+        let prev_unscaled = previous_claim * scaling_factor.inverse().unwrap();
+        let poly_unscaled = self.compute_message_unscaled(prev_unscaled);
+        poly_unscaled * scaling_factor
+    }


compute_message_unscaled (341–366), cycle_intermediate_claim (368–376), final_claim_if_ready (378–384), compute_message (396–417), and ingest_challenge below all duplicate the same-named methods on PrecommittedProver in precommitted.rs:572–657. The only real differences are the bind_aux_polys(r_j) call in ingest_challenge and using num_rounds_for_current_phase() instead of num_rounds() in compute_message.

program_image.rs and advice.rs both compose PrecommittedProver and avoid this — bytecode should too. Adding an auxiliary-bind hook (e.g. fn bind_aux(&mut self, r_j) defaulting to no-op) on PrecommittedParams would let BytecodeClaimReductionProver become core: PrecommittedProver<F, _> + chunk_value_polys, eliminating ~75 lines that otherwise have to be kept in sync by hand.

Generated by Claude Code

moodlezoup · 2026-05-19T13:06:37Z

+    pub fn is_address_phase_active_round(&self, round: usize) -> bool {
+        self.address_phase_rounds.contains(&round)
+    }
+
+    #[inline]
+    pub fn is_address_phase_round(&self, round: usize) -> bool {
+        self.address_phase_rounds.contains(&round)
+    }


is_address_phase_active_round and is_address_phase_round have identical bodies. Pick one and delete the other — having both is a footgun for future readers who'll wonder what subtle distinction the naming implies.

Suggested change

pub fn is_address_phase_active_round(&self, round: usize) -> bool {

self.address_phase_rounds.contains(&round)

}

#[inline]

pub fn is_address_phase_round(&self, round: usize) -> bool {

self.address_phase_rounds.contains(&round)

}

#[inline]

pub fn is_address_phase_round(&self, round: usize) -> bool {

self.address_phase_rounds.contains(&round)

}

Generated by Claude Code

moodlezoup · 2026-05-19T13:06:40Z

+    pub fn round_offset(&self, is_cycle_phase: bool, max_num_rounds: usize) -> usize {
+        let _ = (is_cycle_phase, max_num_rounds);
+        0
+    }


round_offset ignores both arguments and unconditionally returns 0. It then flows through BytecodeClaimReductionParams::round_offset → BytecodeClaimReductionProver::round_offset (in the SumcheckInstanceProver impl), all returning 0. Either drop the method and have callers return 0 directly, or finish the abstraction. The current shape reads like an incomplete refactor.

Generated by Claude Code

moodlezoup · 2026-05-19T13:06:44Z

        let memory_layout = MemoryLayout::deserialize_with_mode(&mut reader, compress, validate)?;
        let max_padded_trace_length =
            usize::deserialize_with_mode(&mut reader, compress, validate)?;
+        let bytecode_chunk_count = usize::deserialize_with_mode(&mut reader, compress, validate)?;


bytecode_chunk_count is read straight off the wire here with no validation, even though new_committed (line 2252) enforces nonzero / power-of-two / ≤256 / divides bytecode_len. The digest binds the value, so a tampered preprocessing fails Fiat-Shamir cleanly — but in the meantime a malformed serialized file panics deep inside chunk arithmetic instead of erroring at the read site. Calling validate_committed_bytecode_chunk_count(bytecode_chunk_count) (and the divisibility check) here, or in Valid::check, is cheap defense-in-depth.

Generated by Claude Code

moodlezoup · 2026-05-19T13:06:52Z

+        ram_K: usize,
+        scheduling_reference: PrecommittedSchedulingReference,
+        accumulator: &dyn OpeningAccumulator<F>,
+        _transcript: &mut impl Transcript,


_transcript: &mut impl Transcript is unused — callers still have to thread a transcript in only to have it discarded. Either use it (e.g. to absorb the start address or padded length) or drop it from the signature; an underscore-prefixed &mut parameter in a public API is misleading.

Generated by Claude Code

moodlezoup · 2026-05-19T13:50:22Z

Now that we have split this sumcheck into address-phase and cycle-phase instances, we should get rid of the old monolithic code entirely

moodlezoup · 2026-05-19T13:52:45Z

Same or the Booleanity sumcheck -- we should fully cutover to the split instances