Efficient bytecode commitment with dory precommitted geometry#1344
Efficient bytecode commitment with dory precommitted geometry#1344RadNi wants to merge 57 commits into
Conversation
b732431 to
a0ca612
Compare
| if debug_bytecode_reduction_enabled() { | ||
| tracing::info!( | ||
| "BytecodeClaimReduction cache cycle len={} bound_value={} bound_eq={} scale={} cycle_claim={} synced_cycle_claim={}", | ||
| self.value_poly.len(), | ||
| self.value_poly.get_bound_coeff(0), | ||
| self.eq_poly.get_bound_coeff(0), | ||
| self.scale, | ||
| c_mid, | ||
| synced_cycle_claim, | ||
| ); | ||
| } |
There was a problem hiding this comment.
Seems like there's still a lot of debugging leftovers in the PR
| let bytecode_chunk = std::env::args() | ||
| .skip_while(|arg| arg != "--committed-bytecode") | ||
| .nth(1) | ||
| .map(|arg| arg.parse().unwrap()); | ||
| #[cfg(any(feature = "nostd", feature = "std"))] |
There was a problem hiding this comment.
I think we shouldn't include this in every example. I'd prefer to have one new example with committed bytecode in fib.
|
Warning This PR has more than 500 changed lines and does not include a spec. Large features and architectural changes benefit from a spec-driven workflow. If this PR is a bug fix, refactor, or doesn't warrant a spec, feel free to ignore this message. |
…1490) * docs: add bytecode expansion crate spec * docs: clarify bytecode parity cutover testing * docs: add formal verification readiness guidance * docs(spec): refine preprocessing scope Lock the crate split around jolt-program-preprocess as materialized program preprocessing. Clarify that PCS setup, commitment derivation, prover/verifier preprocessing keys, and bytecode commitment integration remain out of scope for this PR. Capture refactor/crates and bytecode commitment branches as future integration context. * docs(spec): add PR references Fill in the spec PR field for #1490 and link related PR references. Also update the status to in review and name bytecode-commitment PR #1344 precisely in non-goals and future integration notes. * docs(spec): polish implementation guidance Lock the final JoltProgramPreprocessing name throughout the spec. Clarify that InstrAssembler should borrow the expansion allocator mutably and document the dependency-light fallback for compute_min_ram_K. * docs(spec): reuse jolt-riscv for instructions Refactor the crate split spec to strengthen the existing jolt-riscv crate instead of introducing jolt-riscv-instructions. Update dependency direction, workspace layout, file lists, verification commands, and implementation checklist so the PR adds three new crates while moving static instruction data below tracer into jolt-riscv. * docs(spec): fuse program crate design * docs(spec): lock implementation constraints * docs(spec): clarify instruction boundary * docs(spec): move program row to jolt-program * docs(spec): use normalized instruction row * refactor: introduce normalized instruction row * docs(spec): add execution backend boundary * refactor(trace): isolate tracer backend boundary Add the jolt-program execution contract and move tracer-specific adapters into tracer. Keep jolt-trace tracer-free, with default backend selection specified for the SDK layer. * feat(program): add bytecode expansion crate Move RV64 program image decoding, bytecode expansion, and program preprocessing into jolt-program/jolt-riscv boundaries. Wire jolt-core through normalized preprocessing rows and keep tracer as the execution backend adapter. Add InlineExpansionProvider as a deliberate pivot from the approved spec so registered Jolt inline source opcodes can still flow through jolt-program expansion without making jolt-program depend on tracer or inline advice internals. * refactor(program): expose host execution backend hook Add a backend-neutral tracing hook on jolt-core host Program by building a jolt_program::ExecutableProgram and executing it through ExecutionBackend. Keep existing tracer-returning host conveniences intact while documenting the SDK default-backend cutover as the next integration step. * feat(program): finish expansion boundary spec Complete the remaining program-boundary work by routing generated SDK trace APIs through ExecutionBackend with TracerBackend selected in jolt-sdk, making analysis summaries use neutral trace rows, and returning final memory through the tracer backend adapter. Add jolt-eval bytecode expansion invariants and a decode_expand benchmark, then mark the spec acceptance checklist and implementation checklist complete with the documented prover-witness boundary caveat. * fix bytecode inline expansion alignment * refactor bytecode expansion modules * refactor(tracer): remove legacy expansion bodies Route tracer-side expanded execution through the jolt-program bytecode expansion bridge and remove the stale concrete per-instruction expansion implementations. Keep custom inline expansion as the tracer-owned provider path, and fill division advice by scanning generated virtual advice rows instead of relying on old concrete indexes. * fix(ci): update expansion call sites * refactor: trim bytecode expansion cleanup * fix(ci): avoid tracer test unused import * docs(expand): add extraction-native spec Document a follow-up production rewrite of jolt-program bytecode expansion for Hax/Aeneas friendliness, anchored to PR #1490 at a3448e6. The spec compares the current recursive assembler shape against an owned-state, shallow-lowering design with explicit work-stack recursion, bitset allocation, bounded buffers, and metadata stamping. * docs(expand): refine extraction grammar spec * docs(expand): align spec with MLIR lowering Frame bytecode expansion as a compiler lowering pipeline with explicit source, target, legality, and resource materialization phases. Document the near-term Rust-first boundary while keeping the grammar MLIR-ready for future dialect work. * docs(expand): define compiler-native pass scope * fix(expand): reject zero CSR rows Reject CSRRW/CSRRS rows with CSR address 0 instead of emitting a default normalized row that can be skipped by bytecode PC mapping. Remove the circular tracer expansion bridge test now that tracer delegates to jolt-program expansion, and update both bytecode expansion specs to document the CSR-zero correction and the non-circular testing strategy. * docs(bytecode): refine expansion follow-up spec * fix(program): address expansion review nits Tighten bytecode PC mapper validation, make the expanded-bytecode flag test assert parity, and apply small API cleanups from review. Also ignore Lake build artifacts when embedding zklean templates so workspace clippy is not affected by local .lake directories. * refactor(program): split expansion modules by instruction Move jolt-program expansion implementations out of broad category files and into per-instruction modules, keeping shared lowering helpers in local shared modules. No expansion behavior changes intended. * docs(program): clarify executable image stages * refactor(program): rename built program artifacts * refactor(tracer): adapt RAM access at boundary * refactor(riscv): fold trace traits into catalog * refactor(tracer): use riscv side effect metadata * refactor(program): address latest expansion review comments
Document the committed bytecode and program-image proving-system changes for PR a16z#1344, including preprocessing, precommitted scheduling, Stage 6a/6b, Stage 7, Stage 8, and BlindFold constraints. Co-authored-by: Cursor <cursoragent@cursor.com>
|
Claude code review session started: https://claude.ai/code/session_019JhWPTLBkqwnAEo3foHP5P |
moodlezoup
left a comment
There was a problem hiding this comment.
A few items from the review of the new precommitted machinery — mostly tech-debt cleanups, plus one defense-in-depth gap on preprocessing deserialization. No soundness issues found.
One follow-up that didn't fit inline: per the spec, the module header in transpilable_verifier.rs (lines 25–40, unchanged in this PR) still describes "Stages 1-6" / "Stage 6: CycleVariables phase", but verify_stage6a() and verify_stage6b() are the actual entry points now. The spec explicitly flags this: "Module comments in transpilable_verifier.rs should also be updated to describe Stage 6a and Stage 6b instead of the old monolithic Stage 6."
Generated by Claude Code
| fn compute_message_unscaled(&self, previous_claim_unscaled: F) -> UniPoly<F> { | ||
| let half = self.value_poly.len() / 2; | ||
| let evals: [F; TWO_PHASE_DEGREE_BOUND] = (0..half) | ||
| .into_par_iter() | ||
| .map(|j| { | ||
| let value_evals = self | ||
| .value_poly | ||
| .sumcheck_evals_array::<TWO_PHASE_DEGREE_BOUND>(j, BindingOrder::LowToHigh); | ||
| let eq_evals = self | ||
| .eq_poly | ||
| .sumcheck_evals_array::<TWO_PHASE_DEGREE_BOUND>(j, BindingOrder::LowToHigh); | ||
| let mut out = [F::zero(); TWO_PHASE_DEGREE_BOUND]; | ||
| for i in 0..TWO_PHASE_DEGREE_BOUND { | ||
| out[i] = value_evals[i] * eq_evals[i]; | ||
| } | ||
| out | ||
| }) | ||
| .reduce( | ||
| || [F::zero(); TWO_PHASE_DEGREE_BOUND], | ||
| |mut acc, arr| { | ||
| acc.iter_mut().zip(arr.iter()).for_each(|(a, b)| *a += *b); | ||
| acc | ||
| }, | ||
| ); | ||
| UniPoly::from_evals_and_hint(previous_claim_unscaled, &evals) | ||
| } | ||
|
|
||
| fn cycle_intermediate_claim(&self) -> F { | ||
| let len = self.value_poly.len(); | ||
| debug_assert_eq!(len, self.eq_poly.len()); | ||
| let mut sum = F::zero(); | ||
| for i in 0..len { | ||
| sum += self.value_poly.get_bound_coeff(i) * self.eq_poly.get_bound_coeff(i); | ||
| } | ||
| sum * self.scale | ||
| } | ||
|
|
||
| fn final_claim_if_ready(&self) -> Option<F> { | ||
| if self.value_poly.len() == 1 { | ||
| Some(self.value_poly.get_bound_coeff(0)) | ||
| } else { | ||
| None | ||
| } | ||
| } | ||
| } | ||
|
|
||
| impl<F: JoltField, T: Transcript> SumcheckInstanceProver<F, T> for BytecodeClaimReductionProver<F> { | ||
| fn get_params(&self) -> &dyn SumcheckInstanceParams<F> { | ||
| &self.params | ||
| } | ||
|
|
||
| fn round_offset(&self, max_num_rounds: usize) -> usize { | ||
| self.params.round_offset(max_num_rounds) | ||
| } | ||
|
|
||
| fn compute_message(&mut self, round: usize, previous_claim: F) -> UniPoly<F> { | ||
| let is_active_round = if self.params.is_cycle_phase() { | ||
| self.params.is_cycle_phase_round(round) | ||
| } else { | ||
| self.params.is_address_phase_round(round) | ||
| }; | ||
| if !is_active_round { | ||
| return UniPoly::from_coeff(vec![previous_claim * F::from_u64(2).inverse().unwrap()]); | ||
| } | ||
|
|
||
| let trailing_cap = if self.params.is_cycle_phase() { | ||
| self.params.cycle_alignment_rounds() | ||
| } else { | ||
| self.params.address_alignment_rounds() | ||
| }; | ||
| let num_trailing_variables = | ||
| trailing_cap.saturating_sub(self.params.num_rounds_for_current_phase()); | ||
| let scaling_factor = self.scale * F::one().mul_pow_2(num_trailing_variables); | ||
| let prev_unscaled = previous_claim * scaling_factor.inverse().unwrap(); | ||
| let poly_unscaled = self.compute_message_unscaled(prev_unscaled); | ||
| poly_unscaled * scaling_factor | ||
| } |
There was a problem hiding this comment.
compute_message_unscaled (341–366), cycle_intermediate_claim (368–376), final_claim_if_ready (378–384), compute_message (396–417), and ingest_challenge below all duplicate the same-named methods on PrecommittedProver in precommitted.rs:572–657. The only real differences are the bind_aux_polys(r_j) call in ingest_challenge and using num_rounds_for_current_phase() instead of num_rounds() in compute_message.
program_image.rs and advice.rs both compose PrecommittedProver and avoid this — bytecode should too. Adding an auxiliary-bind hook (e.g. fn bind_aux(&mut self, r_j) defaulting to no-op) on PrecommittedParams would let BytecodeClaimReductionProver become core: PrecommittedProver<F, _> + chunk_value_polys, eliminating ~75 lines that otherwise have to be kept in sync by hand.
Generated by Claude Code
| pub fn is_address_phase_active_round(&self, round: usize) -> bool { | ||
| self.address_phase_rounds.contains(&round) | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn is_address_phase_round(&self, round: usize) -> bool { | ||
| self.address_phase_rounds.contains(&round) | ||
| } |
There was a problem hiding this comment.
is_address_phase_active_round and is_address_phase_round have identical bodies. Pick one and delete the other — having both is a footgun for future readers who'll wonder what subtle distinction the naming implies.
| pub fn is_address_phase_active_round(&self, round: usize) -> bool { | |
| self.address_phase_rounds.contains(&round) | |
| } | |
| #[inline] | |
| pub fn is_address_phase_round(&self, round: usize) -> bool { | |
| self.address_phase_rounds.contains(&round) | |
| } | |
| #[inline] | |
| pub fn is_address_phase_round(&self, round: usize) -> bool { | |
| self.address_phase_rounds.contains(&round) | |
| } |
Generated by Claude Code
| pub fn round_offset(&self, is_cycle_phase: bool, max_num_rounds: usize) -> usize { | ||
| let _ = (is_cycle_phase, max_num_rounds); | ||
| 0 | ||
| } |
There was a problem hiding this comment.
round_offset ignores both arguments and unconditionally returns 0. It then flows through BytecodeClaimReductionParams::round_offset → BytecodeClaimReductionProver::round_offset (in the SumcheckInstanceProver impl), all returning 0. Either drop the method and have callers return 0 directly, or finish the abstraction. The current shape reads like an incomplete refactor.
Generated by Claude Code
| let memory_layout = MemoryLayout::deserialize_with_mode(&mut reader, compress, validate)?; | ||
| let max_padded_trace_length = | ||
| usize::deserialize_with_mode(&mut reader, compress, validate)?; | ||
| let bytecode_chunk_count = usize::deserialize_with_mode(&mut reader, compress, validate)?; |
There was a problem hiding this comment.
bytecode_chunk_count is read straight off the wire here with no validation, even though new_committed (line 2252) enforces nonzero / power-of-two / ≤256 / divides bytecode_len. The digest binds the value, so a tampered preprocessing fails Fiat-Shamir cleanly — but in the meantime a malformed serialized file panics deep inside chunk arithmetic instead of erroring at the read site. Calling validate_committed_bytecode_chunk_count(bytecode_chunk_count) (and the divisibility check) here, or in Valid::check, is cheap defense-in-depth.
Generated by Claude Code
| ram_K: usize, | ||
| scheduling_reference: PrecommittedSchedulingReference, | ||
| accumulator: &dyn OpeningAccumulator<F>, | ||
| _transcript: &mut impl Transcript, |
There was a problem hiding this comment.
_transcript: &mut impl Transcript is unused — callers still have to thread a transcript in only to have it discarded. Either use it (e.g. to absorb the start address or padded length) or drop it from the signature; an underscore-prefixed &mut parameter in a public API is misleading.
Generated by Claude Code
There was a problem hiding this comment.
Now that we have split this sumcheck into address-phase and cycle-phase instances, we should get rid of the old monolithic code entirely
There was a problem hiding this comment.
Same or the Booleanity sumcheck -- we should fully cutover to the split instances
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Update the stack spec to match the final branch slices, moved claim-reduction implementation scope, and expanded local CI coverage. Co-authored-by: Cursor <cursoragent@cursor.com>
Restore the feature spec text to its original contents; only the stack plan should describe the updated PR split and local CI scope. Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…tack Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Use the cycle/address ordering expected by the active Dory layout when the main trace domain anchors Stage 8, so precommitted advice openings verify in AddressMajor mode. Co-authored-by: Cursor <cursoragent@cursor.com>
Keep the PR 02 Stage 8 verifier fix byte-for-byte aligned with the final merged branch while preserving the AddressMajor advice proof fix. Co-authored-by: Cursor <cursoragent@cursor.com>
Use local type inference for the dense AddressMajor setup tuple to avoid carrying an intermediate-only alias. Made-with: Cursor Co-authored-by: Cursor <cursoragent@cursor.com>
Move the common Stage 8 Dory opening-point selection into the commitment scheme module so prover and verifier use the same layout logic. Made-with: Cursor Co-authored-by: Cursor <cursoragent@cursor.com>
Cache real cycle-phase opening points for precommitted advice reductions so verifier phase transitions reconstruct the address phase from accumulator state. Made-with: Cursor Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Amirhossein Khajehpour <khajepour.amirhossein@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Preserve the intended cycle-phase opening data for bytecode and program-image claim reductions across verifier transitions. Made-with: Cursor Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Amirhossein Khajehpour <khajepour.amirhossein@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Amirhossein Khajehpour <khajepour.amirhossein@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Update bytecode and program-image verifier integration to use the shared precommitted phase transition helper after cycle-phase openings are cached. Made-with: Cursor Co-authored-by: Cursor <cursoragent@cursor.com>
Thread committed bytecode metadata consistently through prover and verifier paths so bytecode reduction state stays synchronized. Made-with: Cursor Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Amirhossein Khajehpour <khajepour.amirhossein@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>
91166a8 to
232fd15
Compare
Summary
This PR adds committed-program support for bytecode and program-image openings throughout the zkVM flow. This allows prover to efficiently prove bytecode / program-image polynomials openings instead of verifier directly evaluating them which significantly reduces the recursion cycles.
The main changes are: