fix(soundness): bind ShardRam y-sign to is_global_write by spherel · Pull Request #1344 · scroll-tech/ceno

spherel · 2026-05-25T01:23:33Z

Problem

Issue #1338 reproduces a soundness break on master. For the same RISC-V
execution, the base verifier and the recursion verifier both accept two
distinct proof batches whose public per-shard shard_rw_sum values differ
on all 17 shards. The attacker takes an honest witness, replaces every
cross-shard EC accumulator leaf (x, y) with its inverse (x, -y),
updates shard_rw_sum, and reproves.

Root cause: ceno_zkvm/src/tables/shard_ram.rs:276-281 was a TODO. The
host code in ShardRamRecord::to_ec_point encodes read vs write in the
sign of y[6], but the circuit only constrained the curve equation and
the EC sum — never tying y[6]'s half-of-field to is_global_write.
Both (x, y) and (x, -y) satisfied every existing check, so the public
summary of cross-shard RAM flow was unbound.

The defect survives recursion (the reporter's PoC verifies through the
recursion verifier program).

Design Rationale

The approach borrows the idea from SP1's
crates/core/machine/src/operations/global_interaction.rs:210-236,
but not its column layout. Three pieces:

Offset by +1. Express y[6] in terms of a fresh witness y6_lo
so y[6] = 0 is never valid in either branch (0 is invariant under negation, which would make read and write indistinguishable).
Safe band + prover retry. Restrict y6_lo to [0, (p-1)/2). For
the rare exception y[6] = 0 (probability ~1/p ≈ 2^-31 per record)
the host rejects and retries with a new nonce.
Byte-decomposition range check. y6_lo decomposed into four byte
limbs b0..b3 (assert_byte for b0..b2, lookup_ltu_byte(b3, 60, 1)
for b3). For BabyBear, (p-1)/2 = 60·2^24 exactly, so b3 < 60
gives the tightest no-overlap band.

In-circuit branch equality via condition_require_equal:

read (is_global_write = 0): y[6] = y6_lo + 1 ⇒ y[6] ∈ [1, (p-1)/2]
write (is_global_write = 1): y[6] = p - 1 - y6_lo ⇒ y[6] ∈ [(p+1)/2, p-1]

Union covers [1, p-1] with no overlap; y[6] = 0 is excluded.
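
The two branch equations can be checked with plain integer arithmetic. The following is a minimal host-side sketch, not the chip code: the signature of `y6_lo_value` and the helper `passes_byte_band` are assumptions made for illustration, and only the constants and the branch equations come from the description above.

```rust
/// BabyBear prime, as stated in this PR (p = 0x78000001).
const P: u32 = 0x7800_0001;
/// Exclusive upper bound on y6_lo: (p - 1) / 2 = 60 * 2^24.
const HALF: u32 = (P - 1) / 2;
/// Exclusive bound on the top byte b3.
const TOP_BYTE_BOUND: u8 = 60;

/// Hypothetical helper: recover y6_lo from y[6] and the write flag.
/// Returns None when no y6_lo in [0, HALF) satisfies the branch equation.
fn y6_lo_value(y6: u32, is_global_write: bool) -> Option<u32> {
    assert!(y6 < P, "y6 must be a canonical field element");
    let lo = if is_global_write {
        // write: y6 = p - 1 - y6_lo
        P - 1 - y6
    } else {
        // read: y6 = y6_lo + 1, which has no solution for y6 = 0
        y6.checked_sub(1)?
    };
    (lo < HALF).then_some(lo)
}

/// Range check mirrored from the circuit: four byte limbs with b3 < 60.
fn passes_byte_band(y6_lo: u32) -> bool {
    let [_b0, _b1, _b2, b3] = y6_lo.to_le_bytes();
    b3 < TOP_BYTE_BOUND
}
```

In this model a read-valid y[6] = 5 becomes p - 5 after negation, which has no valid y6_lo in the read branch, and y[6] = 0 has none in either branch.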

Why not a single AssertLtConfig(y6_lo, (p-1)/2, max_bits=30)?
On BabyBear (p = 0x78000001, 31-bit) the AssertLt gadget only
constrains lhs - rhs ≡ diff - 2^max_bits (mod p) with diff ∈ [0, 2^30)
— it does not pre-bound lhs to be canonical-small. A malicious
y6_lo ∈ [0x74000001, p-1] (≈ 2^26 values) produces a field-wrap diff
that still fits in 30 bits, so the constraint accepts upper-half values
and the exploit survives. Byte-decomposing first kills the wrap. Ceno's
DynamicRangeTableCircuit<E, 18> also does not carry 30-bit lookup
entries, so a direct assert_const_range(_, 30) is not available
anyway.
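
The wrap can be reproduced with integer arithmetic. This sketch is a simplified model of the relation quoted above (lhs - rhs ≡ diff - 2^max_bits mod p), not Ceno's actual AssertLt gadget; `required_diff`, `assert_lt_accepts` and `top_byte` are hypothetical names.

```rust
const P: u64 = 0x7800_0001; // BabyBear prime
const RHS: u64 = (P - 1) / 2; // bound that y6_lo must stay below
const MAX_BITS: u32 = 30;

/// The `diff` witness that satisfies lhs - rhs = diff - 2^MAX_BITS (mod p).
fn required_diff(lhs: u64) -> u64 {
    (lhs + P - RHS + (1u64 << MAX_BITS)) % P
}

/// In this model the gadget accepts whenever `diff` fits in MAX_BITS bits.
fn assert_lt_accepts(lhs: u64) -> bool {
    required_diff(lhs) < (1u64 << MAX_BITS)
}

/// Top byte of the 32-bit value; the byte decomposition requires it to be < 60.
fn top_byte(v: u64) -> u8 {
    (v >> 24) as u8
}
```

Under these assumptions every lhs in [0x74000001, p-1] is accepted by the modelled gadget even though it exceeds (p-1)/2, while its top byte is at least 0x74 and so fails the b3 < 60 lookup.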

Why M = 60 (vs SP1's 63). Here M is the exclusive bound on the top byte b3. SP1 targets KoalaBear; its (p-1)/2 = 0x3f800000, so 63 leaves a small safety band. For BabyBear,
(p-1)/2 = 60·2^24 exactly — 63 would let y[6] straddle p/2 and
reintroduce the ambiguity.
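
The arithmetic behind the two bounds, as a small sketch. The helper names are hypothetical; the BabyBear modulus and both (p-1)/2 values are the ones quoted in this description, and the KoalaBear modulus is the one consistent with (p-1)/2 = 0x3f800000.

```rust
const BABYBEAR_P: u32 = 0x7800_0001;
const KOALABEAR_P: u32 = 0x7f00_0001;

/// Split (p - 1) / 2 into (top byte, low 24 bits).
fn half_split(p: u32) -> (u32, u32) {
    let half = (p - 1) / 2;
    (half >> 24, half & 0x00ff_ffff)
}

/// Largest y6_lo admitted by `b3 < bound` when the three low bytes are free.
fn max_admitted(bound: u32) -> u32 {
    (bound << 24) - 1
}
```

For BabyBear the split is (60, 0), so b3 < 60 admits exactly [0, (p-1)/2). A bound of 63 would admit values up to 0x3effffff, which is above BabyBear's (p-1)/2 but still below KoalaBear's.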

Also corrects the stale comment that previously had the convention
reversed (claimed write ⇒ lower half, opposite of what the host code
does).

Change Highlights

`ceno_zkvm/src/tables/shard_ram.rs` — chip-level y-sign binding

ShardRamRecord::to_ec_point: reject y6 == 0 and try the next
nonce. Classify with strict y6 > prime / 2 so the boundary
(p-1)/2 correctly stays in the read region (a previous draft used
>= which misclassified that single boundary value and would have
produced an out-of-range y6_lo for both branches).
ShardRamConfig: new field y6_lo_bytes: [WitIn; 4].
ShardRamConfig::configure: replace the TODO with the byte
decomposition, byte-range / LTU lookups, and the
condition_require_equal branch equality.
ShardRamCircuit::assign_instance: compute y6_lo from y[6] and
is_to_write_set via a small y6_lo_value helper, assign byte
limbs, register byte and LTU multiplicities.
New test test_shard_ram_y_sign_circuit_rejects_negation drives
assign_instances_with_lk_multiplicities + MockProver over one
honest row and one sign-flipped row, asserting lookup_Ltu rejects
the tampered witness. A concrete challenge is supplied so the
no-challenge run path doesn't drop structural_witin.
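
A rough sketch of the host-side classification and retry described for to_ec_point. It models only the limb y[6], whereas the real code works on the full septic-extension y-coordinate. `orient`, `pick_nonce` and the closure `y6_for_nonce` are hypothetical stand-ins, and treating `None` as "this nonce yields no usable point" is an assumption about the host code.

```rust
const P: u32 = 0x7800_0001; // BabyBear prime

/// Strict comparison from this PR: (p - 1) / 2 stays in the lower (read) half.
fn is_y_in_2nd_half(y6: u32) -> bool {
    y6 > P / 2
}

/// Orient a non-zero y6: writes use the upper half, reads the lower half.
fn orient(y6: u32, is_global_write: bool) -> u32 {
    if is_y_in_2nd_half(y6) == is_global_write {
        y6
    } else {
        P - y6 // negation mod p, valid because y6 != 0
    }
}

/// Try nonces in order and skip any that give no point or y6 == 0.
fn pick_nonce(y6_for_nonce: impl Fn(u32) -> Option<u32>) -> Option<(u32, u32)> {
    (0..=u32::MAX).find_map(|nonce| match y6_for_nonce(nonce) {
        Some(y6) if y6 != 0 => Some((nonce, y6)),
        _ => None,
    })
}
```

With `>=` in place of `>`, the boundary value (p-1)/2 would be treated as already in the upper half, which is the single-value misclassification mentioned above.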

Lookup-multiplicity plumbing for ShardRam

ShardRam's per-row y6_lo byte / LTU lookups must reach
combined_lk_mlt so the U8 / LTU table mlt columns balance.
ShardRam runs after opcode + dummy circuits, before
finalize_lk_multiplicities. To surface mlt without burdening every
other table circuit:

ceno_zkvm/src/tables/mod.rs: TableCircuit trait gains a second
default-unimplemented method
assign_instances_with_lk_multiplicities alongside the existing
assign_instances. ShardRam overrides the former; every other
table keeps overriding the latter.
ceno_zkvm/src/structs.rs: ZKVMWitnesses::assign_shared_circuit
threads a LkMultiplicity::default() through ShardRam's
parallel-chunk witgen and inserts
lk_multiplicity.into_finalize_result() into
lk_mlts["ShardRamCircuit"] before finalize. Asserts swap from
combined_lk_mlt.is_some() to is_none() to lock the ordering.
assign_table_circuit tolerates combined_lk_mlt = None by
passing an empty multiplicity slice, so LocalFinalCircuit (which
ignores the argument anyway) can also run before finalize.
ceno_zkvm/src/e2e.rs: move
MmuConfig::assign_continuation_circuit (LocalFinal + ShardRam) to
just before finalize_lk_multiplicities. Mirror the move inside
the GPU debug-compare block so combined_lk_mlt diff stays
meaningful.
ceno_zkvm/src/instructions/riscv/rv32im/mmu.rs: docstring updated
to describe the new ordering invariant.
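
The ordering invariant can be pictured with a toy model. This is a sketch only: `Witnesses`, `Mlt` and `insert_chip_mlt` are hypothetical, a single `HashMap` stands in for the per-table multiplicity maps, and the real ZKVMWitnesses API may differ.

```rust
use std::collections::{BTreeMap, HashMap};

/// Simplified stand-in for one chip's lookup multiplicities.
type Mlt = HashMap<u64, usize>;

#[derive(Default)]
struct Witnesses {
    lk_mlts: BTreeMap<String, Mlt>,
    combined_lk_mlt: Option<Mlt>,
}

impl Witnesses {
    /// Chips that issue lookups must register before finalize.
    fn insert_chip_mlt(&mut self, chip: &str, mlt: Mlt) {
        assert!(self.combined_lk_mlt.is_none(), "chip assigned after finalize");
        self.lk_mlts.insert(chip.to_string(), mlt);
    }

    /// Fold every registered chip into the combined multiplicity.
    fn finalize_lk_multiplicities(&mut self) {
        let mut combined = Mlt::new();
        for mlt in self.lk_mlts.values() {
            for (&key, &count) in mlt {
                *combined.entry(key).or_insert(0) += count;
            }
        }
        self.combined_lk_mlt = Some(combined);
    }
}
```

Registering ShardRam after finalize would trip the assertion, which is the ordering the swapped `is_none()` assert is meant to lock in.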

Device-resident GPU shortcut for ShardRam (mlt mirror)

ZKVMWitnesses::try_assign_shared_circuit_gpu dispatches into
instructions::gpu::chips::shard_ram::try_gpu_assign_shared_circuit
to keep the continuation EC computation device-resident
(gpu_batch_continuation_ec_on_device + merge_and_partition_records)
when is_gpu_witgen_enabled(). The GPU kernels never enter the CPU
assign_instance per-row push, so the y6_lo lookup multiplicity is
derived host-side:

After step 6 of try_gpu_assign_shared_circuit (merge+partition),
D2H partitioned_buf once to Vec<u32> and walk it with stride
record_u32s = 26 (GpuShardRamRecord #[repr(C)] layout).
Per record extract is_to_write_set (u32 offset 10) and
point_y[6] (u32 offset 25), compute y6_lo, push the same
4 lookup queries the CPU path emits per row, then
into_finalize_result() and return alongside the chunked
Vec<ChipInput<E>>. debug_assert_eq!(record_u32s, 26) guards
against ceno_gpu layout drift.
try_assign_shared_circuit_gpu inserts both ChipInput and the
derived multiplicity into self.witnesses /
self.lk_mlts["ShardRamCircuit"] so finalize folds the GPU-path
contribution into combined_lk_mlt the same way the CPU shortcut
does.
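
A host-side walk over the flattened records might look like the sketch below. The stride and the two offsets are the ones quoted above, and Y6_LO_TOP_BYTE_LT_BOUND is the constant named later in review. The `Query` enum, the function name and the use of a plain `HashMap` in place of LkMultiplicity are assumptions for illustration.

```rust
use std::collections::HashMap;

const P: u32 = 0x7800_0001; // BabyBear prime
const RECORD_U32S: usize = 26; // stride quoted in this PR
const IS_TO_WRITE_SET_OFFSET: usize = 10;
const POINT_Y6_OFFSET: usize = 25;
const Y6_LO_TOP_BYTE_LT_BOUND: u8 = 60;

/// Hypothetical stand-in for the two kinds of lookup query.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Query {
    Byte(u8),
    LtuByte { lhs: u8, rhs: u8 },
}

/// Count the 4 lookups per record: 3 byte checks and 1 LTU check.
fn derive_y6_lo_multiplicity(buf: &[u32]) -> HashMap<Query, usize> {
    assert_eq!(buf.len() % RECORD_U32S, 0, "buffer is not a whole number of records");
    let mut mlt = HashMap::new();
    for record in buf.chunks_exact(RECORD_U32S) {
        let is_write = record[IS_TO_WRITE_SET_OFFSET] != 0;
        let y6 = record[POINT_Y6_OFFSET];
        // The host rejects y6 == 0, so honest records never carry it.
        assert!((1..P).contains(&y6), "y6 must be a non-zero field element");
        let y6_lo = if is_write { P - 1 - y6 } else { y6 - 1 };
        let [b0, b1, b2, b3] = y6_lo.to_le_bytes();
        for b in [b0, b1, b2] {
            *mlt.entry(Query::Byte(b)).or_insert(0) += 1;
        }
        let ltu = Query::LtuByte { lhs: b3, rhs: Y6_LO_TOP_BYTE_LT_BOUND };
        *mlt.entry(ltu).or_insert(0) += 1;
    }
    mlt
}
```

The total count is 4 per record, matching the per-row queries described for the CPU path.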

Verifier: account for `has_ecc_ops` row doubling

ShardRamCircuit::has_ecc_ops() adds an extra hypercube variable;
the chip matrix has 2 * next_pow2(num_instance) rows where the
back half is EC-tree internal nodes with selector_zero = 0. Before
this fix the chip had num_lks = 0, so the verifier's
dummy_table_item_multiplicity correction never had to consider it.
With the new byte/LTU queries the correction under-counted dummy
lookups by a factor of 2 and shard verification failed with
logup_sum != 0.

ceno_zkvm/src/scheme/verifier.rs: multiply next_pow2_instance
by 2 when circuit_vk.get_cs().has_ecc_ops().
ceno_recursion/src/zkvm_verifier/verifier.rs: mirror the same
adjustment in the recursive verifier (lockstep per CLAUDE.md).
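
The row-count adjustment, as a sketch. `padded_rows` follows the description above; `dummy_lookups` encodes an assumed model in which every row that is not a real instance contributes `num_lks` dummy lookups, which may not match the verifier's exact formula.

```rust
/// Rows in the chip matrix: next_pow2(num_instance), doubled for has_ecc_ops chips.
fn padded_rows(num_instance: usize, has_ecc_ops: bool) -> usize {
    let next_pow2_instance = num_instance.next_power_of_two();
    if has_ecc_ops {
        2 * next_pow2_instance
    } else {
        next_pow2_instance
    }
}

/// Assumed model: each non-instance row issues `num_lks` dummy lookups.
fn dummy_lookups(num_instance: usize, has_ecc_ops: bool, num_lks: usize) -> usize {
    (padded_rows(num_instance, has_ecc_ops) - num_instance) * num_lks
}
```

With num_lks = 0 both variants give zero, which is consistent with the missing factor going unnoticed before this PR added lookups to ShardRam.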

Benchmark / Performance Impact

Per ShardRam row this PR adds 4 byte WitIn columns plus 3 byte-range
and 1 LTU lookup multiplicities. ShardRam rows scale with cross-shard
RAM events, not with cycles, so the absolute cost is sub-percent on the
prover. No full prover bench was rerun (no hot-loop arithmetic changed).

Existing test_shard_ram_circuit (170k reads + 1420 writes, full chip
proof) runtime is unchanged within noise:

master   : ~5.0 s
this PR  : ~5.0 s

Testing

cargo fmt --all --check
cargo check --workspace --all-targets
cargo check --workspace --all-targets --release
cargo make clippy
cargo clippy --workspace --all-targets --release -- -D warnings
RUST_MIN_STACK=33554432 cargo test --workspace --lib --release
cargo run --release --package ceno_zkvm --features sanity-check --bin e2e -- \
  --platform=ceno --max-cycle-per-shard=20000 --hints=10 --public-io=4191 \
  examples/target/riscv32im-ceno-zkvm-elf/release/examples/fibonacci

All pass locally on BabyBear. test_shard_ram_circuit and
test_shard_ram_y_sign_circuit_rejects_negation are green. End-to-end
multi-shard fibonacci verifies ShardRamCircuit and LocalRAMTableFinal
on every shard with exit code 0. Success.

cargo make tests / cargo make tests_goldilock should be re-run by
CI; the change is gated to BabyBear via a debug_assert_eq! on
MODULUS_U64 and goldilocks does not exercise shard_ram (per
integration.yml commented-out lines and CLAUDE.md).

Risks and Rollout

Soundness. Closes #1338 ("ShardRam read/write direction is not bound to the public shard_rw_sum"). The new constraint only adds local byte
arithmetic and existing lookups — no change to transcript, sumcheck,
PCS, or EC accumulation. Recursive and native verifiers move in
lockstep (the has_ecc_ops row-factor fix lands in both).
GPU. The device-resident GPU shortcut now derives the y6_lo
lookup multiplicity host-side from the merged partitioned device
buffer (single D2H of ~26 u32 × records). Layout assumption is
guarded by debug_assert_eq!(record_u32s, 26) against
ceno_gpu::GpuShardRamRecord. CPU + GPU paths converge on the same
combined_lk_mlt contribution; runtime verification with
CENO_GPU_ENABLE_WITGEN=1 --features gpu on a CUDA host is
recommended before tag.
Recursion. The recursive verifier mirrors the native verifier's
has_ecc_ops × 2 row adjustment; no separate constraint-system
change is needed for the y-sign binding itself.
Field support. Hardcodes the BabyBear constant M = 60. A
debug_assert_eq!(MODULUS_U64, 0x78000001, ...) guards against
accidental use on a different field; shard_ram is BabyBear-only
today per CLAUDE.md.

Follow-ups

The remaining TODOs tracked in #1340 ("finish the TODOs in shard_ram.rs"), namely the local read ⇄ global write pairing at
shard_ram.rs:235-236 and the shard == shard_id binding on line 244, are
intentionally out of scope here.

Fixes #1338.
Partially addresses #1340.

Copilot Reviewer Directive (keep this section)

When Copilot reviews this PR, apply .github/copilot-instructions.md strictly.

Copilot

Pull request overview

Findings (sorted by severity)

Blocker | ceno_zkvm/src/tables/shard_ram.rs (new assert_byte / lookup_ltu_byte constraints): LK multiplicity aggregation/order looks inconsistent with new lookups.
This PR introduces new LK interactions in ShardRamConfig::configure (assert_byte + lookup_ltu_byte). However, the global LK multiplicities used to assign the DynamicRange/LTU table circuits are finalized and those table circuits are assigned before ShardRamCircuit is assigned in the shard pipeline (see ceno_zkvm/src/e2e.rs:1500-1589). As written, ShardRam’s new lookup usage does not appear to contribute to combined_lk_mlt prior to table-circuit assignment, which is expected to break the logup multiset check (or otherwise leave these lookups unaccounted).
Suggested fix: update the witness/LK aggregation flow so ShardRam’s byte/LTU lookups contribute to the global multiplicity before Rv32imConfig::assign_table_circuit runs (e.g., collect a per-chip multiplicity for ShardRam and include it in ZKVMWitnesses.lk_mlts prior to finalize_lk_multiplicities, or reorder assignment so ShardRam is assigned before lookup-table circuits).
Major | ShardRamRecord::to_ec_point: half-of-field boundary is off by one vs the new convention.
is_y_in_2nd_half currently uses y6 >= prime/2. For odd primes, prime/2 == (p-1)/2, so the boundary value y6 == (p-1)/2 is classified as “second half”, causing the new convention (“read => [1,(p-1)/2]”) to be violated and potentially making otherwise-valid witnesses fail the new in-circuit banding.
Suggested fix: compare against (prime + 1)/2 (or use a strict y6 > prime/2) to match the stated ranges.
Major | BabyBear-only guard uses debug_assert_eq! in circuit configuration.
The constraint relies on BabyBear’s (p-1)/2 = 60·2^24, but debug_assert_eq! is compiled out in release. If instantiated over a different field, the circuit would silently become incorrect.
Suggested fix: enforce at runtime (e.g., assert_eq! or return Err(CircuitBuilderError::CircuitError(..))).
Minor | Comment accuracy in to_ec_point.
The “2-torsion case where (x,y)==(x,-y)” phrasing is misleading: y6 == 0 doesn’t imply the full y-coordinate is zero; it only means that limb is fixed under negation, which is what makes the chosen encoding ambiguous/unsatisfiable.
Suggested fix: reword the comment to reflect the actual reason for rejection.
Minor (testing) | New test does not assert constraint/prover rejection.
test_shard_ram_y_sign_circuit_rejects_negation currently checks derived limb properties (b3 < 60 vs >= 60) but doesn’t actually run a constraint satisfiability check / mock prover / proof attempt that must fail for the tampered row. This can pass even if the lookup constraint is missing or if LK-table population is broken.
Suggested fix: make it a true regression by asserting the tampered witness fails constraint satisfaction (e.g., via MockProver with the necessary public inputs / table chips, or by attempting proof generation and asserting it errors).

Open questions / assumptions

Is ShardRam always assigned after lookup-table circuits in all proving entrypoints (CPU + GPU + recursion pipelines)? If yes, the LK multiplicity/order blocker needs a design-level fix (not just a local change).
Is it acceptable to hard-fail (non-debug) when BaseField != BabyBear, or is there a preferred feature-gate pattern for BabyBear-only chips in this repo?

Changes:

Host-side to_ec_point now rejects y6 == 0 and documents the read/write y-half convention.
Circuit-side: adds byte-decomposition + lookup constraints and a conditional equality binding y6 to is_global_write.
Adds a targeted unit test around the y-sign binding logic for honest vs sign-flipped points.

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

## Problem

`ShardRamCircuit` differentiates a global *read* from a global *write* by writing one of (x, y) or (x, -y) into the witness. Before this fix nothing constrained which y was chosen, so an attacker could flip is_global_write and migrate a record between the read/write sets without changing anything else in the witness. The y-sign was the entire signal: a soundness break.

## Design Rationale

Bind the sign of `y6 = y[SEPTIC_EXTENSION_DEGREE - 1]` to is_global_write via a half-of-field convention:

- read (is_global_write = 0): y6 in [1, (p-1)/2]
- write (is_global_write = 1): y6 in [(p+1)/2, p-1]

For BabyBear `(p-1)/2 = 60 * 2^24` exactly, so a witnessed `y6_lo in [0, (p-1)/2)` decomposes into four bytes with top byte `b3 < 60`. Three U8 `assert_byte` queries plus one `lookup_ltu_byte(b3, 60, 1)` bound y6_lo, then a single `condition_require_equal` ties y6 to either `y6_lo + 1` (read) or `(p-1) - y6_lo` (write) under the is_global_write selector.

y6 = 0 is the unique fixed point not covered by either branch; `to_ec_point` skips it so the prover doesn't generate an unprovable record.

Mirror the partition on the prover side: `to_ec_point` uses `y6 > prime / 2` (strict; `(p-1)/2` belongs to the read region) to decide whether to negate the natural sqrt, and bumps the nonce when y6 = 0.

## Change Highlights

### `ceno_zkvm/src/tables/shard_ram.rs`: chip-level y-sign binding

- `ShardRamConfig`: add `y6_lo_bytes: [WitIn; 4]`; in `configure` emit 3 x `assert_byte` + 1 x `lookup_ltu_byte(_, 60, 1)` and one `condition_require_equal` tying y6 to is_global_write under the is_global_write selector.
- `to_ec_point`: skip the `y6 = 0` case; classify `y6 > prime / 2` (strict, so the boundary `(p-1)/2` stays read) to decide whether to negate the natural sqrt.
- `assign_instance`: write the four `y6_lo` byte limbs via the new `y6_lo_value` helper. mlt is surfaced via the new `assign_instances_with_lk_multiplicities` entry below; no per-row push left dangling.

### Lookup-multiplicity plumbing for ShardRam

ShardRam's per-row y6_lo byte / LTU lookups must reach `combined_lk_mlt` so the U8 / LTU table `mlt` columns balance. ShardRam runs after opcode + dummy circuits, before `finalize_lk_multiplicities`. To surface mlt without burdening every other table circuit:

- `ceno_zkvm/src/tables/mod.rs`: `TableCircuit` trait gains a second default-unimplemented method `assign_instances_with_lk_multiplicities` alongside the existing `assign_instances`. ShardRam overrides the former; every other table keeps overriding the latter.
- `ceno_zkvm/src/structs.rs`: `ZKVMWitnesses::assign_shared_circuit` threads a `LkMultiplicity::default()` through ShardRam's parallel-chunk witgen and inserts `lk_multiplicity.into_finalize_result()` into `lk_mlts["ShardRamCircuit"]` before finalize. Asserts swap from `combined_lk_mlt.is_some()` to `is_none()` to lock the ordering. `assign_table_circuit` tolerates `combined_lk_mlt = None` by passing an empty multiplicity slice, so LocalFinalCircuit (which ignores the argument anyway) can also run before finalize.
- `ceno_zkvm/src/e2e.rs`: move `MmuConfig::assign_continuation_circuit` (LocalFinal + ShardRam) to just before `finalize_lk_multiplicities`. Mirror the move inside the GPU debug-compare block so `combined_lk_mlt` diff stays meaningful.
- `ceno_zkvm/src/instructions/riscv/rv32im/mmu.rs`: docstring updated to describe the new ordering invariant.

### Device-resident GPU shortcut for ShardRam (mlt mirror)

`ZKVMWitnesses::try_assign_shared_circuit_gpu` dispatches into `instructions::gpu::chips::shard_ram::try_gpu_assign_shared_circuit` to keep the continuation EC computation device-resident (`gpu_batch_continuation_ec_on_device` + `merge_and_partition_records`) when `is_gpu_witgen_enabled()`. The GPU kernels never enter the CPU `assign_instance` per-row push, so the y6_lo lookup multiplicity is derived host-side:

- After step 6 of `try_gpu_assign_shared_circuit` (merge+partition), D2H `partitioned_buf` once to `Vec<u32>` and walk it with stride `record_u32s = 26` (`GpuShardRamRecord` `#[repr(C)]` layout). Per record extract `is_to_write_set` (u32 offset 10) and `point_y[6]` (u32 offset 25), compute `y6_lo`, push the same 4 lookup queries the CPU path emits per row, then `into_finalize_result()` and return alongside the chunked `Vec<ChipInput<E>>`. `debug_assert_eq!(record_u32s, 26)` guards against `ceno_gpu` layout drift.
- `try_assign_shared_circuit_gpu` inserts both `ChipInput` and the derived multiplicity into `self.witnesses` / `self.lk_mlts["ShardRamCircuit"]` so finalize folds the GPU-path contribution into `combined_lk_mlt` the same way the CPU shortcut does.

### Verifier: account for `has_ecc_ops` row doubling

`ShardRamCircuit::has_ecc_ops()` adds an extra hypercube variable; the chip matrix has `2 * next_pow2(num_instance)` rows where the back half is EC-tree internal nodes with `selector_zero = 0`. Before this fix the chip had `num_lks = 0`, so the verifier's `dummy_table_item_multiplicity` correction never had to consider it. With the new byte/LTU queries the correction under-counted dummy lookups by a factor of 2 and shard verification failed with `logup_sum != 0`.

- `ceno_zkvm/src/scheme/verifier.rs`: multiply `next_pow2_instance` by 2 when `circuit_vk.get_cs().has_ecc_ops()`.
- `ceno_recursion/src/zkvm_verifier/verifier.rs`: mirror the same adjustment in the recursive verifier (lockstep per CLAUDE.md).

### Tests

- `tables::shard_ram::tests::test_shard_ram_y_sign_circuit_rejects_negation` drives `assign_instances_with_lk_multiplicities` + `MockProver`. The honest row satisfies every constraint; the tampered row (same record, negated EC point) trips `lookup_Ltu` on the wrong-sign b3. A concrete challenge is supplied so the no-challenge `run` path doesn't drop `structural_witin`.
- `test_shard_ram_circuit` updated to call `assign_instances_with_lk_multiplicities`.

## Testing

```
cargo fmt --all --check
cargo make clippy   # -D warnings, dev profile
cargo clippy --workspace --all-targets --release
cargo test --workspace --lib --release
cargo run --release -p ceno_zkvm --features sanity-check --bin e2e -- \
  --platform=ceno --max-cycle-per-shard=20000 \
  --hints=10 --public-io=4191 \
  examples/target/riscv32im-ceno-zkvm-elf/release/examples/fibonacci
```

End-to-end fibonacci across 6 shards verifies `ShardRamCircuit` and `LocalRAMTableFinal` on every shard with `exit code 0. Success.` GPU shortcut (`--features gpu` + `CENO_GPU_ENABLE_WITGEN=1`) needs a CUDA host to verify at runtime; static structure mirrors the CPU shortcut and CPU path remains identical.

## Risks and Rollout

- Soundness boundary moved: the chip now constrains the EC y-sign that was previously unconstrained. Mirrored on native and recursive verifiers; protocol/transcript order is unchanged so the two stay in lockstep.
- The `has_ecc_ops` row-factor verifier fix only manifests once any `has_ecc_ops` chip has `num_lks > 0`. ShardRam is the only such chip today; lookup balance failures elsewhere would be unrelated.
- GPU mlt offsets are read from `shard_ram_record_to_gpu` (offsets 10 and 25 in 26 u32s). `debug_assert_eq!(record_u32s, 26)` trips if `ceno_gpu` reshuffles `GpuShardRamRecord` so silent drift is caught.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

+        _multiplicity: &[FxHashMap<u64, usize>],
+        _input: &Self::WitnessInput<'_>,
+    ) -> Result<RMMCollections<E::BaseField>, ZKVMError> {
+        unimplemented!("assign_instances is not implemented for this table circuit")
+    }


        let witness = TC::assign_instances(
            config,
            cs.zkvm_v1_css.num_witin as usize,
            cs.zkvm_v1_css.num_structural_witin as usize,
-            self.combined_lk_mlt.as_ref().unwrap(),
+            self.combined_lk_mlt.as_ref().unwrap_or(&empty_mlt),


Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

kunxian-xia · 2026-05-26T03:59:22Z

+        // `lookup_ltu_byte(a, b, 1)` asserts `a, b` are bytes and `a < b`.
+        cb.lookup_ltu_byte(
+            y6_lo_bytes[3].expr(),
+            E::BaseField::from_canonical_u64(60).expr(),


@copilot please make this 60 a constant so that we can refer to it here and in witness generation.

Done in d2c12a4. I introduced Y6_LO_TOP_BYTE_LT_BOUND and replaced the duplicated 60 in the circuit-side check plus both CPU/GPU witness-generation lookup multiplicity paths so they all reference the same constant.

Agent-Logs-Url: https://github.com/scroll-tech/ceno/sessions/401f145a-d351-4dd2-82b7-e882d683a012 Co-authored-by: kunxian-xia <1082586+kunxian-xia@users.noreply.github.com>

Copilot · 2026-05-26T04:22:19Z

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

https://api.github.com/repos/scroll-tech/ceno-patch/commits/main
- Triggering command: /home/REDACTED/.rustup/toolchains/nightly-2025-11-20-x86_64-REDACTED-linux-gnu/bin/cargo /home/REDACTED/.rustup/toolchains/nightly-2025-11-20-x86_64-REDACTED-linux-gnu/bin/cargo metadata --format-version 1 --all-features --manifest-path /home/REDACTED/work/ceno/ceno/ceno_cli/example/Cargo.toml --filter-platform x86_64-REDACTED-linux-gnu --lockfile-path /tmp/rust-analyzer3638e-0/Cargo.lock -Zunstable-options scv_�� in.so /lto-wrapper f/parse-size-1.1.0/src/lib.rs scv_stats-77399cgit scv_stats-77399cpush scv_stats-77399c-v 20-x86_64-unknoworigin (http block)
- Triggering command: /home/REDACTED/.rustup/toolchains/nightly-2025-11-20-x86_64-REDACTED-linux-gnu/bin/cargo /home/REDACTED/.rustup/toolchains/nightly-2025-11-20-x86_64-REDACTED-linux-gnu/bin/cargo check --quiet --workspace --message-format=json --manifest-path /home/REDACTED/work/ceno/ceno/ceno_cli/example/Cargo.toml --target-dir /tmp/codeql-scratch-2d4a8dd44548aa29/dbs/rust/working/target --lockfile-path /tmp/rust-analyzer3638e-2/Cargo.lock --all-features --keep-going --compile-time-deps --all-targets -Zunstable-options -cgu.1.rcgu.o.1c/opt/hostedtoolcache/CodeQL/2.25.4/x64/codeql/rust/tools/autobuild.sh -incompat 0b12adedc1-cgu.0.rcgu.o.1clq9tr./tmp/rustcpEmwhq/symbols.o 43b58db621-cgu.0.rcgu.o.1clq9tr./home/REDACTED/work/ceno/ceno/target/debug/build/ahash-821ee3fbe46--print (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

spherel requested review from Copilot and hero78119 May 25, 2026 01:23

Copilot started reviewing on behalf of spherel May 25, 2026 01:23 View session

Copilot AI reviewed May 25, 2026

Comment thread ceno_zkvm/src/tables/shard_ram.rs Outdated

Comment thread ceno_zkvm/src/tables/shard_ram.rs Outdated

Comment thread ceno_zkvm/src/tables/shard_ram.rs Outdated

Comment thread ceno_zkvm/src/tables/shard_ram.rs

Comment thread ceno_zkvm/src/tables/shard_ram.rs Outdated

kunxian-xia reviewed May 25, 2026

Comment thread ceno_zkvm/src/tables/shard_ram.rs

spherel marked this pull request as draft May 25, 2026 13:39

dreamATD force-pushed the fix/issue-1338-shard-ram-y-sign branch from 15c96f8 to c666a53 Compare May 25, 2026 22:33

spherel requested a review from Copilot May 25, 2026 22:40

Copilot started reviewing on behalf of spherel May 25, 2026 22:40 View session

Copilot AI reviewed May 25, 2026

Comment thread ceno_zkvm/src/tables/shard_ram.rs

Comment thread ceno_zkvm/src/tables/mod.rs

Comment thread ceno_zkvm/src/structs.rs

Comment thread ceno_zkvm/src/instructions/gpu/chips/shard_ram.rs Outdated

Comment thread ceno_zkvm/src/tables/shard_ram.rs

spherel force-pushed the fix/issue-1338-shard-ram-y-sign branch from c666a53 to fe8cd5e Compare May 25, 2026 22:54

spherel force-pushed the fix/issue-1338-shard-ram-y-sign branch from 6e52439 to a4879ab Compare May 25, 2026 23:21

spherel marked this pull request as ready for review May 25, 2026 23:44

spherel requested a review from Copilot May 25, 2026 23:44

Copilot started reviewing on behalf of spherel May 25, 2026 23:45 View session

Copilot AI reviewed May 25, 2026

docs(shard_ram): clarify y6 == 0 rejection comment

8b3e770

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

kunxian-xia reviewed May 26, 2026

Copilot started work on behalf of kunxian-xia May 26, 2026 03:59 View session

refactor(shard_ram): hoist y6 top-byte LTU bound into shared constant

d2c12a4

Agent-Logs-Url: https://github.com/scroll-tech/ceno/sessions/401f145a-d351-4dd2-82b7-e882d683a012 Co-authored-by: kunxian-xia <1082586+kunxian-xia@users.noreply.github.com>

Copilot finished work on behalf of kunxian-xia May 26, 2026 04:22

Copilot AI requested a review from kunxian-xia May 26, 2026 04:22

spherel wants to merge 3 commits into master from fix/issue-1338-shard-ram-y-sign

spherel commented May 25, 2026 (edited by kunxian-xia)

4 participants
