Conversation
Claude finished @mrain's task in 4m 34s
Review of PR #4223 — VID Performance Improvements
Summary
Solid performance work — the optimizations are well-motivated and the numbers in the PR description are impressive (4.8x disperse, 2.8x verify, 8.7x recover). The code changes are clean and correct. I left inline comments on specific files; here's the high-level:
Issues to address
Minor / informational
What looks good
Code Review
This pull request optimizes the avidm_gf2 scheme by switching the hashing algorithm to blake3 and implementing several memory and concurrency improvements, such as avoiding intermediate allocations during padding and parallelizing dispersal, verification, and recovery using rayon. Feedback identifies security risks associated with downgrading blake3 and constant_time_eq dependencies in Cargo.lock. Additionally, there is a suggestion to parallelize the sequential payload extraction in the disperse function to further improve performance.
…ms/espresso-network into cl/vidimprove
`vid/Cargo.toml` declared `[[bench]]` entries for both files, but the files themselves were not committed, breaking `cargo bench -p vid` and CI.
- avidm_gf2_ns.rs: namespaced disperse/verify/recover sweep across
num_ns ∈ {1, 10, 50, 100} matching the avidm_gf2 flat parameters.
- avidm_gf2_breakdown.rs: per-phase timing of disperse/recover, hash
parameterized so a single run reports BLAKE3 vs Keccak side-by-side.
Replace `jf_merkle_tree::hasher::HasherDigest` (which requires
`digest::Digest`) with a local `Blake3DigestAlgorithm` + `Blake3Node`
that call `blake3::hash` / `blake3::Hasher` directly. This unblocks
upgrading `blake3` past 1.5.x — newer releases bumped to `digest 0.11`
which is incompatible with `jf_merkle_tree`'s `digest 0.10` blanket
impl, forcing a workspace-wide pin on `~1.5` and dragging
`constant_time_eq` back to 0.3.x.
After this:
- Workspace pin lifted from `blake3 = "~1.5"` to `blake3 = "1"`,
resolving to 1.8.5 (constant_time_eq 0.4.2). Same blake3 algorithm,
no on-wire change.
- `vid` no longer enables `blake3/traits-preview`, an explicitly-
unstable feature whose `digest` version is the source of the pin.
- Wire format unchanged: 32-byte node values, leaf/internal domain
separators `b"1"`/`b"0"`, identical to
`HasherDigestAlgorithm + HasherNode<blake3::Hasher>`. Reference
test `test_vid_common_v2_query_data` still matches the data added
in 5cf2687 — confirmed by `cargo test -p espresso-types reference`.
- `avidm_gf2_breakdown` bench abstracted over a small `MerkleBackend`
trait so both Keccak (via `HasherDigest`) and BLAKE3 (via
`Blake3DigestAlgorithm`) plug in without code duplication.
- `vid::utils` is now `pub` so benches can reach the BLAKE3 helpers.
Addresses Cargo.lock blake3/constant_time_eq downgrade comments and
the `traits-preview` instability concern from the PR review.
Co-authored-by: Copilot <copilot@github.com>
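The leaf/internal domain separation described in the commit message above is a prefix-before-hash pattern: the separator byte is absorbed into the hasher before the payload, so a leaf and an internal node over the same bytes can never collide. A minimal std-only sketch of the pattern, with `DefaultHasher` standing in for blake3 (the function names here are illustrative, not the PR's actual API):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Absorb the domain separator first, then the payload, so leaf and
// internal hashes of identical bytes land in disjoint domains.
// (The real code uses blake3 and 32-byte node values; DefaultHasher
// is only a stand-in here.)
fn node_hash(separator: &[u8], data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    separator.hash(&mut h);
    data.hash(&mut h);
    h.finish()
}

fn leaf_hash(data: &[u8]) -> u64 {
    node_hash(b"1", data) // leaf separator, per the commit message
}

fn internal_hash(children: &[u8]) -> u64 {
    node_hash(b"0", children) // internal-node separator
}
```

The same bytes hashed as a leaf and as an internal node produce different digests, which is what keeps the wire format's 32-byte node values unambiguous.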
VID — original vs current, ST + MT, 1/10/100 ns, local + AWS
Snapshot of the cumulative impact of this PR. Workload: arity 4 (current) vs arity 3 (original), 10 MB payload.
Local (Apple Silicon)
AWS Graviton (aarch64, 1 vCPU — MT ≈ ST)
Notes
@claude review
Claude finished @mrain's task in 3m 25s
Review of PR #4223 — VID Performance Improvements
Summary
The PR is in significantly better shape than the first review. The blake3 version pin issue is resolved (now at 1.8.5 with constant_time_eq 0.4.2).
Resolved from prior review
Items to address
1.
2. Domain separator fragility in
Minor observations
3. Arity change from 2 to 4 is not documented in the version table. The Merkle tree arity changed from 2 (via
4.
```rust
if let Some(idx) = recovered.iter().rposition(|&b| b != 0)
    && recovered[idx] == 1u8
{
    recovered.truncate(idx);
}
```
This silently succeeds on malformed payloads (no pad byte found), unlike the production
5.
What looks good
Verdict
The core optimizations are correct and well-tested. The main concern is the unpublished
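The review's point about silent success on malformed padding could be addressed with an explicit error path instead of a bare truncate. A hedged sketch (the `strip_pad` name and error type are illustrative, not the production API):

```rust
// Stricter variant of the pad-stripping logic quoted in the review:
// the padding scheme appends a 0x01 marker followed by zeros, so the
// last non-zero byte of a well-formed buffer must be 0x01.
fn strip_pad(mut recovered: Vec<u8>) -> Result<Vec<u8>, &'static str> {
    match recovered.iter().rposition(|&b| b != 0) {
        Some(idx) if recovered[idx] == 1u8 => {
            recovered.truncate(idx);
            Ok(recovered)
        }
        // Last non-zero byte is not the 0x01 marker, or the buffer is
        // all zeros / empty: reject instead of silently succeeding.
        _ => Err("malformed padding: no 0x01 terminator found"),
    }
}
```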
Nextest failures (2) in this run
See the step summary for flaky tests and slowest tests.
…etwork into cl/vidimprove
Nextest failures (1) in this run
See the step summary for flaky tests and slowest tests.
Closes #<ISSUE_NUMBER>
This PR:
- VID performance improvements: see #4223 (comment)
- `extend_from_slice` instead of `flat_map().collect()`.
- `Vec<Option<&[u8]>>` refs + direct decoder→output memcpy, no `.clone()`/`.to_vec()`, no `unwrap()`.
- `chunk_and_pad`, no intermediate padded buffer, no `.to_vec()`.
- `par_iter` across namespaces in `ns_disperse`, `recover`, and `verify_share_with_verified_common`; inner `par_iter` with `find_any` short-circuit in `verify_share`.

This PR does not:
Key places to review: