Skip to content

Performance improvements#1173

Open
giltho wants to merge 3 commits into
AeneasVerif:mainfrom
soteria-tools:worktree-perf
Open

Performance improvements#1173
giltho wants to merge 3 commits into
AeneasVerif:mainfrom
soteria-tools:worktree-perf

Conversation

@giltho
Copy link
Copy Markdown
Contributor

@giltho giltho commented May 25, 2026

I spent some tokens on trying to get performance improvements on Charon. Claude found 3 optimisations, in order of effectiveness:

  • Enable parallelism in Rustc (half of the available cores, and up to 8.
  • Use FxHashMap instead of HashMap for interning
  • Lazy evaluation of constants (only when needed)

I'm not 100% sure the parallelism by default is desirable, worst case it can probably be set by Soteria itself.

Here's a list of all the changes made and their impact on the benchmark:

# Change File(s) Wall-clock impact (SpQR, preset=fast, postcard) Commit
1 Swap hash-cons interner's IndexMap to FxBuildHasher (was default SipHash) — interner is hit on every type translation charon/src/ast/hash_cons.rs -7.7% (7.811 → 7.27 s, -541 ms) 4c0a9eb9
2 Swap TypeMap's HashMap<TypeId, _> to FxHashMapTypeId already has full entropy, SipHash is pure overhead charon/src/common.rs -0.7% incremental (7.27 → 7.21 s, -60 ms) 4c0a9eb9
3 Alias HashMap/HashSet to FxHashMap/FxHashSet in the translation context (covers id_map, reverse_id_map, cached_names, processed, file_to_id, etc.) charon/src/bin/charon-driver/translate/translate_ctx.rs -1.0% incremental (7.21 → 7.13 s, -80 ms) 4c0a9eb9
4 Defer const-eval for FullDef::{Const,AssocConst}.value — eagerly building it ran rustc's const-interpreter for every const item, but the field was only consumed in one fallback path (MIR missing). Store None, evaluate on demand. charon/src/bin/charon-driver/hax/types/new/full_def.rs, charon/src/bin/charon-driver/translate/translate_bodies.rs -0.7% incremental (7.13 → 7.08 s, -50 ms) 65e8f6b2
5 Enable rustc's parallel front-end (-Z threads) at translation time, set to (available_cores / 2).clamp(1, 8). Parallelises par_hir_body_owners (typeck + const-eval — the dominant ~60% cost) charon/src/bin/charon-driver/driver.rs -20.4% incremental (7.08 → 5.65 s, -1.43 s) 028959fa
Total 5 files, +37/-4 lines -28.0% (7.86 → 5.65 s, -2.20 s, 1.39× faster)

Of course, I have reviewed this myself and nothing jumps out as insane to me

giltho added 3 commits May 25, 2026 11:27
Signed-off-by: Sacha Ayoun <sachaayoun@gmail.com>
Building `FullDefKind::Const` and `FullDefKind::AssocConst` was eagerly
running `const_value()` -> `eval_ty_constant()` -> rustc's const interpreter
for every const item. Profiling on SparsePostQuantumRatchet showed the
field's only consumer is `translate_def_body_inner`, and only on the
MIR-missing fallback path. Make it lazy: store `None`, and let the body
translator call `const_value` on demand.

Charon was duplicating eval work rustc's own `check_crate` does anyway,
so the net wall-clock win is modest (~50ms on SpQR) but the change is
strictly cleaner and removes one `sinto` per const item.
The dominant cost in charon on real crates (~60% of wall time on
SparsePostQuantumRatchet) is rustc's `check_crate` doing typeck and
const-eval, which `par_hir_body_owners` already farms out across
threads when `-Z threads` is > 1. Default `threads` to half the
available cores, capped at 8 (returns plateau there in our
measurements).

Effect on SparsePostQuantumRatchet (preset=fast, postcard output) on a
16-core M-series chip: 7.08s -> 5.61s (~20% faster).

We only set this when we're actually doing translation, so dependency
crates compiled through our `RUSTC_WRAPPER` shim still use the cargo
default.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant