ci(rust): fix cache thrashing that forced cold compiles every run #1201

Open
gaspb wants to merge 2 commits into main from fix/rust-ci-cache-thrash

@gaspb gaspb commented Jul 2, 2026

Attempt to reduce cache misses and evictions and to improve build time.

gaspb and others added 2 commits July 2, 2026 22:00
The Rust CI critical path (Test job) was ~17 min, ~80% of it recompiling
dependencies from scratch. Root cause: sccache (GHA backend) and
Swatinem/rust-cache shared the single 10 GB per-repo Actions cache and
evicted each other. Even main-push runs reported "No cache found", sccache
was failing >50% of its writes (445/816 in Test), and it gave ~0% hit rate
on Clippy, which it cannot cache at all.

- Drop sccache everywhere; rely on rust-cache alone so the whole 10 GB
  budget serves the dependency cache.
- Only save the cache from main (save-if) so PR/dependabot branches restore
  main's warm cache instead of evicting it.
- Trim CI debug info to line-tables-only (dev+test, CI env only): faster
  codegen/linking and a much smaller target to cache. Local dev is unchanged.
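
The last two bullets could look roughly like the job fragment below. This is a minimal sketch, not the PR's actual diff: the job name, runner, and step layout are assumptions, and setting the debug level through `CARGO_PROFILE_*_DEBUG` environment variables is one plausible reading of "CI env only". `line-tables-only` needs Cargo 1.71 or newer.

```yaml
# Hypothetical Test job; names and layout are illustrative only.
jobs:
  test:
    runs-on: ubuntu-latest
    env:
      # Cargo maps CARGO_PROFILE_<NAME>_<KEY> onto profile settings, so these
      # override debug info in CI only and leave Cargo.toml (local dev) alone.
      CARGO_PROFILE_DEV_DEBUG: line-tables-only
      CARGO_PROFILE_TEST_DEBUG: line-tables-only
    steps:
      - uses: actions/checkout@v4
      - uses: Swatinem/rust-cache@v2
        with:
          # Every run restores the cache, but only runs on main save it, so
          # PR and dependabot branches cannot push entries that evict main's.
          save-if: ${{ github.ref == 'refs/heads/main' }}
      - run: cargo test --workspace
```

With sccache gone there is no `RUSTC_WRAPPER` or sccache-specific environment left to configure, so rust-cache is the only consumer of the Actions cache in this job.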

Free disk space step kept intentionally (past disk-pressure issues).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0176TaTFYYnLPUbQ6zthn7hD

Image builds swung between 24 and 41 min because the cargo-chef `cook` layer
(all deps, ~12 min in release) kept getting evicted from the shared 10 GB GHA
cache: 8 parallel image builds, each exporting with cache-to type=gha,mode=max,
blow the budget and evict each other. Proof from one main run: meteroid-api's
cook layer MISSED (11m49s rebuild, 26 min total) while the identical
meteroid-scheduler cook layer was CACHED (9m53s total).

- Move layer cache to the registry (GHCR), one ref per image: no 10 GB cap so
  the cook layer stays warm. Only main writes it, so branches can't evict it.
- Wire mold as the linker: it was installed in every image but unused because
  the .cargo/config.toml mold config is .dockerignored. Set
  RUSTFLAGS=-Clink-arg=-fuse-ld=mold (verified locally: readelf shows the mold
  stamp in the linked binary) so release links stop using the slow default ld.
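
The registry-backed layer cache from the first bullet might be wired up as in the fragment below. It is a sketch under stated assumptions rather than the PR's actual workflow: the `OWNER` placeholder, the `buildcache` tag, and the step layout are hypothetical, and the job is assumed to have `packages: write` permission so main can push the cache ref.

```yaml
# Hypothetical steps for one image; repeat with a distinct cache ref per image.
steps:
  - uses: actions/checkout@v4
  - uses: docker/setup-buildx-action@v3
  - uses: docker/login-action@v3
    with:
      registry: ghcr.io
      username: ${{ github.actor }}
      password: ${{ secrets.GITHUB_TOKEN }}
  - uses: docker/build-push-action@v6
    with:
      context: .
      push: ${{ github.ref == 'refs/heads/main' }}
      tags: ghcr.io/OWNER/meteroid-api:latest
      # Every run reads the cache ref stored in the registry.
      cache-from: type=registry,ref=ghcr.io/OWNER/meteroid-api:buildcache
      # Only main exports it; on other refs the expression yields an empty
      # string, so nothing is written and branches cannot overwrite the ref.
      cache-to: ${{ github.ref == 'refs/heads/main' && 'type=registry,ref=ghcr.io/OWNER/meteroid-api:buildcache,mode=max' || '' }}
```

Using one cache ref per image keeps the exports of the parallel builds from overwriting each other, and `mode=max` also exports intermediate layers such as the `cook` stage rather than only the final image layers.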

Expected: meteroid-api (critical path) ~26 min cold -> ~10 min warm, roughly
halving the workflow.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0176TaTFYYnLPUbQ6zthn7hD