
feat(artifact): add wait_for_upload_headroom producer-side backpressure hook #2721

Open
Farhad-Shabani wants to merge 1 commit into main from farhad/redis-artifact-backpressure


Farhad-Shabani (Contributor) commented Apr 16, 2026

What

Adds one defaulted trait method to ArtifactClient and one call to it in create_core_proving_task, right before the trace-chunk upload().
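A minimal sketch of the shape described above, not the actual diff. It assumes a simplified, synchronous stand-in for ArtifactClient (the real trait is likely async and has more methods), and the signatures of upload, wait_for_upload_headroom, and create_core_proving_task are hypothetical. It shows why a defaulted method is non-breaking: existing stores compile unchanged and inherit a no-op.

```rust
use std::cell::RefCell;

// Hypothetical, simplified stand-in for the real ArtifactClient trait.
// Signatures are assumptions made for illustration only.
trait ArtifactClient {
    fn upload(&self, id: &str, bytes: &[u8]) -> Result<(), String>;

    /// Defaulted hook: stores that never need to throttle producers
    /// inherit this no-op and require no code changes.
    fn wait_for_upload_headroom(&self) -> Result<(), String> {
        Ok(())
    }
}

/// An existing store that does not override the new hook.
struct InMemoryStore {
    uploads: RefCell<Vec<String>>,
}

impl ArtifactClient for InMemoryStore {
    fn upload(&self, id: &str, _bytes: &[u8]) -> Result<(), String> {
        self.uploads.borrow_mut().push(id.to_string());
        Ok(())
    }
}

/// Hypothetical call site: ask the store for headroom, then upload.
fn create_core_proving_task<C: ArtifactClient>(
    client: &C,
    id: &str,
    trace_chunk: &[u8],
) -> Result<(), String> {
    client.wait_for_upload_headroom()?; // the one added call
    client.upload(id, trace_chunk)
}
```

With the default in place, only stores that can actually run out of memory need to override the hook.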

Why

External provers running sp1-cluster with NODE_ARTIFACT_STORE=redis are hitting Redis OOM on large requests (recent repro: 278 B cycles / 266 B PGUs, ~8k shards, 512 GB Redis).

Root cause

The local node's implicit backpressure (ProverSemaphore) is lost when sp1-cluster distributes producer and consumer across a gRPC boundary. SendSpliceWorker uploads a ~200 MB trace chunk to Redis and calls submit_task; the coordinator accepts the row in milliseconds and returns, and the producer loops on to the next chunk. Redis fills faster than the GPUs drain it.

Fix

Restores backpressure by giving the artifact store a way to say "I'm full, slow down." The executor asks before every upload; the store answers based on its own state.
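One way a store could answer "I'm full, slow down" is sketched below. This is an illustration under stated assumptions, not the companion PR's implementation: BoundedStore, its byte counter, the high_watermark threshold, and the poll/timeout parameters are all hypothetical, and how a real Redis-backed store measures its own state is not specified here.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical store that tracks how many bytes it currently holds.
struct BoundedStore {
    used_bytes: AtomicU64,
    /// Assumed threshold above which producers must wait.
    high_watermark: u64,
}

impl BoundedStore {
    fn has_headroom(&self) -> bool {
        self.used_bytes.load(Ordering::Acquire) < self.high_watermark
    }

    /// Blocks the producer until usage drops below the watermark,
    /// polling at `poll` intervals and giving up after `timeout`.
    fn wait_for_upload_headroom(&self, poll: Duration, timeout: Duration) -> Result<(), String> {
        let deadline = Instant::now() + timeout;
        while !self.has_headroom() {
            if Instant::now() >= deadline {
                return Err("timed out waiting for upload headroom".to_string());
            }
            thread::sleep(poll);
        }
        Ok(())
    }

    /// Called after a producer uploads an artifact.
    fn record_upload(&self, n: u64) {
        self.used_bytes.fetch_add(n, Ordering::AcqRel);
    }

    /// Called after a consumer has read and deleted an artifact.
    fn record_consumed(&self, n: u64) {
        self.used_bytes.fetch_sub(n, Ordering::AcqRel);
    }
}
```

The producer stalls only while the store reports itself full, so the upload rate ends up tied to how fast consumers drain artifacts rather than to how fast the coordinator accepts tasks.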

Companion PR: succinctlabs/sp1-cluster#84

Farhad-Shabani force-pushed the farhad/redis-artifact-backpressure branch from d1ef1ee to fc92022 on April 16, 2026 00:30