Default to Balanced durability mode in cluster mode by AhmedSoliman · Pull Request #4885 · restatedev/restate

AhmedSoliman · 2026-06-04T09:55:16Z

When a cluster runs without a snapshot repository, we now default to the Balanced durability mode.
We decide whether we are in cluster mode based on the number of workers in the nodes configuration and on
whether partition replication is > 1. In that case we automatically disable trimming and WARN the user every
minute that they need to configure a snapshot store. This is safer than the previous default, which kept trimming
enabled (to save space and to reduce the risk of historical nodeset decay) but meant that partitions would fail to
move to other nodes because no snapshot was available.
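
The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names, the parameters, and the simplified `DurabilityMode` enum are hypothetical, and the description does not say how the two cluster-mode signals are combined, so the sketch assumes that either one is sufficient.

```rust
/// Simplified stand-in for the durability modes named in this PR.
/// The real type may have more variants; only these two are mentioned here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DurabilityMode {
    /// The previous default, which kept trimming enabled.
    ReplicaSetOnly,
    /// The new default for clusters without a snapshot repository.
    Balanced,
}

/// Hypothetical helper: decides whether the deployment counts as a cluster.
/// ASSUMPTION: more than one worker OR partition replication > 1 is enough.
pub fn is_cluster_mode(worker_count: usize, partition_replication: u8) -> bool {
    worker_count > 1 || partition_replication > 1
}

/// Hypothetical helper: returns `Some(Balanced)` when the new default applies.
/// `None` means "not covered by this sketch", since the description only
/// specifies the cluster-without-snapshot-repository case.
pub fn default_durability_mode(
    has_snapshot_repository: bool,
    worker_count: usize,
    partition_replication: u8,
) -> Option<DurabilityMode> {
    if !has_snapshot_repository && is_cluster_mode(worker_count, partition_replication) {
        Some(DurabilityMode::Balanced)
    } else {
        None
    }
}
```

Under these assumptions, a three-worker cluster with no snapshot repository resolves to `Balanced`, while a single worker with replication 1 is left to whatever default applies outside cluster mode.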

Bonus: This takes a couple of small steps toward fully removing the legacy "everywhere" replication from the
partition table. A TODO is left for a future proactive migration of existing clusters with old partition table metadata.

Fixes #4824

Stack created with Sapling. Best reviewed with ReviewStack.

AhmedSoliman · 2026-06-04T10:00:52Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 28f67e2693

chatgpt-codex-connector · 2026-06-04T10:05:16Z

+                partition_replication: Some(
+                    partition_table
+                        .replication_property(&Metadata::with_current(|m| m.nodes_config_ref()))
+                        .into(),


Preserve Everywhere on configuration reads

For clusters that still have legacy PartitionReplication::Everywhere metadata, this now returns a concrete replication property instead of omitting the field. If a client reads the cluster configuration and writes it back unchanged (for example while changing only bifrost_provider), update_cluster_configuration treats any Some value that is not already Limit(current) as a request to set Limit (crates/admin/src/cluster_controller/service.rs:699-701); previously the protobuf conversion encoded Everywhere as None (crates/types/src/protobuf.rs:134-140), which preserved the legacy dynamic behavior. That round-trip freezes replication to the current worker count, so later added workers are no longer automatically included as they were under Everywhere.
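
The round-trip concern can be illustrated with a small model. This is a sketch under stated assumptions, not the actual code: `PartitionReplication` is reduced to two variants with a plain `u32` limit, and `read_before`, `read_after`, and `write_back` are hypothetical names standing in for the old protobuf conversion, the new read path, and the `Some`-handling in `update_cluster_configuration` as described in this comment.

```rust
/// Simplified model of the partition replication setting.
/// ASSUMPTION: the real limit is a richer replication property, not a u32.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PartitionReplication {
    Everywhere,
    Limit(u32),
}

/// Old read path (hypothetical name): `Everywhere` is encoded as `None`.
pub fn read_before(current: PartitionReplication) -> Option<u32> {
    match current {
        PartitionReplication::Everywhere => None,
        PartitionReplication::Limit(n) => Some(n),
    }
}

/// New read path (hypothetical name): `Everywhere` is resolved to a concrete
/// value, modeled here as the current worker count.
pub fn read_after(current: PartitionReplication, worker_count: u32) -> Option<u32> {
    match current {
        PartitionReplication::Everywhere => Some(worker_count),
        PartitionReplication::Limit(n) => Some(n),
    }
}

/// Write path (hypothetical name): any `Some` value that is not already
/// `Limit(current)` is treated as a request to set `Limit`.
pub fn write_back(current: PartitionReplication, requested: Option<u32>) -> PartitionReplication {
    match requested {
        None => current,
        Some(n) if current == PartitionReplication::Limit(n) => current,
        Some(n) => PartitionReplication::Limit(n),
    }
}
```

In this model, reading a legacy `Everywhere` value with `read_before` and writing it back leaves it unchanged, whereas reading it with `read_after` on a three-worker cluster and writing it back yields `Limit(3)`.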

We really want to get rid of the legacy "everywhere" replication.

github-actions · 2026-06-04T10:26:57Z

Test Results

8 files ±0 8 suites ±0 4m 56s ⏱️ +7s
60 tests ±0 60 ✅ ±0 0 💤 ±0 0 ❌ ±0
267 runs ±0 267 ✅ ±0 0 💤 ±0 0 ❌ ±0

Results for commit 6b47208. ± Comparison against base commit 93b38cc.

tillrohrmann

Thanks for changing the default DurabilityMode from ReplicaSetOnly to Balanced @AhmedSoliman. LGTM. +1 for merging.

AhmedSoliman force-pushed the pr4885 branch 2 times, most recently from 4c9d612 to 28f67e2 Compare June 4, 2026 10:00

AhmedSoliman marked this pull request as ready for review June 4, 2026 10:01

chatgpt-codex-connector Bot reviewed Jun 4, 2026

AhmedSoliman requested a review from tillrohrmann June 4, 2026 10:39

AhmedSoliman mentioned this pull request Jun 4, 2026

Expand the usage of friendly durations #4887

Open

AhmedSoliman force-pushed the pr4885 branch from 28f67e2 to 6b47208 Compare June 4, 2026 13:02

AhmedSoliman requested a review from muhamadazmy June 4, 2026 13:03

tillrohrmann approved these changes Jun 4, 2026

AhmedSoliman merged commit 6b47208 into main Jun 4, 2026
56 of 57 checks passed

AhmedSoliman deleted the pr4885 branch June 4, 2026 13:48

github-actions Bot locked and limited conversation to collaborators Jun 4, 2026
