Problem
Dataflow reset on a collection removes the collection's journal splitting configuration: journals that were previously split revert to the default of a single journal per logical partition. For large, high-throughput captures that relied on splitting for parallelism, post-reset throughput drops significantly until splits are re-applied.
Customers cannot split journals themselves — splitting is an Estuary-internal operation — so once splitting is lost on reset, throughput stays degraded until Estuary support intervenes to re-split.
Why it matters
- Large captures (multi-TB, multi-million-row) frequently need journal splitting to keep up with source throughput or to process a backfill within an acceptable window.
- Dataflow reset is commonly used precisely for these captures (e.g. to recover from schema drift, or to restart from the current source state rather than replaying weeks of change data).
- Post-reset, the backfill is often the single largest throughput event the pipeline will ever do — exactly when split parallelism matters most.
- There is a real risk scenario: if the source is sensitive to replication lag (e.g. a Postgres replication slot holding back WAL, or a MySQL binlog retention window), the slow post-reset catch-up can cause the slot to fall behind or be dropped, escalating to an outage.
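To make the parallelism point concrete, here is a back-of-the-envelope sketch of catch-up time with and without splits. The throughput and backlog numbers are invented for illustration; the function name is not part of any Estuary tooling.

```go
// Illustrative only: rough catch-up-time arithmetic showing why split
// parallelism matters most right after a reset.
package main

import "fmt"

// catchupHours estimates how long a backfill takes if each journal split
// sustains perSplitMBps and the splits drain the backlog in parallel.
func catchupHours(backlogGB float64, splits int, perSplitMBps float64) float64 {
	totalMBps := float64(splits) * perSplitMBps
	return (backlogGB * 1024) / totalMBps / 3600
}

func main() {
	// A 2 TB backfill at an assumed 20 MB/s per split:
	fmt.Printf("1 split:  %.1f h\n", catchupHours(2048, 1, 20)) // ~29.1 h
	fmt.Printf("8 splits: %.1f h\n", catchupHours(2048, 8, 20)) // ~3.6 h
}
```

With the splits collapsed by reset, the same backfill takes roughly eight times longer, which is exactly the window in which a lag-sensitive source can fall over.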
Prior internal discussion
- require-field-validation issue (see #2880)
Proposed resolution directions
1. Preserve journal splits across reset. The split configuration is an operational concern separate from the collection data / inferred schema. It could be copied forward onto the new collection state as part of the reset publication.
2. Auto-re-apply splits after reset based on a recorded target. If we can't carry the live split state forward, we could persist the split intent (e.g. "this collection should have N splits") on the spec/controller, and have the control plane re-apply it once the reset completes and the collection is receiving data again.
3. Expose journal splitting to customers. Out of scope for this issue, but would make this problem a self-service recovery rather than a support ticket. Tracked elsewhere if it isn't already.
Direction (1) is likely simplest if the splitting state is recorded in a place that survives the reset's journal turnover.
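Direction (2) could be sketched roughly as follows. This is an illustrative model only, not a proposal for concrete APIs: `Collection`, `SplitIntent`, `RecordSplitIntent`, and `ReapplySplits` are all invented names, and the real controller would operate on journal specs rather than an in-memory struct.

```go
// Hypothetical sketch of direction (2): record the split intent durably,
// then have a controller re-apply it after the reset completes.
package main

import "fmt"

// Collection models only the state relevant to journal splitting.
type Collection struct {
	Name               string
	SplitsPerPartition int  // current live split count per logical partition
	Live               bool // true once the collection is receiving data again
}

// SplitIntent is the durable record of how a collection *should* be split.
// It would live on the spec/controller state, surviving journal turnover.
type SplitIntent struct {
	Collection         string
	SplitsPerPartition int
}

// RecordSplitIntent captures the live split configuration at reset time.
func RecordSplitIntent(c *Collection) SplitIntent {
	return SplitIntent{Collection: c.Name, SplitsPerPartition: c.SplitsPerPartition}
}

// Reset models today's behavior: journal turnover collapses every logical
// partition back to a single journal.
func (c *Collection) Reset() {
	c.SplitsPerPartition = 1
	c.Live = false
}

// ReapplySplits is the controller step: once the collection is receiving
// data again, restore the recorded split count. Returns false if it must
// wait for the collection to come back.
func ReapplySplits(c *Collection, intent SplitIntent) bool {
	if !c.Live {
		return false
	}
	if c.SplitsPerPartition < intent.SplitsPerPartition {
		c.SplitsPerPartition = intent.SplitsPerPartition
	}
	return true
}

func main() {
	c := &Collection{Name: "acme/orders", SplitsPerPartition: 8, Live: true}
	intent := RecordSplitIntent(c)
	c.Reset()
	fmt.Println("post-reset splits:", c.SplitsPerPartition) // 1

	c.Live = true // reset publication completed, data flowing again
	ReapplySplits(c, intent)
	fmt.Println("reconciled splits:", c.SplitsPerPartition) // 8
}
```

Gating the re-apply on the collection being live matches the note in direction (2): splitting an empty, not-yet-publishing collection would be wasted (or racy) work.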
Cross-links
Labels
control-plane, enhance