Dataflow reset drops journal splitting, painful for large captures #2881

### Problem

Dataflow reset on a collection removes the collection's journal splitting configuration. Journals that were previously split revert to the default (a single journal per logical partition). For large, high-throughput captures that relied on splitting for parallelism, post-reset performance drops significantly until splits are re-applied.

Customers cannot split journals themselves — splitting is an Estuary-internal operation — so once splitting is lost on reset, throughput stays degraded until Estuary support intervenes to re-split.

### Why it matters

- Large captures (multi-TB, multi-million-row) frequently need journal splitting to keep up with source throughput or to process a backfill within an acceptable window.
- Dataflow reset is commonly used precisely for these captures (e.g. to recover from schema drift, or to restart from the current source state rather than replaying weeks of change data).
- Post-reset, the backfill is often the single largest throughput event the pipeline will ever do — exactly when split parallelism matters most.
- There is a real risk scenario: if the source's change-log retention (a Postgres replication slot, the MySQL binlog, etc.) is sensitive to replication lag, the slow post-reset catch-up can cause the capture to fall behind or the retained log to be purged, escalating to an outage.

### Prior internal discussion

- [Dataflow reset removes journal splitting, flagged as a "pile on" pain alongside the `require`-field-validation issue (see #2880)](https://estuaryworkspace.slack.com/archives/C03QBN83GQ4/p1756919667130159)
  > "collection reset also removes journal splitting which is a bit of a problem for very large captures. Ex. [customer] has one that the replication slot fell behind on over the weekend. It would have no matter what because of under-provisioning, but if that weren't the case, it would have caused a real issue... since customers can't split journals"

### Proposed resolution directions

1. **Preserve journal splits across reset.** The split configuration is an operational concern separate from the collection data / inferred schema. It could be copied forward onto the new collection state as part of reset publication.

2. **Auto-re-apply splits after reset based on a recorded target.** If we can't carry the live split state forward, we could persist the split *intent* (e.g. "this collection should have N splits") on the spec/controller, and have the control plane re-apply it once the reset completes and the collection is receiving data again.

3. **Expose journal splitting to customers.** Out of scope for this issue, but it would turn this problem into a self-service recovery rather than a support ticket. It should be tracked in a separate issue if it isn't already.

Direction (1) is likely simplest if the splitting state is recorded in a place that survives the reset's journal turnover.
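
To make the difference between directions (1) and (2) concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `CollectionState`, `KeyRange`, `target_splits`, the 32-bit key space, and all function names are hypothetical and do not reflect the actual control-plane data model or any existing API.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Optional, Tuple

KEY_SPACE = 2**32  # hypothetical hashed key space covered by a collection's journals


@dataclass(frozen=True)
class KeyRange:
    begin: int  # inclusive
    end: int    # inclusive


@dataclass(frozen=True)
class CollectionState:
    name: str
    generation: int                       # bumped on every reset (hypothetical)
    splits: Tuple[KeyRange, ...]          # live split state
    target_splits: Optional[int] = None   # recorded split *intent*, direction (2)


def even_splits(n: int) -> Tuple[KeyRange, ...]:
    """Divide the key space into n contiguous, non-overlapping ranges."""
    if n < 1:
        raise ValueError("n must be >= 1")
    step = KEY_SPACE // n
    ranges = []
    for i in range(n):
        begin = i * step
        # The last range absorbs any remainder so the whole space is covered.
        end = KEY_SPACE - 1 if i == n - 1 else (i + 1) * step - 1
        ranges.append(KeyRange(begin, end))
    return tuple(ranges)


def reset_current_behavior(old: CollectionState) -> CollectionState:
    """Models the problem described above: splits fall back to a single journal."""
    return replace(old, generation=old.generation + 1, splits=even_splits(1))


def reset_preserving_splits(old: CollectionState) -> CollectionState:
    """Direction (1): copy the split configuration forward onto the new state."""
    return replace(old, generation=old.generation + 1, splits=old.splits)


def reconcile_splits(state: CollectionState) -> CollectionState:
    """Direction (2): re-apply the recorded intent if the live state differs."""
    if state.target_splits is None or len(state.splits) == state.target_splits:
        return state
    return replace(state, splits=even_splits(state.target_splits))


if __name__ == "__main__":
    # "acmeCo/orders" is a made-up collection name.
    before = CollectionState("acmeCo/orders", 1, even_splits(4), target_splits=4)
    print(len(reset_current_behavior(before).splits))
    print(len(reset_preserving_splits(before).splits))
    print(len(reconcile_splits(reset_current_behavior(before)).splits))
```

In this sketch, direction (1) only works if `splits` is still readable when the reset is published, whereas direction (2) depends only on `target_splits` surviving on the spec, at the cost of a window after reset during which the collection runs unsplit.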

### Cross-links

- #2880 — sibling dataflow-reset pain: reset blocks on required fields that only exist in the inferred schema. Same root workflow, different failure mode.
- #2821 — truncations umbrella.

### Labels

`control-plane`, `enhance`
