Introduce x-str-minimum and x-str-maximum validation keywords for numeric-formatted strings

Currently, materialization connectors use the `minimum` and `maximum` JSON Schema attributes to choose the underlying data types for fields mapped as `type: integer` and `type: number`. This works well for native numeric fields.

However, for fields mapped as `type: string` with `format: integer` or `format: number`, we lack equivalent boundary attributes. Without min/max ranges for these fields, materialization connectors cannot make informed decisions about the best database types to use for numbers that are represented as strings.
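To make the type-selection step concrete, here is a minimal sketch of how a connector might map integer bounds to a column type. The function name `pick_integer_column_type` and the SQL type names are hypothetical, chosen only for illustration; actual connectors may use different rules and types.

```python
from typing import Optional

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def pick_integer_column_type(minimum: Optional[int], maximum: Optional[int]) -> str:
    """Return a hypothetical SQL type name for an integer field with these bounds."""
    if minimum is None or maximum is None:
        # Without both bounds the range is unknown, so fall back to the widest type.
        return "NUMERIC"
    if INT32_MIN <= minimum and maximum <= INT32_MAX:
        return "INTEGER"
    if INT64_MIN <= minimum and maximum <= INT64_MAX:
        return "BIGINT"
    return "NUMERIC"
```

A numeric-formatted string field has no bounds today, so under this sketch it would always take the `None` branch and land on the widest type.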

## Problem
The initial idea was to apply the existing `minimum` and `maximum` JSON Schema attributes directly to numeric-formatted string fields. However, because these are strict validation keywords (not just annotations) within JSON Schema, reusing them for strings introduces two major issues:

1. **Loss of precision (Lossiness):** Reusing the same keywords conflates the bounds of native numbers with the bounds of numeric strings. A string-encoded value might exceed a 32-bit or 64-bit range while the native numeric values do not, and a single shared bound gives connectors no way to tell the two cases apart and act on them.
2. **Breaking backfills:** Validation is critical to the end-to-end schema inference process. If we started enforcing standard `maximum` validation against strings, established collections with existing strings larger than the inferred `maximum` would permanently break during materialization backfills. Connectors rely on validation failures to safely halt, restart, and apply required DDL changes *before* processing widened documents.
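The first issue can be illustrated with a small sketch. The observed values and the `bounds` helper are hypothetical; the point is only that one shared pair of bounds hides which encoding produced the extreme value.

```python
from decimal import Decimal


def bounds(values):
    """Return (min, max) of the given Decimals, or None if there are none."""
    values = list(values)
    return (min(values), max(values)) if values else None


# Hypothetical values observed at one location that mixes both encodings.
observed = [5, 12, "99999999999999999999", "-3"]

native = [Decimal(v) for v in observed if isinstance(v, int)]
as_str = [Decimal(v) for v in observed if isinstance(v, str)]

shared_bounds = bounds(native + as_str)  # one keyword pair covering everything
native_bounds = bounds(native)           # what `minimum`/`maximum` would hold
string_bounds = bounds(as_str)           # what separate string keywords would hold
```

With shared bounds, the native values appear to need more than 64 bits even though they fit comfortably in 32; with separate bounds, each encoding keeps its own range.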

## Proposed Solution
Introduce two new custom validation keywords specifically for string fields:
* `x-str-minimum`
* `x-str-maximum`
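As a rough sketch of what these keywords could look like in practice, the example below shows a schema fragment and a bounds check written in Python. The issue does not specify how bound values are encoded, so storing them as integers here is an assumption, and `check_str_bounds` is a hypothetical helper, not the validation crate's API.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical schema fragment for a string location that holds integer text.
# Assumption: bound values are stored as numbers; the real encoding may differ.
schema = {
    "type": "string",
    "format": "integer",
    "x-str-minimum": 0,
    "x-str-maximum": 9223372036854775807,  # largest signed 64-bit integer
}


def check_str_bounds(schema: dict, value: str) -> bool:
    """Return True if `value` parses as a number within the schema's string bounds."""
    try:
        number = Decimal(value)
    except InvalidOperation:
        return False  # not numeric text at all
    if not number.is_finite():
        return False  # reject NaN and infinities, which cannot be range-compared
    lo = schema.get("x-str-minimum")
    hi = schema.get("x-str-maximum")
    if lo is not None and number < Decimal(lo):
        return False
    if hi is not None and number > Decimal(hi):
        return False
    return True
```

`Decimal` is used so that strings wider than 64 bits are compared exactly rather than rounded through a float.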

## Implementation Details / Requirements

* **Schema Inference Update:** Schema inference should begin populating `x-str-minimum` and `x-str-maximum` for *new* locations where `type: string` with `format: integer` or `format: number` is detected.
* **Backward Compatibility:** These new bounds should *not* be added to established string locations that do not already have them (matching our existing behavior for standard `minimum`/`maximum` to prevent breaking existing streams).
* **Validation Enforcement:** These **must** be implemented as strictly enforced validations in the validation crate, not merely as data-plane annotations. 
* **Connector Guarantee:** This enforcement guarantees that a connector will never receive a document that violates an `x-str-maximum` or `x-str-minimum` setting it was provided at startup, giving it the opportunity to halt and apply DDL updates if the schema widens.
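The inference and backward-compatibility rules above can be sketched as follows. The function `infer_str_bounds` and its dict-based schema representation are hypothetical and simplified (the format is hard-coded to `integer`); the actual inference code will differ.

```python
from decimal import Decimal
from typing import Optional


def infer_str_bounds(existing: Optional[dict], value: str) -> dict:
    """Fold one observed numeric string into a location's inferred schema.

    `existing` is None when the location is being seen for the first time.
    """
    number = Decimal(value)
    if existing is None:
        # New location: start populating both bounds from the first value.
        return {
            "type": "string",
            "format": "integer",
            "x-str-minimum": number,
            "x-str-maximum": number,
        }
    if "x-str-minimum" not in existing or "x-str-maximum" not in existing:
        # Established location without bounds: leave it unbounded.
        return dict(existing)
    # Location that already carries bounds: widen them to include the value.
    widened = dict(existing)
    widened["x-str-minimum"] = min(existing["x-str-minimum"], number)
    widened["x-str-maximum"] = max(existing["x-str-maximum"], number)
    return widened
```

Under this sketch, a connector that started with the narrower schema would see validation fail on a value outside its bounds, which is its cue to halt, pick up the widened schema, and apply any DDL changes before continuing.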

Introduce x-str-minimum and x-str-maximum validation keywords for numeric-formatted strings #2895

Problem

Proposed Solution

Implementation Details / Requirements

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Introduce x-str-minimum and x-str-maximum validation keywords for numeric-formatted strings #2895

Description

Problem

Proposed Solution

Implementation Details / Requirements

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions