Switch pod-scheduler configs from JSON to YAML; add JSON Schema by LoopedBard3 · Pull Request #2170 · aspnet/Benchmarks

LoopedBard3 · 2026-05-07T20:26:59Z

Why

Follow-up to #2167. The pod-scheduler configs introduced there are JSON, which makes them awkward to read and edit — there are no comments, lots of quote/comma noise, and type: 3 is a magic integer you have to grep models.py to understand. This is also rough for LLM-driven edits, which tend to revert to literal JSON and don't have anywhere to anchor docs.

This PR switches the configs to YAML, ships a JSON Schema, and makes scenario.type a self-documenting string.

What

Format

Move build/benchmarks_ci{,_azure,_cobalt}_pods.json → .yml. Each opens with:
```
# yaml-language-server: $schema=../scripts/pod-scheduler/pod-config.schema.json
```
so VS Code / Cursor / the Red Hat YAML extension provide autocomplete, hover docs, and inline validation while editing.
Inline comments now document things that used to live only in PR review threads — e.g., that gold-lin and gold-win share gold-db, or that the -28 cobalt-hosted pods reuse the same physical machines as their non-28 siblings.

Self-documenting `type`

type: 1|2|3 → type: single|dual|triple (case-insensitive). Integer form is still accepted so any local .json copies keep loading; bools (the YAML yes/no trap) and unknown strings raise ConfigError.

Schema

New scripts/pod-scheduler/pod-config.schema.json (Draft 2020-12). Schema descriptions double as grounding for LLM agents — they should hallucinate fewer keys and wrong type values.
Schema and configs validated locally with the jsonschema package.

Loader

config_loader.py dispatches on file extension: .yml/.yaml via PyYAML, .json via stdlib (back-compat), unknown extensions → ConfigError.

Generated pipelines

benchmarks-ci-01.yml, -02.yml, -azure.yml, -cobalt.yml regenerated. Only the embedded regen-command header line changed (.json → .yml); the snapshot test confirms scheduling output is byte-identical.

Tests

49 unit tests pass. 5 new cases:
- All three string aliases for type with case variations.
- Integer back-compat for type.
- Bool and unknown-string type rejection.
- .json back-compat happy path (default fixture writer is now YAML, so every existing test exercises the YAML path).
- Unknown-extension rejection.

Docs

Repo README.md, build/README.md, and scripts/pod-scheduler/README.md updated with YAML examples and a section on the schema directive.

Verification

cd scripts/pod-scheduler
python -m unittest discover tests
# Ran 49 tests in 0.344s — OK

The snapshot test enforces that regenerating from the new YAML configs produces byte-identical pipeline YAML, so this is a config/format refactor with no behavioral change.

Notes for reviewers

I kept the _format_source_path repo-relative URL behavior; it now expects .yml paths in the test, which matches what main.py is invoked with from CI.
_parse_scenario_type explicitly rejects bool because yaml.safe_load happily turns yes/no/true/false into Python bools, which would otherwise quietly resolve to 1/0 via the integer alias. Better to fail loudly.
Schema asserts additionalProperties: false on pods/scenarios so a misspelled key (e.g. mahcines: instead of machines:) is caught at edit-time by the YAML LSP.

The three pod configs in build/ were JSON, which made them hard to read and didn't allow inline comments. This converts them to YAML, ships a JSON Schema describing the shape, and changes scenario.type from a magic 1/2/3 integer to a self-documenting single/dual/triple string. * build/benchmarks_ci_pods.yml, benchmarks_ci_azure_pods.yml, and benchmarks_ci_cobalt_pods.yml replace their .json predecessors. Each opens with a yaml-language-server # yaml-language-server: schema=... directive so VS Code / Cursor / the YAML LSP provide autocomplete, hover docs, and inline validation while editing. * scripts/pod-scheduler/pod-config.schema.json (new) is the schema. The schema descriptions also serve as reliable grounding for LLM-driven edits. * config_loader.py now dispatches on file extension and accepts both YAML and JSON. scenario.type accepts single|dual|triple (case-insensitive) plus the legacy integer 1|2|3 for back-compat; bools (which YAML parsers happily produce from yes/no) are rejected explicitly. Unknown extensions and unknown type strings raise ConfigError so typos can't silently drop scenarios. * Generated benchmarks-ci-*.yml files only change the embedded regen command in the file header (.json -> .yml). Schedule data is byte-identical, verified by the existing snapshot test. * Tests: 5 new cases covering YAML loading, type-string aliases, type back-compat, bool/invalid rejection, and unknown-extension rejection. Default fixture writer now produces YAML so the YAML path is exercised on every test. * Docs: README.md, build/README.md, and scripts/pod-scheduler/README.md are updated with YAML examples and the schema-directive convention. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR migrates the pod-scheduler’s source-of-truth configs in build/ from JSON to YAML to improve readability (comments) and editor support, and introduces a JSON Schema to validate/configure those YAML files while keeping JSON loading for backward compatibility.

Changes:

Converted the three build/benchmarks_ci_*_pods.json configs to .yml equivalents and updated references/docs accordingly.
Added scripts/pod-scheduler/pod-config.schema.json and adopted yaml-language-server $schema directives in shipped configs.
Updated the loader to dispatch by extension (YAML/JSON) and to accept scenario.type as single|dual|triple (case-insensitive) while still allowing legacy 1|2|3.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
scripts/pod-scheduler/config_loader.py	Load YAML/JSON by extension; add scenario type aliasing and validation.
scripts/pod-scheduler/pod-config.schema.json	New JSON Schema describing the YAML/JSON config shape for LSP validation.
scripts/pod-scheduler/models.py	Update docs/comments to reflect YAML/JSON config sources.
scripts/pod-scheduler/generator.py	Update generated header regen command to reference `.yml`.
scripts/pod-scheduler/scheduler.py	Docstring wording updated from JSON-specific to config-generic.
scripts/pod-scheduler/main.py	CLI help/usage updated to YAML/JSON.
scripts/pod-scheduler/README.md	Docs updated with YAML examples and schema-driven editing guidance.
scripts/pod-scheduler/tests/test_config_loader.py	Add YAML-path tests + scenario.type alias/back-compat coverage.
scripts/pod-scheduler/tests/test_generator.py	Update expected filenames/paths from `.json` to `.yml`.
scripts/pod-scheduler/tests/test_snapshots.py	Snapshot inputs updated to new `.yml` configs.
build/benchmarks_ci_pods.yml	New YAML config replacing JSON for on-prem CI pods.
build/benchmarks_ci_azure_pods.yml	New YAML config replacing JSON for Azure CI pods.
build/benchmarks_ci_cobalt_pods.yml	New YAML config replacing JSON for Cobalt CI pods.
build/benchmarks_ci_pods.json	Removed legacy JSON config (replaced by YAML).
build/benchmarks_ci_azure_pods.json	Removed legacy JSON config (replaced by YAML).
build/benchmarks_ci_cobalt_pods.json	Removed legacy JSON config (replaced by YAML).
build/benchmarks-ci-01.yml	Regenerated header updated to `.yml` config path.
build/benchmarks-ci-02.yml	Regenerated header updated to `.yml` config path.
build/benchmarks-ci-azure.yml	Regenerated header updated to `.yml` config path.
build/benchmarks-ci-cobalt.yml	Regenerated header updated to `.yml` config path.
build/README.md	Docs updated to reference YAML configs and schema directive.
README.md	Top-level doc updated to say pipelines are generated from YAML configs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Three of the review comments on aspnet#2170 flagged the same class of issue around YAML/Python edge cases. Addressing them all here: * scripts/pod-scheduler/requirements.txt (new) declares the runtime PyYAML dependency that landed with the YAML migration. Without this, a clean checkout fails with ModuleNotFoundError. * scripts/pod-scheduler/README.md gains a Prerequisites subsection pointing at requirements.txt so contributors know what to install. * config_loader._reject_bool / _coerce_int / _coerce_float (new helpers) explicitly reject Python bools in numeric fields. Previously `timeout: yes` would slip through `int(timeout)` and resolve to 1 because bool is a subclass of int. The new helpers also enforce range minimums (target_yaml_count >= 1, schedule_offset_hours >= 0, timeout >= 1, estimated_runtime >= 0) so out-of-band values fail loudly at load time. * test_config_loader.py adds 6 cases covering bool rejection on each numeric field plus negative-runtime and below-minimum target-count. All 55 tests pass. The schema already used "type": "integer" for these fields (which in JSON Schema 2020-12 explicitly excludes booleans), so editor-side validation was already correct; this commit closes the runtime gap. Generated pipeline YAML still regenerates byte-identical to HEAD. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@sebastienros

Per @sebastienros's review feedback on aspnet#2170, the named-key form for `machines` / `profiles` was visually noisy when almost every pod follows the same SUT-load-DB convention. Both blocks now accept either: * a positional list of 1-3 entries (`[sut]` / `[sut, load]` / `[sut, load, db]`), preferred for new pods, or * the existing named-dict form (`{sut: ..., load: ..., db: ...}`), still supported. The two blocks for a single pod must use the same form and declare the same set of roles; mixing list and dict, or differing lengths, raises `ConfigError` at load time so the role mapping can't drift silently. Schema (pod-config.schema.json) gets a `oneOf` for both blocks; the YAML LSP catches shape errors at edit time. * `config_loader._normalize_roles` resolves either form to a canonical ordered dict and rejects unknown roles, bools, non-strings, and empty values. The pod loader then enforces same-shape and same-role-set between machines and profiles before constructing the Pod. * All three committed configs (`benchmarks_ci_pods.yml`, `benchmarks_ci_azure_pods.yml`, `benchmarks_ci_cobalt_pods.yml`) switch to the shorthand form. Each gets a header comment above its pods block explaining the convention. Pods that re-use a machine across roles (e.g. `cobalt-cloud-lin-azl3-dual` puts the db box in the `load` slot) keep the inline comment that flags the reuse. * `scripts/pod-scheduler/README.md` gains a Pod Definition section showing both forms and recommending the list form for new pods. * test_config_loader.py adds 10 cases covering shorthand happy paths (1/2/3 entries), mixed-shape rejection, length mismatch, empty/too-long lists, bool entries, and unknown-role keys. All 65 tests pass. Generated pipeline YAMLs are byte-identical to HEAD: this is purely an input-format change, the schedule data is unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

sebastienros · 2026-05-07T22:10:56Z

-    machines: { sut: intel-lin, load: intel-load,  db: intel-db }
-    profiles: { sut: intel-lin-app, load: intel-load-load,  db: intel-db-db }
+    machines: [intel-lin, intel-load,  intel-db]
+    profiles: [intel-lin-app, intel-load-load,  intel-db-db]


What I meant is that the one of these should not need the role name (application, load, db), scheduling doesn't care about it.

sebastienros · 2026-05-07T22:12:32Z

    # of the same stage when both run a triple-type scenario.
-    machines: { sut: intel-win, load: intel-load2, db: intel-db }
-    profiles: { sut: intel-win-app, load: intel-load2-load, db: intel-db-db }
+    machines: [intel-win, intel-load2, intel-db]


Feels weird to use the json syntax in yml (though totally valid)

machines: - intel-win - intel-load2 - intel-db

Also note that YML "IS" JSON, so a JSON document is valid yaml, in case in python you'd want to only handle the yml parsing and not check the extension, just say it's yaml.

@sebastienros

Two simplifications from @sebastienros's round 3 review on aspnet#2170: * Drop the per-extension dispatch in config_loader._load_raw. YAML is a strict superset of JSON, so yaml.safe_load handles both. Old .json configs still load via the YAML parser, and the loader no longer cares about the file extension. _load_raw is a one-liner now. * Reshape pod definitions into parallel arrays. Both `machines` and `profiles` are flat 1-3 entry lists; machines[i] runs profiles[i]. The role of slot i is implicit by length (1=SUT, 2=SUT+load, 3=SUT+load+db) so neither list carries explicit `sut`/`load`/`db` keys. The named-dict form is no longer accepted. The loader now enforces, per pod: * len(machines) == len(profiles), so each machine pairs with exactly one profile. * machines and profiles are both lists of 1-3 unique non-empty strings. * bools (yes/no/true/false from YAML) are rejected explicitly. Per-pod uniqueness is enforced; global `machine -> single profile` isn't, because real configs legitimately reuse a physical box across roles in different pods (e.g. azure2-db is db for one pod and load for another). Test `test_global_machine_can_have_different_profiles_across_pods` documents that as intended. Configs: * All three .yml configs reformatted from inline flow style `[a, b, c]` to block style and converted to the parallel-array form. Each retains the existing inline comment for pods that re-use machines across roles. The header comment over each file's pods block now describes the parallel-array semantics. * Schema (`pod-config.schema.json`) tightens machines/profiles to `array of unique strings, minItems 1, maxItems 3`; the old oneOf with the dict variant is gone. * README's Pod Definition section rewritten around the new shape. Tests: * Drop tests for removed behavior (extension rejection, dict form, named-role rejection, mixed-shape rejection). * Add `TestPodRoles` covering single/dual/triple happy paths, length-mismatch, machines uniqueness, profiles uniqueness, empty/too- long lists, dict form rejection, bool entries, and the global same-machine/different-profile case. * 66 tests pass (was 65). Generated pipeline YAMLs are byte-identical to HEAD; this is purely an input-format simplification. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Future-proofing tweak following the round-3 review on aspnet#2170. Removes three of the four `maxItems: 3` hardcoding sites so adding a fourth role later (e.g. a second load machine) only requires a schema bump. * models.ScenarioType IntEnum is gone. `Scenario.type` is now a plain `int` (1/2/3 == single/dual/triple). `SCENARIO_TYPE_SINGLE` / `DUAL` / `TRIPLE` module constants stay for readable references. `DEFAULT_RUNTIMES` is keyed by int. * `Pod` no longer has `sut` / `load` / `db` / `*_profile` fields. The dataclass stores `machines: List[str]` and `profiles: List[str]` (parallel arrays). Backward-compat properties (`pod.sut`, `pod.load`, etc.) read from the lists, so call sites that need a named slot (the conflicts diagnostic) keep working unchanged. * `pod.machines_for_type(t)` and `profiles_for_type(t)` are slices (`self.machines[:t]`); `validate(t)` is just `t > len(machines)`. No more 2/3 magic numbers branching on which fields are populated. * `__post_init__` enforces `len(machines) == len(profiles)` and non-empty machines, so an invalid Pod can't be constructed. * `main.print_pod_conflicts` iterates `pod.machines` against `ROLE_NAMES` (a new module-level constant) for the diagnostic output, so adding a 4th role just needs a name appended to `ROLE_NAMES` -- everything else falls out of `len(pod.machines)`. * All call sites and tests updated. `test_models` gains 4 cases exercising the new constructor invariants and the backward-compat property accessors. 70 tests pass (was 66). The schema's `maxItems: 3` is now the single remaining `3` hardcoding -- that's the user-facing constraint, which is the right place for it. Generated pipeline YAMLs are byte-identical to HEAD; this is purely an internal model refactor. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

LoopedBard3 self-assigned this May 7, 2026

LoopedBard3 requested a review from Copilot May 7, 2026 20:27

Copilot started reviewing on behalf of LoopedBard3 May 7, 2026 20:27 View session

Copilot AI reviewed May 7, 2026

View reviewed changes

Comment thread scripts/pod-scheduler/config_loader.py

Comment thread scripts/pod-scheduler/config_loader.py Outdated

Comment thread scripts/pod-scheduler/README.md

sebastienros reviewed May 7, 2026

View reviewed changes

Comment thread build/benchmarks_ci_azure_pods.yml Outdated

LoopedBard3 and others added 2 commits May 7, 2026 14:45

sebastienros reviewed May 7, 2026

View reviewed changes

LoopedBard3 and others added 2 commits May 7, 2026 15:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Switch pod-scheduler configs from JSON to YAML; add JSON Schema#2170

Switch pod-scheduler configs from JSON to YAML; add JSON Schema#2170
LoopedBard3 wants to merge 5 commits intoaspnet:mainfrom
LoopedBard3:loopedbard3/pod-config-format-options

LoopedBard3 commented May 7, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sebastienros May 7, 2026

Uh oh!

sebastienros May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

LoopedBard3 commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

What

Format

Self-documenting type

Schema

Loader

Generated pipelines

Tests

Docs

Verification

Notes for reviewers

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sebastienros May 7, 2026

Choose a reason for hiding this comment

Uh oh!

sebastienros May 7, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

LoopedBard3 commented May 7, 2026 •

edited

Loading

Self-documenting `type`