Add provider model and token limit overrides to ProviderConfig#966
Add provider model and token limit overrides to ProviderConfig#966MackinnonBuck merged 8 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds four optional token-limit configuration fields to ProviderConfig across the Node.js, Python, Go, and .NET SDKs so BYOK/custom-provider users can override model token limits and/or specify an alternate model ID for capability-catalog limit lookup.
Changes:
- Added
maxOutputTokens,maxPromptTokens,maxContextWindowTokens, andmodelLimitsIdtoProviderConfigin Node.js, Go, and .NET. - Added corresponding snake_case fields to Python
ProviderConfigand mapped them to camelCase wire keys in the Python client. - Documented the behavior/precedence of these fields via inline comments/XML docs.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
nodejs/src/types.ts |
Extends TS ProviderConfig interface with optional token limit fields and modelLimitsId. |
python/copilot/session.py |
Extends Python ProviderConfig TypedDict with optional token-limit and model-limit lookup fields. |
python/copilot/client.py |
Maps new snake_case Python fields onto the expected camelCase wire names. |
go/types.go |
Extends Go ProviderConfig struct with new JSON fields (omitempty). |
dotnet/src/Types.cs |
Extends .NET ProviderConfig with nullable properties and JSON property names. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| /// When set, takes precedence over the default limit resolved from the model's capability catalog entry. | ||
| /// </summary> | ||
| [JsonPropertyName("maxPromptTokens")] | ||
| public int? MaxPromptTokens { get; set; } |
There was a problem hiding this comment.
Should this be called MaxInputTokens instead of MaxPromptTokens?
Does this include cached tokens?
Same question as above... is this about one request or across a sequence of calls?
There was a problem hiding this comment.
Per-request, but not sent to the API. The runtime uses this internally to decide when to truncate or compact conversation history before each LLM call. "Prompt" here means everything sent to the model in one request: system message, full conversation history up to that point, tool definitions, and the new user message. Cached tokens are counted toward the limit.
The name matches the upstream CAPI /models field (max_prompt_tokens), though MaxInputTokens would also be reasonable.
There was a problem hiding this comment.
I'd prefer MaxInputTokens. That's the more modern terminology, right? And it maps to what we show in the CLI UI?
There was a problem hiding this comment.
We could change this; it would just require changing the runtime representation as well (including the COPILOT_PROVIDER_MAX_PROMPT_TOKENS environment variable, which would probably need to be renamed).
There was a problem hiding this comment.
Changing the sdk name wouldn't, right? Only if we also wanted to change the wire name?
There was a problem hiding this comment.
Oh, if we just changed the public API but configured it to serialize with the maxInputTokens naming? Yeah that would work.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Hi @MackinnonBuck! Just wanted to follow up on this PR — is it still pending review? Adding token limit fields to ProviderConfig across all SDKs seems like a key capability for BYOK users who need to configure custom providers with specific limits. Would be great to see this move forward. Is there anything blocking it from being marked as ready for review? Thanks. |
|
@MackinnonBuck, are you still working on this? |
|
@stephentoub Thanks for the ping. I was holding off on this because we've had requests on the CLI side to allow for multiple provider configs / models per provider config, and I wanted to make sure we were properly accounting for that on the SDK APIs. However, adding new fields to the existing provider config SDK type probably doesn't make it any harder for us to adjust the API later, so it's probably fine to take this now. I'll work on getting this in a ready-to-review state tomorrow. |
Adds the following optional fields to ProviderConfig across all SDKs: - modelId: well-known model ID for agent config + token limit lookup - wireModel: model name sent to the provider API for inference - maxPromptTokens: prompt token cap (triggers compaction) - maxOutputTokens: response token cap Both modelId and wireModel default to the session's configured model when unset. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
47d0781 to
f993631
Compare
Extends existing provider-forwarding/serialization tests across all 4 SDKs to cover modelId, wireModel, maxPromptTokens, and maxOutputTokens. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
4778c24 to
09c17f7
Compare
This comment has been minimized.
This comment has been minimized.
| /// when not explicitly set. | ||
| /// </summary> | ||
| [JsonPropertyName("modelId")] | ||
| public string? ModelId { get; set; } |
There was a problem hiding this comment.
So there's:
SessionConfig.Model
ProviderConfig.ModelId
ProviderConfig.WireModel
Help me understand the relationship? If I specify SessionConfig.Model, it's used as the default for both options on ProviderConfig, and those options on provider config then represent the two different groupings in which a model would be used, such that I can override one of them? Is there any situation where I would specify the same model ID for both ModelId and WireModel?
There was a problem hiding this comment.
If only ProviderConfig.ModelId is specified, then that controls multiple things:
- The model name used by the runtime to determine model limits and model-specific agent configuration
- The model name sent to the custom provider for inference
If the model provider recognizes a model name that doesn't match the model ID known by the runtime, then ProviderConfig.WireModel can specify that.
SessionConfig.Model acts a default in case neither option is specified.
Is there any situation where I would specify the same model ID for both ModelId and WireModel?
It has the same effect as just specifying ModelId, so it's not really necessary.
| /// when not explicitly set. | ||
| /// </summary> | ||
| [JsonPropertyName("wireModel")] | ||
| public string? WireModel { get; set; } |
There was a problem hiding this comment.
Out of the three:
SessionConfig.Model
ProviderConfig.ModelId
ProviderConfig.WireModel
what's the meaning behind ProviderConfig.ModelId having an "Id" suffix and the other two not?
There was a problem hiding this comment.
I figured the "ID" more strongly implied that the value was identifying a well-known model kind. We could also consider ModelFamily? It would just require changing the runtime as well.
Adds two end-to-end tests per SDK (Node, Python, Go, .NET) that exercise the new ProviderConfig fields against the replaying CAPI proxy: - should_forward_provider_wire_model_and_max_output_tokens: verifies wireModel overrides the wire request model and maxOutputTokens is forwarded as max_tokens. - should_use_provider_model_id_as_wire_model: verifies modelId acts as the wire model when wireModel is unspecified and SessionConfig.Model is omitted. Also adds MaxTokens to the Go and .NET ChatCompletionRequest harness types so the assertion is observable, and ships two shared snapshot YAMLs under test/snapshots/session_config/. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This comment has been minimized.
This comment has been minimized.
The OpenAI BYOK provider code path in the CLI does not echo the configured maxOutputTokens as max_tokens on the wire request body (it's used internally for token budgeting and only appears on Anthropic-style requests). The new wire model E2E test asserted on max_tokens in the captured chat completion request, which always returned undefined/nil and failed across all four SDKs. Rename the test to `should forward provider wire model'' and drop the wire-side max_tokens assertion. The test still sets maxOutputTokens to confirm the SDK serializes the field without errors; per-SDK unit tests already cover ProviderConfig serialization in detail. Also drop the now-unused MaxTokens field from the Go and .NET harness ChatCompletionRequest types. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This comment has been minimized.
This comment has been minimized.
Renames the SDK-facing ProviderConfig field across all four languages while preserving the wire JSON key as maxPromptTokens: - .NET: MaxPromptTokens -> MaxInputTokens (JsonPropertyName unchanged) - Go: MaxPromptTokens -> MaxInputTokens (json tag unchanged) - Python: max_prompt_tokens -> max_input_tokens (wire conversion in _convert_provider_to_wire_format unchanged) - Node: maxPromptTokens -> maxInputTokens; adds a small toWireProviderConfig helper in client.ts that remaps the field before sending session.create / session.resume. Also rewrites the doc comments for modelId, wireModel, maxInputTokens, and maxOutputTokens to make the priority order clear: WireModel falls back to ModelId falls back to SessionConfig.Model, and ModelId drives both runtime configuration lookup and the wire model when WireModel is unset. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This comment has been minimized.
This comment has been minimized.
Remove 'using var' from three ClientE2ETests that also call ForceStopAsync in their finally block. The double-disposal (using Dispose → DisposeAsync → ForceStopAsync, plus the explicit ForceStopAsync) races on Windows when the CLI process/pipes are mid-teardown, causing OperationCanceledException to bubble up and fail an otherwise-passing test. Matches the existing pattern in SessionFsE2ETests where ForceStopAsync is wrapped in try-catch to swallow teardown-only exceptions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This reverts commit 6c794c4.
…byok-provider-token-limits
Cross-SDK Consistency Review ✅All four SDK implementations (Node.js, Python, Go, .NET) are consistent in this PR. Specifically:
Minor PR description noteThe table in the PR summary has a small labeling inconsistency in the "Wire name" column — the row for the prompt-token limit lists
|
Brought in 12 commits from origin/main, including CLI bumps to 1.0.41-0 and 1.0.41-1, plus upstream PR #966 ("Add provider model and token limit overrides to ProviderConfig"). One trivial codegen diff (single doc-comment update on `CustomAgentsUpdatedAgent.tools` for the new "or null when all tools are available" semantics). PR #966 added four new fields to ProviderConfig across all SDKs: - `model_id: Option<String>` (well-known model ID for agent config + token limit lookup; falls back to SessionConfig::model) - `wire_model: Option<String>` (model name sent to provider API for inference; falls back to model_id, then to SessionConfig::model) - `max_prompt_tokens: Option<i64>` (overrides resolved model's default max prompt tokens; triggers compaction) - `max_output_tokens: Option<i64>` (overrides resolved model's default max output tokens; truncates response) Plus matching `with_*` builders. Wire-shape: camelCase (`modelId`/`wireModel`/`maxPromptTokens`/`maxOutputTokens`), skip_serializing_if when unset. Extended `provider_config_builder_composes` test to exercise all four fields and assert their wire shape (camelCase + omission). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Extends
ProviderConfigacross all four language SDKs with optional fields for model overrides and token limits. These fields match the runtime's wire format and let BYOK users decouple the model name sent to their provider from the well-known model ID used for agent configuration and capability lookup.New fields
modelIdmodelId?: stringmodel_id: strstring? ModelIdModelID stringwireModelwireModel?: stringwire_model: strstring? WireModelWireModel stringmaxInputTokensmaxInputTokens?: numbermax_input_tokens: intint? MaxInputTokensMaxInputTokens intmaxOutputTokensmaxOutputTokens?: numbermax_output_tokens: intint? MaxOutputTokensMaxOutputTokens intAll fields are optional.
modelId— Well-known model ID used to look up agent configuration (tools, prompts, reasoning behavior) and default token limits from the capability catalog. Useful for fine-tunes that should inherit a base model''s configuration. Defaults to the session''s configured model (SessionConfig.model) when unset.wireModel— Model identifier sent to the provider API for inference. Use this when the name your provider knows (e.g. an Azure deployment name or custom fine-tune name) differs from the well-known model ID. Defaults to the session''s configured model when unset.maxPromptTokens— Maximum tokens allowed in the prompt for a single request. The runtime triggers conversation compaction before sending a request when the prompt exceeds this limit.maxOutputTokens— Maximum tokens the model can generate in a single response.modelIdandwireModelare independent knobs — set just one or both depending on whether you need to override config lookup, wire transmission, or both.Changes
nodejs/src/types.ts) — Added fields toProviderConfiginterfacepython/copilot/session.py) — Added fields toProviderConfigTypedDictpython/copilot/client.py) — Updated_convert_provider_to_wire_formatto map new snake_case fields to camelCasedotnet/src/Types.cs) — Added nullable properties with[JsonPropertyName]attributesgo/types.go) — Added fields withjson:"...,omitempty"tagsTesting
Extended existing provider-forwarding/serialization unit tests across all four SDKs to cover the new fields:
nodejs/test/client.test.ts) — Existingforwards provider headerstests forsession.create/session.resumenow also assert all four new fields on the wire payloadpython/test_client.py) — Same — extendedtest_create_session_forwards_provider_headers/_resume_to assert camelCase mapping for all four new fieldsdotnet/test/Unit/SerializationTests.cs) — ExtendedProviderConfig_CanSerializeHeaders_WithSdkOptionsto round-trip all four new fields with correct JSON property namesgo/types_test.go) — AddedTestProviderConfig_JSONIncludesAllFields(correct JSON tags) andTestProviderConfig_JSONOmitsUnsetTokenFields(omitemptyworks)All four SDKs build clean (
tsc --noEmit,go build ./...,dotnet build,python -c "import copilot") and all unit tests pass.