fix: add opencode-zen, xiaomi, nvidia to STREAMING_ENDPOINT_PROVIDERS #2141

Open
TF0rd wants to merge 1 commit into mnfst:main from TF0rd:fix/add-streaming-providers

@TF0rd TF0rd commented Jun 7, 2026

Problem

The response-mode-guard silently drops routes from providers not in STREAMING_ENDPOINT_PROVIDERS when a tier uses response_mode: stream. For users routing through opencode-zen, xiaomi, or nvidia, the primary and fallback models are filtered out at runtime with no warning, and the first stream-capable fallback is silently promoted to primary. This is confusing and defeats the purpose of configuring a fallback chain.

Specifically:

  • opencode-zen (MiniMax M3, Qwen 3.6 Plus, etc.) — filtered out
  • xiaomi (MiMo-V2.5, MiMo-V2.5-Pro) — filtered out
  • nvidia (Nemotron 3 Ultra, Nemotron 3 Super, etc.) — filtered out
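
The filtering behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual guard: the `Route` shape, the `"buffered"` mode, the `guardRoutes` function, and the contents of the set shown here are all assumptions made for the example.

```typescript
// Hypothetical shapes: the real guard and route types in the repository may differ.
type ResponseMode = "stream" | "buffered";

interface Route {
  provider: string;
  model: string;
}

// Illustrative subset only; not the actual contents of the set in model-capabilities.ts.
const STREAMING_ENDPOINT_PROVIDERS = new Set<string>(["openai", "anthropic"]);

// Keeps only routes whose provider is in the allowlist when the tier streams.
// The dropped routes are returned too, so a caller could log them.
function guardRoutes(
  routes: Route[],
  mode: ResponseMode,
  allowlist: ReadonlySet<string> = STREAMING_ENDPOINT_PROVIDERS,
): { kept: Route[]; dropped: Route[] } {
  if (mode !== "stream") return { kept: routes, dropped: [] };
  const kept: Route[] = [];
  const dropped: Route[] = [];
  for (const route of routes) {
    (allowlist.has(route.provider) ? kept : dropped).push(route);
  }
  return { kept, dropped };
}

// A fallback chain whose primary and first fallback are not in the allowlist.
const chain: Route[] = [
  { provider: "opencode-zen", model: "primary-model" },
  { provider: "xiaomi", model: "first-fallback" },
  { provider: "openai", model: "second-fallback" },
];

const { kept, dropped } = guardRoutes(chain, "stream");
// kept now starts with the second fallback, which has effectively become primary.
```

With a chain like this, the caller sees a single surviving route and has no indication that two earlier entries were removed unless it inspects `dropped`.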

All three providers support SSE streaming. Their entries in PROVIDER_ENDPOINTS (provider-endpoints.ts) already include ...openaiStreamUsage, which configures the proxy to request stream_options.include_usage for streaming responses.
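
As a rough sketch of what spreading `openaiStreamUsage` into an endpoint entry could look like: the `EndpointConfig` fields, the `streamOptions` key, and the `buildStreamingBody` helper are hypothetical names chosen for this example. Only `PROVIDER_ENDPOINTS`, `openaiStreamUsage`, the nvidia host, and the `stream_options.include_usage` request field come from the PR text or the OpenAI-compatible API.

```typescript
// Hypothetical config shape; the real entries in provider-endpoints.ts may differ.
interface EndpointConfig {
  baseUrl: string;
  path: string;
  streamOptions?: { include_usage: boolean };
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  stream: true;
  stream_options?: { include_usage: boolean };
}

// Shared fragment spread into every provider that reports usage in streams.
const openaiStreamUsage: Pick<EndpointConfig, "streamOptions"> = {
  streamOptions: { include_usage: true },
};

const PROVIDER_ENDPOINTS: Record<string, EndpointConfig> = {
  nvidia: {
    baseUrl: "https://integrate.api.nvidia.com",
    path: "/v1/chat/completions",
    ...openaiStreamUsage,
  },
};

// Builds a streaming request body, adding stream_options only when configured.
function buildStreamingBody(
  provider: string,
  model: string,
  messages: ChatMessage[],
): ChatRequestBody {
  const cfg = PROVIDER_ENDPOINTS[provider];
  if (!cfg) throw new Error(`unknown provider: ${provider}`);
  const body: ChatRequestBody = { model, messages, stream: true };
  if (cfg.streamOptions) body.stream_options = cfg.streamOptions;
  return body;
}

const body = buildStreamingBody("nvidia", "example-model", [
  { role: "user", content: "hello" },
]);
```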

Fix

Add 'opencode-zen', 'xiaomi', and 'nvidia' to the STREAMING_ENDPOINT_PROVIDERS set in model-capabilities.ts.
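
The change amounts to three new set members. The sketch below shows the idea together with a small drift check; the existing members of the set are elided, and `missingFromStreamingSet` and `providersConfiguredForStreaming` are hypothetical names that do not appear in the PR.

```typescript
// Illustrative only: the real set in model-capabilities.ts contains other providers too.
const STREAMING_ENDPOINT_PROVIDERS: ReadonlySet<string> = new Set([
  // ...existing providers elided...
  "opencode-zen",
  "xiaomi",
  "nvidia",
]);

// Hypothetical helper: lists providers that are expected to stream but are
// absent from the allowlist, which is the situation this PR corrects.
function missingFromStreamingSet(
  providers: readonly string[],
  allowlist: ReadonlySet<string>,
): string[] {
  return providers.filter((p) => !allowlist.has(p));
}

// Assumed list of providers whose endpoint config already requests stream usage.
const providersConfiguredForStreaming = ["opencode-zen", "xiaomi", "nvidia"];

const missing = missingFromStreamingSet(
  providersConfiguredForStreaming,
  STREAMING_ENDPOINT_PROVIDERS,
);
```

A check along these lines could run in a unit test so that a provider added to one table but not the other is caught before it silently falls out of streaming tiers.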

Related

  • A PR to add the nvidia provider to modelparams.dev: mnfst/modelparams.dev#48
  • PROVIDER_ENDPOINTS entries for xiaomi-subscription (L246-250), opencode-zen (L419-424), and nvidia (L267-272) in provider-endpoints.ts

Summary by cubic

Add opencode-zen, xiaomi, and nvidia to STREAMING_ENDPOINT_PROVIDERS so streaming tiers route correctly instead of being dropped by response-mode-guard. All three support SSE via OpenAI‑compatible chat completions; no other changes.

Written for commit 14eb380. Summary will update on new commits.

The `response-mode-guard` silently drops routes from providers not in
`STREAMING_ENDPOINT_PROVIDERS` when the tier's response mode is
'stream'. This causes complex-tier fallback chains to skip
opencode-zen (MiniMax M3) and xiaomi (MiMo-V2.5-Pro) entirely,
promoting the second or third fallback to primary without any log
output or user-visible indication.

All three providers expose OpenAI-compatible SSE streaming:

- opencode-zen: opencode.ai/zen/v1/chat/completions
- xiaomi: token-plan-{region}.xiaomimimo.com/v1/chat/completions
- nvidia: integrate.api.nvidia.com/v1/chat/completions

Their `PROVIDER_ENDPOINTS` entries already include
`...openaiStreamUsage`, indicating the proxy is already configured to
stream from them.
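
For context, OpenAI-compatible SSE streaming delivers `data:` lines carrying JSON chunks and ends with `data: [DONE]`. The following is a minimal parsing sketch, assuming the input contains only complete lines; `parseSseChunk` is a hypothetical helper and the sample payloads are made up for illustration, not captured from any of the three providers.

```typescript
// Minimal shape of a streamed chat completion chunk (only the fields used here).
interface StreamChunk {
  choices?: { delta?: { content?: string } }[];
}

// Extracts content deltas from a block of SSE text made of complete lines.
function parseSseChunk(raw: string): { deltas: string[]; done: boolean } {
  const deltas: string[] = [];
  let done = false;
  for (const line of raw.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank lines and SSE comments
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload) as StreamChunk;
    const content = chunk.choices?.[0]?.delta?.content;
    if (content) deltas.push(content);
  }
  return { deltas, done };
}

// Made-up sample: two content chunks, a usage-only chunk, then the terminator.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "",
  'data: {"choices":[],"usage":{"total_tokens":3}}',
  "",
  "data: [DONE]",
  "",
].join("\n");

const parsed = parseSseChunk(sample);
```

The usage-only chunk with an empty `choices` array is what a server sends last when `stream_options.include_usage` is requested, so the parser tolerates it rather than treating it as an error.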

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 1 file
