fix(clients): harden structured-output fallback paths #577
Niko96-dotcom wants to merge 1 commit into plastic-labs:main from
Conversation
Walkthrough

This PR tightens the fact-extraction prompting to enforce stricter constraints on explicit fact identification, and adds a structured-output error recovery mechanism for OpenAI-compatible clients that retries failed parsing attempts using JSON repair fallback strategies.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as LLM Client
    participant API as OpenAI API
    participant Validator as JSON Validator
    participant Repairer as JSON Repairer
    Client->>API: chat.completions.parse(response_model=...)
    API-->>Client: Exception (ValidationError/ValueError)
    Client->>Client: _should_retry_with_structured_fallback(exc)?
    alt Eligible for Fallback
        Client->>Client: Log warning
        alt Custom Provider + Minimax
            Client->>Client: Remove response_format<br/>Add system prompt<br/>Clamp max_tokens
        else Other Providers
            Client->>Client: Set response_format<br/>to json_object
        end
        Client->>API: chat.completions.create(...)
        API-->>Client: Raw text response
        Client->>Repairer: validate_and_repair_json(content)
        Repairer-->>Client: Repaired JSON
        Client->>Validator: response_model.model_validate_json(repaired)
        Validator-->>Client: Parsed model instance
        Client->>Client: Return HonchoLLMCallResponse<br/>(fallback usage, empty tool_calls)
    else Not Eligible
        Client->>Client: Re-raise exception
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/utils/test_clients.py (1)
543-620: Assert the fallback request shape too.

These tests should lock down the request mutations that make the fallback safe: the generic fallback should set JSON mode, and the MiniMax branch should clamp `max_tokens` to `2000`.

Suggested additions
```diff
     assert isinstance(response.content, SampleTestModel)
     assert response.content.name == "Jane"
     mock_client.chat.completions.create.assert_called_once()
+    create_kwargs = mock_client.chat.completions.create.call_args.kwargs
+    assert create_kwargs["response_format"] == {"type": "json_object"}

 async def test_custom_minimax_prompt_representation_fallback_drops_response_format(self):
     from openai import AsyncOpenAI
@@
     assert "response_format" not in create_kwargs
+    assert create_kwargs["max_tokens"] == 2000
     assert create_kwargs["messages"][0]["role"] == "system"
     assert "Return only a single JSON object with key explicit" in create_kwargs["messages"][0]["content"]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/utils/test_clients.py` around lines 543 - 620, The tests are missing assertions that the fallback request shape is safe: when honcho_llm_call_inner falls back from parse error it must set JSON mode (e.g., include response_format or a message instructing "Return only a single JSON object") and for the MiniMax branch it must clamp max_tokens to 2000; update the two tests to inspect mock_client.chat.completions.create.call_args.kwargs from honcho_llm_call_inner and assert that (1) for the generic openai fallback the outgoing kwargs either include response_format="json" or the system/user message contains a clear "Return only a single JSON object" JSON instruction, and (2) for provider="custom" with OPENAI_COMPATIBLE_BASE_URL set to MiniMax, the create call uses max_tokens <= 2000 (i.e., clamped to 2000) and still drops response_format from kwargs as existing assertions expect.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0eb0ff57-e7f0-4257-ba11-2927a09b1306
📒 Files selected for processing (3)
- src/deriver/prompts.py
- src/utils/clients.py
- tests/utils/test_clients.py
```python
def _should_retry_with_structured_fallback(exc: Exception) -> bool:
    """Only fallback for parse/schema drift, not generic provider failures."""
    if isinstance(exc, (ValidationError, ValueError, json.JSONDecodeError, TypeError, AttributeError)):
        return True
    return False
```
Narrow the fallback classifier to parse/schema errors.
TypeError and AttributeError can be raised by local post-parse processing in the later try block, so this can mask runtime bugs by issuing a second LLM call instead of failing normally.
Suggested fix
```diff
 def _should_retry_with_structured_fallback(exc: Exception) -> bool:
     """Only fallback for parse/schema drift, not generic provider failures."""
-    if isinstance(exc, (ValidationError, ValueError, json.JSONDecodeError, TypeError, AttributeError)):
-        return True
-    return False
+    return isinstance(exc, (ValidationError, ValueError, json.JSONDecodeError))
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-def _should_retry_with_structured_fallback(exc: Exception) -> bool:
-    """Only fallback for parse/schema drift, not generic provider failures."""
-    if isinstance(exc, (ValidationError, ValueError, json.JSONDecodeError, TypeError, AttributeError)):
-        return True
-    return False
+def _should_retry_with_structured_fallback(exc: Exception) -> bool:
+    """Only fallback for parse/schema drift, not generic provider failures."""
+    return isinstance(exc, (ValidationError, ValueError, json.JSONDecodeError))
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/utils/clients.py` around lines 37 - 41, The fallback classifier in
_should_retry_with_structured_fallback is too broad (it currently includes
TypeError and AttributeError) and can hide runtime bugs by triggering a second
LLM call; update the function to only treat parse/schema-related errors as
retryable (e.g., keep ValidationError and json.JSONDecodeError and, if needed,
ValueError for parsing contexts) and remove TypeError and AttributeError from
the exception tuple so local post-parse runtime errors don't cause a structured
fallback.
```python
logger.warning(
    "Structured parse failed for %s/%s; retrying with JSON repair fallback: %s",
    provider,
    model,
    parse_exc,
)
```
Avoid logging raw parse exceptions from structured output.
parse_exc can include validation input or parsed user facts, which may leak sensitive deriver content into application logs. Log the exception type and provider/model context instead.
Suggested fix
```diff
 logger.warning(
-    "Structured parse failed for %s/%s; retrying with JSON repair fallback: %s",
+    "Structured parse failed for %s/%s; "
+    "retrying with JSON repair fallback (%s)",
     provider,
     model,
-    parse_exc,
+    type(parse_exc).__name__,
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/utils/clients.py` around lines 2048 - 2053, The current logger.warning
call logs the raw parse_exc which may contain sensitive parsed content; update
the log to record only the exception type and the provider/model context. In the
logger.warning invocation (logger.warning(...)) replace the parse_exc argument
with type(parse_exc).__name__ (or parse_exc.__class__.__name__) and remove any
raw exception message; keep provider and model in the message so the log shows
the exception type and context but not user-derived content.
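To see why logging the raw exception can leak content: in Pydantic v2, a `ValidationError`'s string representation embeds the offending `input_value`, which in this code path can be user-derived text, while the exception type name carries none of it. A minimal illustration (the `Facts` model and its field are hypothetical):

```python
from pydantic import BaseModel, ValidationError


class Facts(BaseModel):
    explicit: list[str]


try:
    # A string where a list is expected triggers a validation error.
    Facts.model_validate_json('{"explicit": "user lives in Berlin"}')
except ValidationError as exc:
    leaked = str(exc)          # embeds input_value='user lives in Berlin'
    safe = type(exc).__name__  # just "ValidationError"
    assert "Berlin" in leaked
    assert safe == "ValidationError"
```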
Summary
- PromptRepresentation instead of every response model

Test Plan
set -a && source /Users/nikolaymohr/honcho/.env && set +a && uv run pytest tests/utils/test_clients.py -q

Why
The current structured-output path is too brittle for messy OpenAI-compatible providers. This makes fallback behavior explicit instead of magical and reduces the odds of deriver batches dying on parse drift.
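As an illustration of what a JSON repair step can look like for messy providers, here is a hypothetical minimal sketch (`repair_json_sketch` is not the PR's `validate_and_repair_json`; the real helper may do more, such as fixing trailing commas or quoting):

```python
import json


def repair_json_sketch(raw: str) -> str:
    """Hypothetical minimal repair: trim to the outermost JSON object and re-check."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    candidate = raw[start:end + 1]
    json.loads(candidate)  # raises if the trimmed span is still invalid
    return candidate
```

This already handles the common drift cases where a model wraps valid JSON in markdown fences or conversational filler, e.g. `'```json\n{"explicit": []}\n```'` or `'Here you go: {"explicit": []}'`.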
Summary by CodeRabbit
Release Notes
Bug Fixes
Improvements