fix(clients): harden structured-output fallback paths#577

Open
Niko96-dotcom wants to merge 1 commit into plastic-labs:main from Niko96-dotcom:fix/provider-json-robustness

Conversation


@Niko96-dotcom Niko96-dotcom commented Apr 18, 2026

Summary

  • tighten the deriver prompt to require exact JSON output and forbid unstated general-knowledge leakage
  • add structured-output fallback after parse/schema drift for OpenAI-compatible providers
  • scope the MiniMax custom-provider fallback to PromptRepresentation instead of every response model
  • avoid retrying generic provider/runtime failures through the structured fallback path
  • add tests covering validation fallback, non-fallback runtime errors, and the MiniMax prompt-representation branch

Test Plan

  • set -a && source /Users/nikolaymohr/honcho/.env && set +a && uv run pytest tests/utils/test_clients.py -q

Why

The current structured-output path is too brittle for messy OpenAI-compatible providers. This makes fallback behavior explicit instead of magical and reduces the odds of deriver batches dying on parse drift.
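The fallback behavior this PR describes can be sketched as follows. This is a minimal illustration, not the code in src/utils/clients.py: the names `call_with_fallback` and `Extraction` are hypothetical, and the exception tuple follows the narrowed classifier discussed in the review comments below.

```python
import json

from pydantic import BaseModel, ValidationError


class Extraction(BaseModel):
    """Illustrative stand-in for the deriver's response model."""
    explicit: list[dict]


def should_retry_with_structured_fallback(exc: Exception) -> bool:
    """Only fall back on parse/schema drift, not generic provider failures."""
    return isinstance(exc, (ValidationError, ValueError, json.JSONDecodeError))


def call_with_fallback(parse_call, fallback_call, response_model):
    """Try structured parsing first; on parse drift, retry in plain-JSON mode."""
    try:
        return parse_call()
    except Exception as exc:
        if not should_retry_with_structured_fallback(exc):
            raise  # generic provider/runtime failures propagate unchanged
        raw = fallback_call()  # plain completion constrained to JSON output
        return response_model.model_validate_json(raw)
```

The key design point is the explicit classifier: only parse/schema errors earn a second LLM call, so a provider outage or a local bug still fails loudly instead of silently doubling request volume.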

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Added fallback retry mechanism to gracefully handle structured output parsing failures, improving system resilience.
    • Stricter fact extraction now recognizes only explicitly stated information, avoiding inferred facts.
  • Improvements

    • Enhanced output formatting constraints for consistency and reliability.

coderabbitai Bot commented Apr 18, 2026

Walkthrough

This PR enhances fact extraction prompting to enforce stricter constraints on explicit fact identification and adds a structured output error recovery mechanism for OpenAI-compatible clients that retries failed parsing attempts using JSON repair fallback strategies.

Changes

  • Prompt Refinement (src/deriver/prompts.py)
    Removed guidance allowing inference of facts from general knowledge; added a strict JSON output format requirement ({"explicit": [{"content": "..."}]}) and updated examples to treat inferred facts (e.g., "lives in NYC" vs. observed "was in NYC") as non-explicit.
  • Structured Output Error Handling (src/utils/clients.py)
    Added a _should_retry_with_structured_fallback() helper to classify parse/schema exceptions as fallback-eligible. Wrapped chat.completions.parse() in try/except; on failure, retries with a JSON repair fallback: for the MiniMax provider, removes response_format and adds a constraining system prompt; otherwise uses the standard json_object format. Repairs output via validate_and_repair_json and validates with response_model.model_validate_json.
  • Test Coverage (tests/utils/test_clients.py)
    Added tests for structured output error handling: non-eligible errors (e.g., RuntimeError) propagate without fallback, ValueError triggers the fallback path and returns a valid parsed model, and the MiniMax-specific fallback applies JSON repair and system prompt injection.
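The required output shape ({"explicit": [{"content": "..."}]}) can be expressed as a small Pydantic schema. The model names below are illustrative, not necessarily those defined in the repo:

```python
from pydantic import BaseModel, ValidationError


class ExplicitFact(BaseModel):
    content: str


class DeriverOutput(BaseModel):
    explicit: list[ExplicitFact]


# A compliant response parses cleanly:
good = DeriverOutput.model_validate_json('{"explicit": [{"content": "was in NYC"}]}')

# Schema drift (say, a wrong top-level key) raises ValidationError,
# which is the class of error the new fallback path is designed to catch:
drifted = False
try:
    DeriverOutput.model_validate_json('{"facts": []}')
except ValidationError:
    drifted = True
```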

Sequence Diagram

sequenceDiagram
    participant Client as LLM Client
    participant API as OpenAI API
    participant Validator as JSON Validator
    participant Repairer as JSON Repairer

    Client->>API: chat.completions.parse(response_model=...)
    API-->>Client: Exception (ValidationError/ValueError)
    
    Client->>Client: _should_retry_with_structured_fallback(exc)?
    alt Eligible for Fallback
        Client->>Client: Log warning
        alt Custom Provider + Minimax
            Client->>Client: Remove response_format<br/>Add system prompt<br/>Clamp max_tokens
        else Other Providers
            Client->>Client: Set response_format<br/>to json_object
        end
        Client->>API: chat.completions.create(...)
        API-->>Client: Raw text response
        Client->>Repairer: validate_and_repair_json(content)
        Repairer-->>Client: Repaired JSON
        Client->>Validator: response_model.model_validate_json(repaired)
        Validator-->>Client: Parsed model instance
        Client->>Client: Return HonchoLLMCallResponse<br/>(fallback usage, empty tool_calls)
    else Not Eligible
        Client->>Client: Re-raise exception
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • VVoruganti

Poem

🐰 A fallback so clever, a JSON repair,
When parsing goes awry, we handle with care,
Explicit facts only, no guesses allowed,
Structured outputs fixed—the code wears its shroud! 🎩

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 42.86%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title 'fix(clients): harden structured-output fallback paths' directly addresses the main change: adding robustness to structured-output fallback handling in the clients module.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/utils/test_clients.py (1)

543-620: Assert the fallback request shape too.

These tests should lock down the request mutations that make the fallback safe: generic fallback should set JSON mode, and the MiniMax branch should clamp max_tokens to 2000.

Suggested additions
         assert isinstance(response.content, SampleTestModel)
         assert response.content.name == "Jane"
         mock_client.chat.completions.create.assert_called_once()
+        create_kwargs = mock_client.chat.completions.create.call_args.kwargs
+        assert create_kwargs["response_format"] == {"type": "json_object"}
 
     async def test_custom_minimax_prompt_representation_fallback_drops_response_format(self):
         from openai import AsyncOpenAI
@@
         assert "response_format" not in create_kwargs
+        assert create_kwargs["max_tokens"] == 2000
         assert create_kwargs["messages"][0]["role"] == "system"
         assert "Return only a single JSON object with key explicit" in create_kwargs["messages"][0]["content"]
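The suggested assertions rely on inspecting call_args.kwargs on the mocked client. In isolation the pattern looks like this minimal, self-contained sketch (a stand-in mock, not the actual test code):

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

# Stand-in for the mocked OpenAI client used in the tests (illustrative only).
mock_client = MagicMock()
mock_client.chat.completions.create = AsyncMock(return_value=MagicMock())

# Pretend the code under test issued the fallback request:
asyncio.run(mock_client.chat.completions.create(
    messages=[{"role": "user", "content": "hi"}],
    response_format={"type": "json_object"},
    max_tokens=2000,
))

# The pattern the review suggests: lock down the outgoing request shape.
mock_client.chat.completions.create.assert_called_once()
create_kwargs = mock_client.chat.completions.create.call_args.kwargs
assert create_kwargs["response_format"] == {"type": "json_object"}
assert create_kwargs["max_tokens"] == 2000
```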
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/utils/test_clients.py` around lines 543 - 620, The tests are missing
assertions that the fallback request shape is safe: when honcho_llm_call_inner
falls back from parse error it must set JSON mode (e.g., include response_format
or a message instructing "Return only a single JSON object") and for the MiniMax
branch it must clamp max_tokens to 2000; update the two tests to inspect
mock_client.chat.completions.create.call_args.kwargs from honcho_llm_call_inner
and assert that (1) for the generic openai fallback the outgoing kwargs either
include response_format="json" or the system/user message contains a clear
"Return only a single JSON object" JSON instruction, and (2) for
provider="custom" with OPENAI_COMPATIBLE_BASE_URL set to MiniMax, the create
call uses max_tokens <= 2000 (i.e., clamped to 2000) and still drops
response_format from kwargs as existing assertions expect.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0eb0ff57-e7f0-4257-ba11-2927a09b1306

📥 Commits

Reviewing files that changed from the base of the PR and between 9676526 and e874119.

📒 Files selected for processing (3)
  • src/deriver/prompts.py
  • src/utils/clients.py
  • tests/utils/test_clients.py

Comment thread src/utils/clients.py
Comment on lines +37 to +41
def _should_retry_with_structured_fallback(exc: Exception) -> bool:
    """Only fallback for parse/schema drift, not generic provider failures."""
    if isinstance(exc, (ValidationError, ValueError, json.JSONDecodeError, TypeError, AttributeError)):
        return True
    return False

⚠️ Potential issue | 🟠 Major

Narrow the fallback classifier to parse/schema errors.

TypeError and AttributeError can be raised by local post-parse processing in the later try block, so this can mask runtime bugs by issuing a second LLM call instead of failing normally.

Suggested fix
 def _should_retry_with_structured_fallback(exc: Exception) -> bool:
     """Only fallback for parse/schema drift, not generic provider failures."""
-    if isinstance(exc, (ValidationError, ValueError, json.JSONDecodeError, TypeError, AttributeError)):
-        return True
-    return False
+    return isinstance(exc, (ValidationError, ValueError, json.JSONDecodeError))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/clients.py` around lines 37 - 41, The fallback classifier in
_should_retry_with_structured_fallback is too broad (it currently includes
TypeError and AttributeError) and can hide runtime bugs by triggering a second
LLM call; update the function to only treat parse/schema-related errors as
retryable (e.g., keep ValidationError and json.JSONDecodeError and, if needed,
ValueError for parsing contexts) and remove TypeError and AttributeError from
the exception tuple so local post-parse runtime errors don't cause a structured
fallback.

Comment thread src/utils/clients.py
Comment on lines +2048 to +2053
logger.warning(
    "Structured parse failed for %s/%s; retrying with JSON repair fallback: %s",
    provider,
    model,
    parse_exc,
)

⚠️ Potential issue | 🟠 Major

Avoid logging raw parse exceptions from structured output.

parse_exc can include validation input or parsed user facts, which may leak sensitive deriver content into application logs. Log the exception type and provider/model context instead.

Suggested fix
                     logger.warning(
-                        "Structured parse failed for %s/%s; retrying with JSON repair fallback: %s",
+                        "Structured parse failed for %s/%s; "
+                        "retrying with JSON repair fallback (%s)",
                         provider,
                         model,
-                        parse_exc,
+                        type(parse_exc).__name__,
                     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/clients.py` around lines 2048 - 2053, The current logger.warning
call logs the raw parse_exc which may contain sensitive parsed content; update
the log to record only the exception type and the provider/model context. In the
logger.warning invocation (logger.warning(...)) replace the parse_exc argument
with type(parse_exc).__name__ (or parse_exc.__class__.__name__) and remove any
raw exception message; keep provider and model in the message so the log shows
the exception type and context but not user-derived content.
