Fix/506 session bootstrap fails #551
Conversation
Possibly related PRs:
- …eer search: fixes plastic-labs#520 by effectively allowing global workspace scope for semantic search when using pgvector. Throws an explicit ValidationException if the workspace is paired with external vector databases.
- …resolve too_long limit plastic-labs#506
Walkthrough: This PR adds a `max_messages` parameter to session context retrieval, capping the number of recent messages returned.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as SDK Client
    participant API as API Endpoint
    participant Summarizer as Summarizer Util
    participant CRUD as Message CRUD
    participant DB as Database
    Client->>API: GET /context?max_messages=10
    API->>Summarizer: get_session_context(..., max_messages=10)
    Summarizer->>CRUD: get_messages_id_range(..., max_messages=10)
    CRUD->>DB: SELECT ... ORDER BY id DESC LIMIT 10
    DB-->>CRUD: Recent 10 messages
    CRUD->>DB: Wrap in subquery, ORDER BY id ASC
    DB-->>CRUD: 10 messages in chronological order
    CRUD-->>Summarizer: [Message, ...]
    Summarizer-->>API: SessionContext with truncated messages
    API-->>Client: 200 OK {context: {...}, messages: [...]}
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (inconclusive)
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/crud/document.py (1)
338-339: Consider clarifying when observer/observed are required.
The docstring indicates these are optional but doesn't mention they're required for external vector stores. A clarifying note would help API consumers.
Suggested docstring update
```diff
- observer: Name of the observing peer
- observed: Name of the observed peer
+ observer: Name of the observing peer (optional for pgvector, required for external stores)
+ observed: Name of the observed peer (optional for pgvector, required for external stores)
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/crud/document.py` around lines 338-339, update the docstring for the parameters observer and observed to state clearly when they are optional and when they are required (e.g., note that they are optional for internal usage but required when persisting to external vector stores or when using cross-peer observation features); mention expected format/constraints for observer and observed values and add a brief example or sentence indicating that callers integrating with external vector stores must supply both fields. Locate the docstring that lists "observer: Name of the observing peer" and "observed: Name of the observed peer" and augment it with this clarifying note so API consumers understand the requirement and any format expectations.

src/routers/conclusions.py (1)
9-9: Importing private function breaks encapsulation and creates duplicate validation.
`_uses_pgvector` is prefixed with `_`, indicating it's internal to `crud/document.py`. Importing it here creates coupling to implementation details and duplicates the validation that already exists in `crud.query_documents()` at lines 385-388. Consider either:

- Remove the validation here entirely and let `crud.query_documents()` handle it (simpler, single source of truth)
- Or expose a public function like `uses_pgvector()` if the router genuinely needs this check

Option 1: Remove duplicate validation (preferred)
```diff
-from src.crud.document import _uses_pgvector
-    if not _uses_pgvector():
-        if not observer or not observed:
-            raise ValidationException(
-                "observer and observed must be specified for semantic search on external vector stores"
-            )
     documents = await crud.query_documents(
```

Also applies to: 110-114
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/conclusions.py` at line 9, The router imports and calls the internal helper _uses_pgvector, duplicating validation that query_documents already performs; remove the import "from src.crud.document import _uses_pgvector" and delete the duplicate pgvector validation in routers/conclusions.py (including the checks around lines referenced, e.g., the block at 110-114) so query_documents remains the single source of truth; if the router truly needs to perform this check itself, instead add a public function uses_pgvector() in src.crud.document, import that public symbol, and use it (do not import or call any names prefixed with an underscore).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ded8a4e1-79e2-4616-b1ec-e02c7722e4e5
⛔ Files ignored due to path filters (1)
`sdks/typescript/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (10)
- sdks/python/src/honcho/aio.py
- sdks/python/src/honcho/session.py
- sdks/typescript/src/session.ts
- sdks/typescript/src/types/api.ts
- sdks/typescript/src/validation.ts
- src/crud/document.py
- src/crud/message.py
- src/routers/conclusions.py
- src/routers/sessions.py
- src/utils/summarizer.py
```diff
 summary, messages = await _get_session_context_task(
-    db, workspace_id, session_id, token_limit, include_summary
+    db, workspace_id, session_id, token_limit, include_summary, max_messages=max_messages
 )
```
Runtime break: unexpected keyword argument in no-peer context path.
Line 688 passes `max_messages` into `_get_session_context_task`, but that helper does not accept this argument. This will raise `TypeError` whenever `peer_target` is not provided.
🐛 Proposed fix

```diff
 async def _get_session_context_task(
     db: AsyncSession,
     workspace_id: str,
     session_id: str,
     token_limit: int,
     include_summary: bool,
+    max_messages: int | None = None,
 ) -> tuple[schemas.Summary | None, list[schemas.Message]]:
@@
     summary, messages = await summarizer.get_session_context(
         db,
         workspace_name=workspace_id,
         session_name=session_id,
         token_limit=token_limit,
         include_summary=include_summary,
+        max_messages=max_messages,
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/routers/sessions.py` around lines 687 - 689, Call to
_get_session_context_task passes an unexpected keyword max_messages causing a
TypeError in the no-peer path; either remove the max_messages argument from the
call site or update the helper _get_session_context_task to accept max_messages:
add a parameter max_messages: Optional[int]=None to _get_session_context_task
(and propagate it into its logic) so both callers work, or conditionally include
max_messages only when peer_target is provided; reference
_get_session_context_task and the call site where summary, messages = await
_get_session_context_task(...) to locate and fix the mismatch.
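As a standalone illustration of the failure mode above (the function names below are simplified stand-ins, not the actual Honcho helpers): passing a keyword that a function does not declare raises `TypeError`, and adding a defaulted parameter keeps the old call site working.

```python
# Simplified stand-in for the helper before the fix.
def get_context_task(db, token_limit, include_summary):
    return (token_limit, include_summary)

# Calling with an undeclared keyword raises TypeError at the call site.
try:
    get_context_task("db", 2048, True, max_messages=10)
    raised = False
except TypeError:
    raised = True

# The fix: declare max_messages with a default so both callers work.
def get_context_task_fixed(db, token_limit, include_summary, max_messages=None):
    return (token_limit, include_summary, max_messages)

old_call = get_context_task_fixed("db", 2048, True)               # no-peer path
new_call = get_context_task_fixed("db", 2048, True, max_messages=10)
print(raised, old_call, new_call)
```

This is why the proposed fix threads the new parameter through the helper's signature rather than only changing the call site.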
Hey @Pamf1973 — thanks for looking into this! A few things need attention before this can move forward:
… max_messages test
🧹 Nitpick comments (1)
tests/routes/test_sessions.py (1)
868-872: Assert the full returned window, not only the endpoints.
This regression is specifically about returning the most recent messages in chronological order. Checking only the first and last message can miss duplicates, gaps, or interior reordering.
Strengthen the ordering assertion
```diff
 assert "messages" in data
 assert len(data["messages"]) == 10
-
-assert data["messages"][0]["content"] == "Test message 100"
-assert data["messages"][-1]["content"] == "Test message 109"
+
+returned_contents = [message["content"] for message in data["messages"]]
+assert returned_contents == [
+    f"Test message {i}" for i in range(100, 110)
+]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/routes/test_sessions.py` around lines 868 - 872, The test only checks endpoints and can miss duplicates/gaps; update the assertion to verify the entire returned window is exactly the expected chronological slice. Construct the expected list of message contents for the window (e.g. "Test message 100" through "Test message 109") and assert data["messages"] equals that sequence (or assert [m["content"] for m in data["messages"]] == expected_contents) so the test validates ordering, no duplicates, and no missing messages; use the existing data["messages"] and its "content" field to perform this full equality check.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0c7f1298-32c0-4a0a-8ee6-69255de7cc78
📒 Files selected for processing (1)
tests/routes/test_sessions.py
Here is exactly what I did: reverted document.py and conclusions.py by checking out both files from main, removing the out-of-scope observer/observed changes so this PR stays strictly focused on the max_messages fix.
Thanks for the contribution @Pamf1973! We looked into this and don't think this change is needed. `Session.context()` already accepts a `tokens` parameter that caps the response at the database level; that's the intended mechanism for this. The issue pointed to `LimitSchema` in the TS SDK as the culprit, but that's only used for `.search()` result pagination and is a completely separate code path from context retrieval. Adding `max_messages` as a secondary cap overlaps with what token budgets already handle. Going to close this along with #506. Appreciate you digging in!
Resolves #506
The Problem: Currently, session bootstrap fails with a too_long error when a session context attempts to retrieve more than ~100 messages, tripping an internal schema list-length limit. This causes clients like OpenClaw to crash when resuming long-running sessions, effectively locking developers out of recovering historical context.
The Solution: This PR adds a safety valve in the form of an optional max_messages parameter on session.context(), letting clients cap the number of recent messages fetched during context retrieval. The backend truncates the conversation history at the database level before validation can fail, so sessions of any size remain fast, stable, and retrievable.
Technical Changes
Backend Persistence: Updated crud.get_messages_id_range() in src/crud/message.py to accept the max_messages parameter. The query is wrapped in a descending LIMIT subquery that is then re-sorted ascending, so exactly the most recent requested window is returned in chronological order. Plumbed max_messages through _get_session_context_task and the FastAPI route configuration.
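The descending-limit-then-ascending pattern described above can be sketched with plain SQL. This uses sqlite3 as a stand-in database; the table and column names are illustrative, not the actual Honcho schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO messages (content) VALUES (?)",
    [(f"Test message {i}",) for i in range(120)],
)

# Inner query grabs the N most recent rows (ORDER BY id DESC LIMIT ?);
# the outer query re-sorts that window ascending so callers receive
# the messages in chronological order.
max_messages = 10
rows = conn.execute(
    """
    SELECT id, content FROM (
        SELECT id, content FROM messages ORDER BY id DESC LIMIT ?
    ) AS recent
    ORDER BY id ASC
    """,
    (max_messages,),
).fetchall()

print(rows[0][1], "...", rows[-1][1])
```

Limiting in the database like this means the application never materializes the full history, which is what keeps long sessions from tripping downstream list-length limits.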
TypeScript SDK: Updated ContextParamsSchema (sdks/typescript/src/validation.ts) to validate the numeric maxMessages option, added it to SessionContextParams, and bound it into the TypeScript Session.context() mapping logic.
Python SDK: Mirrored the max_messages query-param integration in sdks/python/src/honcho/session.py and the async bindings in sdks/python/src/honcho/aio.py.
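As a sketch of the client-side constraint (the PR documents a minimum of 1 message), a hypothetical validator mirroring what the SDK schemas enforce might look like this; the helper name is illustrative, not the actual SDK API:

```python
def validate_max_messages(value):
    """Hypothetical guard: max_messages is optional, but must be an int >= 1."""
    if value is None:
        return None  # omitted: no message cap, the tokens budget alone applies
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError("max_messages must be an integer >= 1")
    return value

print(validate_max_messages(10))
print(validate_max_messages(None))
```

Rejecting invalid values before the request is sent gives callers an immediate error instead of a 422 from the API.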
Testing
✅ Compiled Python SDK and Backend cleanly.
✅ Compiled the TypeScript SDK cleanly with npm run build.
✅ Verified the database limits output natively via max_messages while still respecting the tokens rules.
Summary by CodeRabbit
New Features
Added a `max_messages` parameter to session context retrieval, enabling users to cap the number of recent messages included in context (minimum: 1 message). Available across the Python SDK, TypeScript SDK, and REST API.

Tests