Skip to content

Fix/506 session bootstrap fails#551

Closed
Pamf1973 wants to merge 3 commits intoplastic-labs:mainfrom
Pamf1973:fix/506-session-bootstrap-fails
Closed

Fix/506 session bootstrap fails#551
Pamf1973 wants to merge 3 commits intoplastic-labs:mainfrom
Pamf1973:fix/506-session-bootstrap-fails

Conversation

@Pamf1973
Copy link
Copy Markdown

@Pamf1973 Pamf1973 commented Apr 11, 2026

Resolves #506

The Problem: Currently, session bootstrap fails with a too_long error when a session context attempts to retrieve more than ~100 messages (triggering internal schema list limits inside the underlying structure). This causes clients like OpenClaw to crash when resuming long-running sessions, effectively locking developers out of recovering historical context.

The Solution: This PR establishes a secure "safety valve" by introducing a declarative max_messages parameter to session.context(). This allows clients to cap the exact number of recent messages fetched during context retrieval. The backend intelligently truncates the conversation history at the database level before anything can crash validation, ensuring sessions of any size remain fast, stable, and easily retrievable.

Technical Changes
Backend Persistence: Updated crud.get_messages_id_range() in src/crud/message.py to accept the max_messages parameter. This securely wraps the query in a descending limit subquery mapped ascending, ensuring exactly the most recent requested window is returned chronologically. Plumbed max_messages cleanly backward through the _get_session_context_task and fastAPI routing configurations.
TypeScript SDK: Updated ContextParamsSchema (sdks/typescript/src/validation.ts) to validate numeric options (maxMessages) correctly. Integrated into SessionContextParams and seamlessly bound it into the TypeScript Session.context() mapping logic.
Python SDK: Mirrored the max_messages query param integration accurately onto sdks/python/src/honcho/session.py and the async bindings identically in sdks/python/src/honcho/aio.py.

Testing
✅ Compiled Python SDK and Backend cleanly.
✅ Compiled TypeScript SDK npm run build cleanly.
✅ The database properly limits output based natively on max_messages while perfectly respecting tokens rules.

Summary by CodeRabbit

  • New Features

    • Added optional max_messages parameter to session context retrieval, enabling users to cap the number of recent messages included in context (minimum: 1 message). Available across Python SDK, TypeScript SDK, and REST API.
  • Tests

    • Added test coverage for message limiting functionality in session context queries.

…eer search

Fixes plastic-labs#520 by effectively allowing global workspace scope for semantic search when using pgvector. Throws an explicit ValidationException if the workspace is paired with external vector databases.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 11, 2026

Walkthrough

This PR adds a max_messages parameter across Python and TypeScript SDKs and the backend API, enabling clients to limit the number of messages returned during session context retrieval. The parameter flows from API endpoints through utility functions to the database query layer with proper validation and ordering logic.

Changes

Cohort / File(s) Summary
Python SDK Async & Sync Clients
sdks/python/src/honcho/aio.py, sdks/python/src/honcho/session.py
Added optional max_messages: int | None parameter (with ge=1 constraint) to both SessionAio.context() and Session.context(), wired into query building when not None.
TypeScript SDK Type Definitions & Validation
sdks/typescript/src/types/api.ts, sdks/typescript/src/validation.ts
Added max_messages?: number to SessionContextParams interface and maxMessages field to ContextParamsSchema validation with minimum value of 1.
TypeScript SDK Session Method
sdks/typescript/src/session.ts
Extended Session.context(options) with maxMessages?: number option, parsed into contextParams, and forwarded to internal _getContext method as max_messages query parameter.
Backend Message Retrieval & Routing
src/crud/message.py, src/routers/sessions.py, src/utils/summarizer.py
Implemented max_messages parameter in get_messages_id_range to cap results by ordering descending, applying LIMIT, then wrapping in subquery for ascending order; updated session context endpoint and summarizer utility to accept and forward max_messages.
Tests
tests/routes/test_sessions.py
Added test_get_session_context_with_max_messages verifying correct truncation and message ordering when max_messages=10 is specified.

Sequence Diagram

sequenceDiagram
    participant Client as SDK Client
    participant API as API Endpoint
    participant Summarizer as Summarizer Util
    participant CRUD as Message CRUD
    participant DB as Database

    Client->>API: GET /context?max_messages=10
    API->>Summarizer: get_session_context(..., max_messages=10)
    Summarizer->>CRUD: get_messages_id_range(..., max_messages=10)
    CRUD->>DB: SELECT ... ORDER BY id DESC LIMIT 10
    DB-->>CRUD: Recent 10 messages
    CRUD->>DB: Wrap in subquery, ORDER BY id ASC
    DB-->>CRUD: 10 messages in chronological order
    CRUD-->>Summarizer: [Message, ...]
    Summarizer-->>API: SessionContext with truncated messages
    API-->>Client: 200 OK {context: {...}, messages: [...]}
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested reviewers

  • VVoruganti
  • dr-frmr
  • Rajat-Ahuja1997

Poem

🐰 A hop, skip, and query away,
Our bunny SDKs now say:
"Max messages, please constrain!"
No more overflows of pain,
Context flows in bounded way! 🎯

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Fix/506 session bootstrap fails' references issue #506 but does not clearly describe the main change (adding max_messages parameter to prevent bootstrap failures). Consider revising the title to be more descriptive, such as 'Add max_messages parameter to session.context() to fix bootstrap failures' or similar, so it clearly explains what the fix does.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed The PR successfully implements all coding requirements from #506: adds max_messages parameter to session.context() across Python/TypeScript SDKs and backend, implements server-side message limiting in crud.get_messages_id_range(), integrates through FastAPI routes and all client SDKs, and includes a test validating max_messages behavior.
Out of Scope Changes check ✅ Passed Based on the raw_summary provided, all changes are directly related to implementing the max_messages feature for #506 across backend, TypeScript SDK, Python SDK, and tests. No out-of-scope changes are evident in the file summaries.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/crud/document.py (1)

338-339: Consider clarifying when observer/observed are required.

The docstring indicates these are optional but doesn't mention they're required for external vector stores. A clarifying note would help API consumers.

Suggested docstring update
-        observer: Name of the observing peer
-        observed: Name of the observed peer
+        observer: Name of the observing peer (optional for pgvector, required for external stores)
+        observed: Name of the observed peer (optional for pgvector, required for external stores)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/crud/document.py` around lines 338 - 339, Update the docstring in
src/crud/document.py for the parameters observer and observed to state clearly
when they are optional and when they are required (e.g., note that they are
optional for internal usage but required when persisting to external vector
stores or when using cross-peer observation features); mention expected
format/constraints for observer and observed values and add a brief example or
sentence indicating that callers integrating with external vector stores must
supply both fields. Locate the docstring that lists "observer: Name of the
observing peer" and "observed: Name of the observed peer" and augment it with
this clarifying note so API consumers understand the requirement and any format
expectations.
src/routers/conclusions.py (1)

9-9: Importing private function breaks encapsulation and creates duplicate validation.

_uses_pgvector is prefixed with _, indicating it's internal to crud/document.py. Importing it here creates coupling to implementation details and duplicates the validation that already exists in crud.query_documents() at lines 385-388.

Consider either:

  1. Remove the validation here entirely and let crud.query_documents() handle it (simpler, single source of truth)
  2. Or expose a public function like uses_pgvector() if the router genuinely needs this check
Option 1: Remove duplicate validation (preferred)
-from src.crud.document import _uses_pgvector
-    if not _uses_pgvector():
-        if not observer or not observed:
-            raise ValidationException(
-                "observer and observed must be specified for semantic search on external vector stores"
-            )
-
     documents = await crud.query_documents(

Also applies to: 110-114

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/conclusions.py` at line 9, The router imports and calls the
internal helper _uses_pgvector, duplicating validation that query_documents
already performs; remove the import "from src.crud.document import
_uses_pgvector" and delete the duplicate pgvector validation in
routers/conclusions.py (including the checks around lines referenced, e.g., the
block at 110-114) so query_documents remains the single source of truth; if the
router truly needs to perform this check itself, instead add a public function
uses_pgvector() in src.crud.document, import that public symbol, and use it (do
not import or call any names prefixed with an underscore).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/sessions.py`:
- Around line 687-689: Call to _get_session_context_task passes an unexpected
keyword max_messages causing a TypeError in the no-peer path; either remove the
max_messages argument from the call site or update the helper
_get_session_context_task to accept max_messages: add a parameter max_messages:
Optional[int]=None to _get_session_context_task (and propagate it into its
logic) so both callers work, or conditionally include max_messages only when
peer_target is provided; reference _get_session_context_task and the call site
where summary, messages = await _get_session_context_task(...) to locate and fix
the mismatch.

---

Nitpick comments:
In `@src/crud/document.py`:
- Around line 338-339: Update the docstring in src/crud/document.py for the
parameters observer and observed to state clearly when they are optional and
when they are required (e.g., note that they are optional for internal usage but
required when persisting to external vector stores or when using cross-peer
observation features); mention expected format/constraints for observer and
observed values and add a brief example or sentence indicating that callers
integrating with external vector stores must supply both fields. Locate the
docstring that lists "observer: Name of the observing peer" and "observed: Name
of the observed peer" and augment it with this clarifying note so API consumers
understand the requirement and any format expectations.

In `@src/routers/conclusions.py`:
- Line 9: The router imports and calls the internal helper _uses_pgvector,
duplicating validation that query_documents already performs; remove the import
"from src.crud.document import _uses_pgvector" and delete the duplicate pgvector
validation in routers/conclusions.py (including the checks around lines
referenced, e.g., the block at 110-114) so query_documents remains the single
source of truth; if the router truly needs to perform this check itself, instead
add a public function uses_pgvector() in src.crud.document, import that public
symbol, and use it (do not import or call any names prefixed with an
underscore).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ded8a4e1-79e2-4616-b1ec-e02c7722e4e5

📥 Commits

Reviewing files that changed from the base of the PR and between 58f9abb and 5ef2ee0.

⛔ Files ignored due to path filters (1)
  • sdks/typescript/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (10)
  • sdks/python/src/honcho/aio.py
  • sdks/python/src/honcho/session.py
  • sdks/typescript/src/session.ts
  • sdks/typescript/src/types/api.ts
  • sdks/typescript/src/validation.ts
  • src/crud/document.py
  • src/crud/message.py
  • src/routers/conclusions.py
  • src/routers/sessions.py
  • src/utils/summarizer.py

Comment thread src/routers/sessions.py
Comment on lines 687 to 689
summary, messages = await _get_session_context_task(
db, workspace_id, session_id, token_limit, include_summary
db, workspace_id, session_id, token_limit, include_summary, max_messages=max_messages
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Runtime break: unexpected keyword argument in no-peer context path.

Line 688 passes max_messages into _get_session_context_task, but that helper does not accept this argument. This will raise TypeError whenever peer_target is not provided.

🐛 Proposed fix
 async def _get_session_context_task(
     db: AsyncSession,
     workspace_id: str,
     session_id: str,
     token_limit: int,
     include_summary: bool,
+    max_messages: int | None = None,
 ) -> tuple[schemas.Summary | None, list[schemas.Message]]:
@@
     summary, messages = await summarizer.get_session_context(
         db,
         workspace_name=workspace_id,
         session_name=session_id,
         token_limit=token_limit,
         include_summary=include_summary,
+        max_messages=max_messages,
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/sessions.py` around lines 687 - 689, Call to
_get_session_context_task passes an unexpected keyword max_messages causing a
TypeError in the no-peer path; either remove the max_messages argument from the
call site or update the helper _get_session_context_task to accept max_messages:
add a parameter max_messages: Optional[int]=None to _get_session_context_task
(and propagate it into its logic) so both callers work, or conditionally include
max_messages only when peer_target is provided; reference
_get_session_context_task and the call site where summary, messages = await
_get_session_context_task(...) to locate and fix the mismatch.

@ajspig
Copy link
Copy Markdown
Contributor

ajspig commented Apr 14, 2026

Hey @Pamf1973 — thanks for looking into this! A few things need attention before this can move forward:

  1. Out-of-scope changes: The document.py and conclusions.py changes (making observer/observed optional for pgvector queries) are unrelated to the max_messages feature and Session bootstrap fails with too_long when session exceeds ~100 messages #506. That's a separate concern being tracked in Cross-peer semantic search for conclusions #520, and @VVoruganti has flagged architectural considerations around it. Please remove those changes and keep this PR focused on max_messages only.
  2. package-lock.json: The +245 line diff shouldn't be in here — that's likely from running npm install locally.
  3. Testing: The PR description says it compiles cleanly but doesn't include actual test cases. Could you add a test that exercises max_messages on a session with >100 messages to validate the fix against Session bootstrap fails with too_long when session exceeds ~100 messages #506?

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/routes/test_sessions.py (1)

868-872: Assert the full returned window, not only the endpoints.

This regression is specifically about returning the most recent messages in chronological order. Checking only the first and last message can miss duplicates, gaps, or interior reordering.

Strengthen the ordering assertion
     assert "messages" in data
     assert len(data["messages"]) == 10
-    
-    assert data["messages"][0]["content"] == "Test message 100"
-    assert data["messages"][-1]["content"] == "Test message 109"
+
+    returned_contents = [message["content"] for message in data["messages"]]
+    assert returned_contents == [
+        f"Test message {i}" for i in range(100, 110)
+    ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routes/test_sessions.py` around lines 868 - 872, The test only checks
endpoints and can miss duplicates/gaps; update the assertion to verify the
entire returned window is exactly the expected chronological slice. Construct
the expected list of message contents for the window (e.g. "Test message 100"
through "Test message 109") and assert data["messages"] equals that sequence (or
assert [m["content"] for m in data["messages"]] == expected_contents) so the
test validates ordering, no duplicates, and no missing messages; use the
existing data["messages"] and its "content" field to perform this full equality
check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/routes/test_sessions.py`:
- Around line 868-872: The test only checks endpoints and can miss
duplicates/gaps; update the assertion to verify the entire returned window is
exactly the expected chronological slice. Construct the expected list of message
contents for the window (e.g. "Test message 100" through "Test message 109") and
assert data["messages"] equals that sequence (or assert [m["content"] for m in
data["messages"]] == expected_contents) so the test validates ordering, no
duplicates, and no missing messages; use the existing data["messages"] and its
"content" field to perform this full equality check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0c7f1298-32c0-4a0a-8ee6-69255de7cc78

📥 Commits

Reviewing files that changed from the base of the PR and between 5ef2ee0 and ef94f78.

📒 Files selected for processing (1)
  • tests/routes/test_sessions.py

@Pamf1973
Copy link
Copy Markdown
Author

Here is exactly what I did:

Reverted document.py and conclusions.py: I checked out both files from main to remove the out-of-scope observer/observed changes, ensuring this PR stays strictly focused on the max_messages fix.
Removed package-lock.json: I deleted the sdks/typescript/package-lock.json file which introduced the +245 line diff originally spawned by a local npm install.
Added Test Case: I added a new test test_get_session_context_with_max_messages in tests/routes/test_sessions.py which:
Sets up a session.
Pushes 110 messages sequentially.
Validates that max_messages=10 query successfully constrains the returned result down to 10 context messages while keeping chronological ordering for the most recent items.

@ajspig
Copy link
Copy Markdown
Contributor

ajspig commented Apr 28, 2026

Thanks for the contribution @Pamf1973! We looked into this and don't think this change is needed.
The too_long error from #506 doesn't originate in Honcho. There's no such error anywhere in the codebase. It was almost certainly an LLM provider error (context window exceeded) on the consumer side. We tested against a session with 1739 messages using session.context() with --tokens 2000 and it returned correctly without issue.

Session.context() already accepts a tokens parameter that caps the response at the database level. That's the intended mechanism for this. The issue pointed to LimitSchema in the TS SDK as the culprit, but that's only used for .search() result pagination and is a completely separate code path from context retrieval. Adding max_messages as a secondary cap overlaps with what token budgets already handle. Going to close this along with #506. Appreciate you digging in!

@ajspig ajspig closed this Apr 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Session bootstrap fails with too_long when session exceeds ~100 messages

2 participants