
fix: harden deriver prompt for strict PromptRepresentation output#562

Closed
freemanconsulting wants to merge 9 commits into plastic-labs:main from Freeman-Consulting:fix/deriver-prompt-schema-alignment

Conversation


@freemanconsulting freemanconsulting commented Apr 14, 2026

Summary

  • align the minimal deriver prompt wording with the PromptRepresentation schema
  • explicitly require explicit[].content and forbid drift to fact / source keys
  • add a regression test that guards the prompt contract wording

Why

Local/open-weight models were drifting toward fact-shaped objects because the prompt semantics emphasized facts/conclusions while the schema required content. This change makes the textual contract reinforce the schema instead of fighting it.
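
To make the failure mode concrete, here is an illustrative stand-in for the validation step. The real schema is the PromptRepresentation Pydantic model in the codebase; this minimal validator is an assumption-laden sketch showing only why fact-shaped objects failed where content-shaped objects succeed.

```python
import json

# Hypothetical stand-in for the real PromptRepresentation validation:
# explicit[] objects must carry exactly a "content" key.
ALLOWED_KEYS = {"content"}


def validate_explicit(items: list[dict]) -> list[dict]:
    """Reject any observation object whose keys drift from the schema."""
    for obj in items:
        extra = set(obj) - ALLOWED_KEYS
        missing = ALLOWED_KEYS - set(obj)
        if extra or missing:
            raise ValueError(f"schema drift: extra={extra}, missing={missing}")
    return items


# A schema-aligned payload validates cleanly.
good = json.loads('{"explicit": [{"content": "has a dog"}]}')
validate_explicit(good["explicit"])

# A fact-shaped payload, the drift seen with local models, is rejected;
# in the old prompt this cascaded into an empty representation.
bad = json.loads('{"explicit": [{"fact": "has a dog"}]}')
try:
    validate_explicit(bad["explicit"])
except ValueError as exc:
    print(exc)
```

The fix in this PR works on the other side of that boundary: the prompt wording now tells the model to emit the key the validator accepts.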

Validation

  • reproduced the old failure shape locally (explicit[].fact -> validation failure -> empty representation)
  • verified the patched prompt returns valid explicit[].content objects in direct structured calls
  • verified a live temp workspace/session through the API now stores explicit observations with the local deriver stack
  • ran a focused prompt-contract check locally to confirm the prompt contains the required content instructions and exact JSON example

Summary by CodeRabbit

  • New Features

    • Updated output structure to emit JSON-formatted explicit observations with standardized content fields.
  • Tests

    • Added unit tests to validate prompt behavior and output formatting.
  • Chores

    • Optimized Docker build context to exclude unnecessary directories.


coderabbitai Bot commented Apr 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Updates the deriver prompt to define extracted items as "explicit atomic observations" with structured JSON output, introduces an OUTPUT CONTRACT constraint requiring objects with only content keys, and adds test coverage validating the prompt's output format and example handling.

Changes

  • Deriver Prompt Specification (src/deriver/prompts.py): restructured the minimal deriver prompt to output observations as JSON objects with a content key, added an OUTPUT CONTRACT defining exact JSON format constraints, and reformatted examples to match the new structure.
  • Deriver Prompt Tests (tests/deriver/test_prompts.py): added a new test module with two unit tests validating that the prompt enforces content key usage, forbids disallowed keys, and keeps example observations literal without substitution.
  • Docker Build Configuration (.dockerignore): excluded the .git and tests directories from the Docker image build context.
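
The prompt-contract tests described above can be sketched roughly as follows. The prompt text and assertion strings here are assumptions for illustration, not the repo's actual wording in tests/deriver/test_prompts.py.

```python
# Hypothetical minimal prompt; the real one lives in src/deriver/prompts.py.
MINIMAL_DERIVER_PROMPT = '''
OUTPUT CONTRACT: respond with JSON of the form
{"explicit": [{"content": "..."}]}
Each object must contain exactly one key, "content".
Do not use "fact" or "source" keys.
Example (literal, do not substitute): {"content": "has a dog"}
'''


def test_prompt_requires_content_key() -> None:
    # The contract wording must name the schema's required key and show
    # the exact JSON shape, so the text reinforces the schema.
    assert '"content"' in MINIMAL_DERIVER_PROMPT
    assert '{"explicit": [{"content"' in MINIMAL_DERIVER_PROMPT


def test_prompt_forbids_drift_keys() -> None:
    # The drift-prone keys should appear only in the prohibition sentence.
    assert 'Do not use "fact" or "source" keys.' in MINIMAL_DERIVER_PROMPT


test_prompt_requires_content_key()
test_prompt_forbids_drift_keys()
```

A wording-level regression test like this is cheap insurance: it fails the moment someone edits the prompt in a way that reopens the schema drift.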

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A prompt so precise, with content so tight,
Each observation atomic, formatted just right,
JSON contracts now guard what we sow,
Tests verify examples don't steal the show!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 36.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the title directly addresses the main change, hardening the deriver prompt to enforce strict PromptRepresentation output with explicit content keys, which is the core objective of this PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.




@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/deriver/prompts.py`:
- Around line 53-55: The EXPLICIT examples in src/deriver/prompts.py include
inferential outputs ("{peer_id} lives in NYC" and "completed high school or
equivalent") that violate the “directly derived observations only” rule; update
the EXPLICIT example set (the lines containing the examples with "I took my dog
for a walk in NYC" and "{peer_id} attended college") to remove any inferred
facts and replace them with only literal observations directly supported by the
input (e.g., keep "has a dog" for the dog-walk example but remove or change
"lives in NYC" to something like "was in NYC" and remove the high-school
inference, ensuring the EXPLICIT examples in prompts.py contain no inferred
generalizations).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 815fecdb-93c9-4ee2-a788-253a68e15442

📥 Commits

Reviewing files that changed from the base of the PR and between 58f9abb and 6184e3a.

📒 Files selected for processing (2)
  • src/deriver/prompts.py
  • tests/deriver/test_prompts.py

Comment thread src/deriver/prompts.py Outdated
@freemanconsulting freemanconsulting force-pushed the fix/deriver-prompt-schema-alignment branch from 6184e3a to 9e8bfd6 Compare April 14, 2026 09:52

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (2)
src/models.py (1)

30-30: Prefer absolute import for the newly added settings dependency.

Line 30 introduces a relative import for settings. Consider from src.config import settings to match project import preference.

As per coding guidelines "Follow isort conventions with absolute imports preferred".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models.py` at line 30, Replace the relative import "from .config import
settings" with an absolute import to follow project conventions; update the
import statement where models.py imports settings to use the module path
starting from the package root (e.g., import settings from src.config) so
references to the settings symbol remain the same and isort/linters will accept
it.
Dockerfile (1)

42-42: Use uv run python for Python command consistency.

Line 42 currently invokes python directly. Using uv run python keeps command behavior aligned with repo convention and avoids relying on PATH ordering.

Based on learnings: Always use uv run or uv to prefix any commands related to Python to ensure virtual environment is used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` at line 42, Change the Dockerfile RUN that invokes python
directly to use the repo's virtual-env wrapper by prefixing with "uv run"
(replace the existing "RUN python - <<'PY'" invocation with "RUN uv run python -
<<'PY'") so the container uses the project's virtual environment consistently;
ensure any subsequent heredoc content remains unchanged and that the "uv"
wrapper is installed/available in the build environment before this RUN step.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Dockerfile`:
- Line 34: The Dockerfile uses a broad COPY . /app which pulls everything into
the image; update the .dockerignore to add entries for ".git" and "tests" so git
metadata and test artifacts are excluded from the build context and image.
Locate the existing .dockerignore file and append the two lines ".git" and
"tests" (one per line), then rebuild to verify that COPY . /app no longer
includes those paths.
- Around line 42-53: The download code uses url, cache_key and cache_path and
currently lacks a network timeout and integrity checks; modify the inline Python
to call urllib.request.urlopen(url, timeout=10) (or use urllib.request.Request +
context) and compute a cryptographic checksum (e.g. sha256) of the downloaded
bytes, comparing it against a hard-coded expected value before writing to
cache_path; perform the download into a temporary file (use
tempfile.NamedTemporaryFile or write to cache_path.with_suffix('.tmp')), verify
checksum, then atomically replace with os.replace so corrupted/partial files
never become final, and log and abort (do not write cache) on timeout or
checksum mismatch.

In `@src/deriver/queue_manager.py`:
- Around line 120-123: The except block after calling
self.dream_scheduler.recover_overdue_dreams() should preserve the traceback and
send the error to Sentry: replace the simple logger.warning("Failed overdue
dream recovery at startup: %s", e) with exception-style logging that includes
the stack (e.g., logger.exception or logger.warning(..., exc_info=True)) and
call the same Sentry reporting helper used elsewhere (e.g.,
sentry_sdk.capture_exception or the project's capture/report helper) so
recover_overdue_dreams, the startup recovery block, and the exception are all
logged with full stack context for debugging.

In `@src/dreamer/dream_scheduler.py`:
- Around line 263-298: Wrap the per-row and per-dream-type work in try/except so
one bad recovery row or type doesn't abort the whole recovery: around the logic
that converts last_dream_document_count (int(last_dream_document_count or 0)),
constructs DreamType(dream_type), and calls await self.execute_dream(...), catch
exceptions, log an error with context (workspace_name, observer, observed,
dream_type, and the exception), increment no global counters that would stop
processing, and continue to the next dream_type or next row; ensure invalid
DreamType values are handled gracefully and that failures in execute_dream do
not break the outer loop so remaining rows in rows are still processed.

In `@src/models.py`:
- Around line 282-283: The model's embedding column (embedding:
MappedColumn[Any] = mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS),
nullable=True)) and the embedding client (output_dimensionality: 1536 in
src/embedding_client.py) must stay in sync with
settings.VECTOR_STORE.DIMENSIONS; update src/embedding_client.py to read and use
settings.VECTOR_STORE.DIMENSIONS instead of hardcoded 1536, and add a database
migration that alters the Vector(...) column type in the ORM to the new
dimension when VECTOR_STORE.DIMENSIONS changes so stored vectors and similarity
queries match the configured dimensionality.

In `@tests/dreamer/test_dream_scheduler.py`:
- Around line 505-515: The test is non-hermetic because recover_overdue_dreams()
iterates over settings.DREAM.ENABLED_TYPES; pin the enabled types in this test
so it only runs once. Update the test to patch settings.DREAM.ENABLED_TYPES
(alongside the existing patches) to a single-element list matching the dream
type the test expects, ensuring recover_overdue_dreams() yields one recovery and
execute_dream.assert_awaited_once_with(...) continues to be valid for the single
enabled type.

---

Nitpick comments:
In `@Dockerfile`:
- Line 42: Change the Dockerfile RUN that invokes python directly to use the
repo's virtual-env wrapper by prefixing with "uv run" (replace the existing "RUN
python - <<'PY'" invocation with "RUN uv run python - <<'PY'") so the container
uses the project's virtual environment consistently; ensure any subsequent
heredoc content remains unchanged and that the "uv" wrapper is
installed/available in the build environment before this RUN step.

In `@src/models.py`:
- Line 30: Replace the relative import "from .config import settings" with an
absolute import to follow project conventions; update the import statement where
models.py imports settings to use the module path starting from the package root
(e.g., import settings from src.config) so references to the settings symbol
remain the same and isort/linters will accept it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3ac2ad82-69bb-4ffa-921d-d5590376c94c

📥 Commits

Reviewing files that changed from the base of the PR and between 169444f and 83d4b1a.

📒 Files selected for processing (9)
  • .dockerignore
  • Dockerfile
  • src/deriver/queue_manager.py
  • src/dreamer/dream_scheduler.py
  • src/dreamer/specialists.py
  • src/models.py
  • src/utils/agent_tools.py
  • tests/dreamer/test_dream_scheduler.py
  • tests/dreamer/test_specialists.py
✅ Files skipped from review due to trivial changes (1)
  • .dockerignore

Comment thread Dockerfile
ENV PATH="/app/.venv/bin:$PATH"

# Sync the project into a virtual environment, using the frozen lockfile
COPY . /app

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== .dockerignore =="
if [ -f .dockerignore ]; then
  nl -ba .dockerignore
else
  echo "Missing .dockerignore"
fi

echo
echo "== Potentially sensitive or high-churn entries in repo root =="
fd -HI -d 1 '^(\\.git|\\.env.*|\\.venv|node_modules|tests|docs|dist|build)$' .

Repository: plastic-labs/honcho

Length of output: 142


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== .dockerignore =="
if [ -f .dockerignore ]; then
  cat .dockerignore
else
  echo "Missing .dockerignore"
fi

echo ""
echo "== Potentially sensitive or high-churn entries in repo root =="
for pattern in '.git' '.env*' '.venv' 'node_modules' 'tests' 'docs' 'dist' 'build'; do
  if [ -e "$pattern" ] 2>/dev/null; then
    echo "Found: $pattern"
  fi
done

Repository: plastic-labs/honcho

Length of output: 345


Add .git and tests to .dockerignore to prevent unnecessary bloat and follow security best practices.

The .dockerignore file exists and covers most sensitive paths (.env, .env.template, __pycache__, .venv), but is missing explicit exclusions for:

  • .git/ — Version control metadata can leak information and significantly increases image size
  • tests/ — Test artifacts are unnecessary in production images

Add these two patterns to .dockerignore:

.git
tests
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` at line 34, The Dockerfile uses a broad COPY . /app which pulls
everything into the image; update the .dockerignore to add entries for ".git"
and "tests" so git metadata and test artifacts are excluded from the build
context and image. Locate the existing .dockerignore file and append the two
lines ".git" and "tests" (one per line), then rebuild to verify that COPY . /app
no longer includes those paths.

Comment thread Dockerfile
Comment on lines +42 to +53
RUN python - <<'PY'
import hashlib
import pathlib
import urllib.request
url = 'https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken'
cache_key = hashlib.sha1(url.encode()).hexdigest()
cache_path = pathlib.Path('/tmp/data-gym-cache') / cache_key
cache_path.parent.mkdir(parents=True, exist_ok=True)
if not cache_path.exists():
    with urllib.request.urlopen(url) as resp:
        cache_path.write_bytes(resp.read())
print(f'cached {url} -> {cache_path}')

⚠️ Potential issue | 🟠 Major

Harden the preseed download with timeout and checksum verification.

Lines 51-52 download and trust remote bytes without a timeout or integrity check. A transient hang or corrupted payload can poison cache used at startup by tokenizer initialization (src/utils/tokens.py, src/embedding_client.py, src/schemas/api.py).

🔧 Suggested hardening patch
 RUN python - <<'PY'
 import hashlib
 import pathlib
 import urllib.request
+import tempfile
+
 url = 'https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken'
+expected_sha256 = '<pin-from-trusted-source>'
 cache_key = hashlib.sha1(url.encode()).hexdigest()
 cache_path = pathlib.Path('/tmp/data-gym-cache') / cache_key
 cache_path.parent.mkdir(parents=True, exist_ok=True)
 if not cache_path.exists():
-    with urllib.request.urlopen(url) as resp:
-        cache_path.write_bytes(resp.read())
+    with urllib.request.urlopen(url, timeout=30) as resp:
+        data = resp.read()
+    digest = hashlib.sha256(data).hexdigest()
+    if digest != expected_sha256:
+        raise RuntimeError(f'Checksum mismatch for {url}: {digest}')
+    tmp = pathlib.Path(tempfile.mkstemp(dir=str(cache_path.parent))[1])
+    tmp.write_bytes(data)
+    tmp.replace(cache_path)
 print(f'cached {url} -> {cache_path}')
 PY
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` around lines 42 - 53, The download code uses url, cache_key and
cache_path and currently lacks a network timeout and integrity checks; modify
the inline Python to call urllib.request.urlopen(url, timeout=10) (or use
urllib.request.Request + context) and compute a cryptographic checksum (e.g.
sha256) of the downloaded bytes, comparing it against a hard-coded expected
value before writing to cache_path; perform the download into a temporary file
(use tempfile.NamedTemporaryFile or write to cache_path.with_suffix('.tmp')),
verify checksum, then atomically replace with os.replace so corrupted/partial
files never become final, and log and abort (do not write cache) on timeout or
checksum mismatch.

Comment on lines +120 to +123
try:
    recovered_dreams = await self.dream_scheduler.recover_overdue_dreams()
except Exception as e:
    logger.warning("Failed overdue dream recovery at startup: %s", e)

⚠️ Potential issue | 🟡 Minor

Keep the traceback for startup recovery failures.

This path currently logs only str(e), so if overdue-dream recovery partially fails you lose the stack needed to identify the bad row or dream type. Keeping startup non-fatal is fine, but this should use exception logging here, and ideally report to Sentry like the other init/polling error paths.

Suggested change
-        except Exception as e:
-            logger.warning("Failed overdue dream recovery at startup: %s", e)
+        except Exception as e:
+            logger.exception("Failed overdue dream recovery at startup")
+            if settings.SENTRY.ENABLED:
+                sentry_sdk.capture_exception(e)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
    recovered_dreams = await self.dream_scheduler.recover_overdue_dreams()
except Exception as e:
    logger.warning("Failed overdue dream recovery at startup: %s", e)

try:
    recovered_dreams = await self.dream_scheduler.recover_overdue_dreams()
except Exception as e:
    logger.exception("Failed overdue dream recovery at startup")
    if settings.SENTRY.ENABLED:
        sentry_sdk.capture_exception(e)
🧰 Tools
🪛 Ruff (0.15.10)

[warning] 122-122: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/deriver/queue_manager.py` around lines 120 - 123, The except block after
calling self.dream_scheduler.recover_overdue_dreams() should preserve the
traceback and send the error to Sentry: replace the simple
logger.warning("Failed overdue dream recovery at startup: %s", e) with
exception-style logging that includes the stack (e.g., logger.exception or
logger.warning(..., exc_info=True)) and call the same Sentry reporting helper
used elsewhere (e.g., sentry_sdk.capture_exception or the project's
capture/report helper) so recover_overdue_dreams, the startup recovery block,
and the exception are all logged with full stack context for debugging.

Comment on lines +263 to +298
for (
    workspace_name,
    observer,
    observed,
    doc_count,
    last_dream_document_count,
    last_dream_at,
    last_document_at,
) in rows:
    if doc_count < settings.DREAM.DOCUMENT_THRESHOLD:
        continue
    if last_document_at is None or last_document_at > idle_cutoff:
        continue

    last_dream_document_count_int = int(last_dream_document_count or 0)
    if doc_count - last_dream_document_count_int < settings.DREAM.DOCUMENT_THRESHOLD:
        continue

    if last_dream_at:
        try:
            last_dream_dt = datetime.fromisoformat(last_dream_at)
        except (TypeError, ValueError):
            last_dream_dt = None
        if last_dream_dt is not None:
            hours_since_last_dream = (now - last_dream_dt).total_seconds() / 3600
            if hours_since_last_dream < settings.DREAM.MIN_HOURS_BETWEEN_DREAMS:
                continue

    for dream_type in settings.DREAM.ENABLED_TYPES:
        await self.execute_dream(
            workspace_name,
            DreamType(dream_type),
            observer=observer,
            observed=observed,
        )
        recovered += 1

⚠️ Potential issue | 🟠 Major

Don’t let one bad recovery row abort the rest of startup recovery.

Any failure in int(last_dream_document_count), DreamType(dream_type), or execute_dream(...) exits the whole method. Since QueueManager.initialize() catches that once and keeps booting, the remaining overdue collections are silently skipped. Recovery should handle failures per row/type, log them, and continue.

Suggested change
         for (
             workspace_name,
             observer,
             observed,
             doc_count,
             last_dream_document_count,
             last_dream_at,
             last_document_at,
         ) in rows:
-            if doc_count < settings.DREAM.DOCUMENT_THRESHOLD:
-                continue
-            if last_document_at is None or last_document_at > idle_cutoff:
-                continue
-
-            last_dream_document_count_int = int(last_dream_document_count or 0)
-            if doc_count - last_dream_document_count_int < settings.DREAM.DOCUMENT_THRESHOLD:
-                continue
-
-            if last_dream_at:
-                try:
-                    last_dream_dt = datetime.fromisoformat(last_dream_at)
-                except (TypeError, ValueError):
-                    last_dream_dt = None
-                if last_dream_dt is not None:
-                    hours_since_last_dream = (now - last_dream_dt).total_seconds() / 3600
-                    if hours_since_last_dream < settings.DREAM.MIN_HOURS_BETWEEN_DREAMS:
-                        continue
-
-            for dream_type in settings.DREAM.ENABLED_TYPES:
-                await self.execute_dream(
-                    workspace_name,
-                    DreamType(dream_type),
-                    observer=observer,
-                    observed=observed,
-                )
-                recovered += 1
+            try:
+                if doc_count < settings.DREAM.DOCUMENT_THRESHOLD:
+                    continue
+                if last_document_at is None or last_document_at > idle_cutoff:
+                    continue
+
+                last_dream_document_count_int = int(last_dream_document_count or 0)
+                if (
+                    doc_count - last_dream_document_count_int
+                    < settings.DREAM.DOCUMENT_THRESHOLD
+                ):
+                    continue
+
+                if last_dream_at:
+                    try:
+                        last_dream_dt = datetime.fromisoformat(last_dream_at)
+                    except (TypeError, ValueError):
+                        last_dream_dt = None
+                    if last_dream_dt is not None:
+                        hours_since_last_dream = (
+                            now - last_dream_dt
+                        ).total_seconds() / 3600
+                        if (
+                            hours_since_last_dream
+                            < settings.DREAM.MIN_HOURS_BETWEEN_DREAMS
+                        ):
+                            continue
+
+                for dream_type in settings.DREAM.ENABLED_TYPES:
+                    try:
+                        await self.execute_dream(
+                            workspace_name,
+                            DreamType(dream_type),
+                            observer=observer,
+                            observed=observed,
+                        )
+                    except Exception:
+                        logger.exception(
+                            "Failed recovering overdue dream",
+                            extra={
+                                "workspace_name": workspace_name,
+                                "observer": observer,
+                                "observed": observed,
+                                "dream_type": dream_type,
+                            },
+                        )
+                    else:
+                        recovered += 1
+            except (TypeError, ValueError):
+                logger.exception(
+                    "Invalid overdue dream metadata during recovery",
+                    extra={
+                        "workspace_name": workspace_name,
+                        "observer": observer,
+                        "observed": observed,
+                    },
+                )
+                continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/dreamer/dream_scheduler.py` around lines 263 - 298, Wrap the per-row and
per-dream-type work in try/except so one bad recovery row or type doesn't abort
the whole recovery: around the logic that converts last_dream_document_count
(int(last_dream_document_count or 0)), constructs DreamType(dream_type), and
calls await self.execute_dream(...), catch exceptions, log an error with context
(workspace_name, observer, observed, dream_type, and the exception), increment
no global counters that would stop processing, and continue to the next
dream_type or next row; ensure invalid DreamType values are handled gracefully
and that failures in execute_dream do not break the outer loop so remaining rows
in rows are still processed.

Comment thread src/models.py
Comment on lines +282 to 283
embedding: MappedColumn[Any] = mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), nullable=True)
message_id: Mapped[str] = mapped_column(

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Model vector type definitions =="
rg -nP 'Vector\(\s*settings\.VECTOR_STORE\.DIMENSIONS|Vector\(\s*1536\s*\)' src/models.py

echo
echo "== Migration vector type definitions =="
rg -nP 'Vector\(\s*1536\s*\)|alter_column\([^)]*embedding' migrations/versions

echo
echo "== Embedding output dimensionality sources =="
rg -nP 'output_dimensionality|VECTOR_STORE\.DIMENSIONS|OPENAI_EMBEDDING_MODEL' src/embedding_client.py src/config.py

Repository: plastic-labs/honcho

Length of output: 1477


🏁 Script executed:

#!/bin/bash

echo "== Line 30 of src/models.py (import statement) =="
sed -n '28,32p' src/models.py

echo
echo "== VECTOR_STORE.DIMENSIONS in config =="
rg -nA 5 'VECTOR_STORE' src/config.py | head -20

echo
echo "== Check if absolute import path exists (src/config.py) =="
ls -la src/config.py 2>&1 | head -5

Repository: plastic-labs/honcho

Length of output: 1025


🏁 Script executed:

#!/bin/bash

echo "== VECTOR_STORE.DIMENSIONS definition =="
rg -nA 30 'class VectorStoreSettings' src/config.py | grep -i dimensions

echo
echo "== Full VectorStoreSettings class (first 50 lines) =="
rg -nA 50 'class VectorStoreSettings' src/config.py | head -60

Repository: plastic-labs/honcho

Length of output: 2092


Coordinate vector dimension changes across ORM, migrations, and embedding generation.

Models at lines 282 and 390 now reference settings.VECTOR_STORE.DIMENSIONS, which defaults to 1536 but can be overridden via environment variable. However, existing migrations hardcode Vector(1536) in column definitions, and the embedding client hardcodes output_dimensionality: 1536. If VECTOR_STORE.DIMENSIONS is changed at runtime, vector insertion and similarity search queries will fail due to dimension mismatch against the fixed vector(1536) schema.

Ensure that any change to VECTOR_STORE.DIMENSIONS is coordinated with:

  • A migration to alter column types to the new dimension
  • Updates to src/embedding_client.py to use the configured dimension
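
A minimal sketch of the single-source-of-truth fix the review suggests. The settings classes and attribute names below are assumptions modeled on the comment, not the repo's actual config module.

```python
from dataclasses import dataclass, field


# Hypothetical config objects mirroring settings.VECTOR_STORE.DIMENSIONS;
# in the real config this default can be overridden via environment variable.
@dataclass
class VectorStoreSettings:
    DIMENSIONS: int = 1536


@dataclass
class Settings:
    VECTOR_STORE: VectorStoreSettings = field(default_factory=VectorStoreSettings)


settings = Settings()


def embedding_request_params() -> dict[str, int]:
    # The embedding client reads the configured dimension instead of
    # hardcoding 1536, so it cannot fall out of sync with an ORM column
    # declared as Vector(settings.VECTOR_STORE.DIMENSIONS).
    return {"output_dimensionality": settings.VECTOR_STORE.DIMENSIONS}


assert embedding_request_params()["output_dimensionality"] == 1536
```

The migration side still needs manual coordination: pgvector column types are fixed at DDL time, so changing DIMENSIONS requires an ALTER COLUMN migration and, typically, re-embedding of stored vectors.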
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models.py` around lines 282 - 283, The model's embedding column
(embedding: MappedColumn[Any] =
mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), nullable=True)) and the
embedding client (output_dimensionality: 1536 in src/embedding_client.py) must
stay in sync with settings.VECTOR_STORE.DIMENSIONS; update
src/embedding_client.py to read and use settings.VECTOR_STORE.DIMENSIONS instead
of hardcoded 1536, and add a database migration that alters the Vector(...)
column type in the ORM to the new dimension when VECTOR_STORE.DIMENSIONS changes
so stored vectors and similarity queries match the configured dimensionality.

Comment on lines +505 to +515
with (
    patch("src.dreamer.dream_scheduler.tracked_db", mock_tracked_db),
    patch.object(dream_scheduler, "execute_dream", new_callable=AsyncMock) as execute_dream,
    patch("src.dreamer.dream_scheduler.settings.DREAM.DOCUMENT_THRESHOLD", 50),
    patch("src.dreamer.dream_scheduler.settings.DREAM.IDLE_TIMEOUT_MINUTES", 60),
    patch("src.dreamer.dream_scheduler.settings.DREAM.MIN_HOURS_BETWEEN_DREAMS", 0),
):
    recovered = await dream_scheduler.recover_overdue_dreams()

assert recovered == 1
execute_dream.assert_awaited_once_with(

⚠️ Potential issue | 🟡 Minor

Make this regression test hermetic by pinning ENABLED_TYPES.

recover_overdue_dreams() executes once per enabled dream type. The recovered == 1 and assert_awaited_once_with(...) assertions only hold while config enables exactly one type, so this test will start failing as soon as the default list grows.

Suggested change
         with (
             patch("src.dreamer.dream_scheduler.tracked_db", mock_tracked_db),
             patch.object(dream_scheduler, "execute_dream", new_callable=AsyncMock) as execute_dream,
             patch("src.dreamer.dream_scheduler.settings.DREAM.DOCUMENT_THRESHOLD", 50),
             patch("src.dreamer.dream_scheduler.settings.DREAM.IDLE_TIMEOUT_MINUTES", 60),
             patch("src.dreamer.dream_scheduler.settings.DREAM.MIN_HOURS_BETWEEN_DREAMS", 0),
+            patch("src.dreamer.dream_scheduler.settings.DREAM.ENABLED_TYPES", ["omni"]),
         ):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/dreamer/test_dream_scheduler.py` around lines 505 - 515, The test is
non-hermetic because recover_overdue_dreams() iterates over
settings.DREAM.ENABLED_TYPES; pin the enabled types in this test so it only runs
once. Update the test to patch settings.DREAM.ENABLED_TYPES (alongside the
existing patches) to a single-element list matching the dream type the test
expects, ensuring recover_overdue_dreams() yields one recovery and
execute_dream.assert_awaited_once_with(...) continues to be valid for the single
enabled type.

@freemanconsulting freemanconsulting force-pushed the fix/deriver-prompt-schema-alignment branch from 83d4b1a to 169444f Compare April 16, 2026 11:19
Excludes version control metadata and test artifacts from Docker build
context, reducing image size and following security best practices.

Addresses CodeRabbit review on plastic-labs#562.

Addresses CodeRabbit pre-merge check failure (docstring coverage below 80% threshold).

Files updated (docstrings only, no logic changes):
- src/schemas/api.py: 29% → 100% (42 Pydantic models + validators)
- src/utils/files.py: 20% → 100% (file processor classes + methods)
- src/telemetry/prometheus/metrics.py: 6% → 100% (metric enums + functions)
- src/crud/deriver.py: 83% → 100% (module docstring)
- src/models.py: 0% → 100% (12 ORM models + __repr__ methods)
- src/config.py: 58% → 100% (20 settings classes + validators)
