fix: harden deriver prompt for strict PromptRepresentation output #562
freemanconsulting wants to merge 9 commits into plastic-labs:main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the relevant setting.
Walkthrough

Updates the deriver prompt to define extracted items as "explicit atomic observations" with structured JSON output, and introduces an OUTPUT CONTRACT constraint requiring objects with only `content` keys.

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/deriver/prompts.py`:
- Around line 53-55: The EXPLICIT examples in src/deriver/prompts.py include
inferential outputs ("{peer_id} lives in NYC" and "completed high school or
equivalent") that violate the “directly derived observations only” rule; update
the EXPLICIT example set (the lines containing the examples with "I took my dog
for a walk in NYC" and "{peer_id} attended college") to remove any inferred
facts and replace them with only literal observations directly supported by the
input (e.g., keep "has a dog" for the dog-walk example but remove or change
"lives in NYC" to something like "was in NYC" and remove the high-school
inference, ensuring the EXPLICIT examples in prompts.py contain no inferred
generalizations).
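Applied literally, the corrected example set might look like this (an illustrative sketch only; the real wording in `src/deriver/prompts.py` may differ):

```python
# Illustrative sketch of a content-shaped EXPLICIT example set with the
# inferred facts removed, per the review. Not the actual file contents.
EXPLICIT_EXAMPLES = [
    # Input: "I took my dog for a walk in NYC"
    {"content": "{peer_id} has a dog"},   # stated literally
    {"content": "{peer_id} was in NYC"},  # replaces the inferred "lives in NYC"
    # Input: "{peer_id} attended college"
    {"content": "{peer_id} attended college"},  # no high-school inference added
]
```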
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 815fecdb-93c9-4ee2-a788-253a68e15442
📒 Files selected for processing (2)

- src/deriver/prompts.py
- tests/deriver/test_prompts.py
6184e3a to 9e8bfd6
Actionable comments posted: 6
🧹 Nitpick comments (2)
src/models.py (1)

30-30: Prefer absolute import for the newly added settings dependency.

Line 30 introduces a relative import for `settings`. Consider `from src.config import settings` to match the project's import preference. As per coding guidelines: "Follow isort conventions with absolute imports preferred".
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/models.py` at line 30, Replace the relative import "from .config import settings" with an absolute import to follow project conventions; update the import statement where models.py imports settings to use the module path starting from the package root (e.g., import settings from src.config) so references to the settings symbol remain the same and isort/linters will accept it.

Dockerfile (1)

42-42: Use `uv run python` for Python command consistency.

Line 42 currently invokes `python` directly. Using `uv run python` keeps command behavior aligned with repo convention and avoids relying on PATH ordering. Based on learnings: Always use `uv run` or `uv` to prefix any commands related to Python to ensure the virtual environment is used.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@Dockerfile` at line 42, Change the Dockerfile RUN that invokes python directly to use the repo's virtual-env wrapper by prefixing with "uv run" (replace the existing "RUN python - <<'PY'" invocation with "RUN uv run python - <<'PY'") so the container uses the project's virtual environment consistently; ensure any subsequent heredoc content remains unchanged and that the "uv" wrapper is installed/available in the build environment before this RUN step.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Dockerfile`:
- Line 34: The Dockerfile uses a broad COPY . /app which pulls everything into
the image; update the .dockerignore to add entries for ".git" and "tests" so git
metadata and test artifacts are excluded from the build context and image.
Locate the existing .dockerignore file and append the two lines ".git" and
"tests" (one per line), then rebuild to verify that COPY . /app no longer
includes those paths.
- Around line 42-53: The download code uses url, cache_key and cache_path and
currently lacks a network timeout and integrity checks; modify the inline Python
to call urllib.request.urlopen(url, timeout=10) (or use urllib.request.Request +
context) and compute a cryptographic checksum (e.g. sha256) of the downloaded
bytes, comparing it against a hard-coded expected value before writing to
cache_path; perform the download into a temporary file (use
tempfile.NamedTemporaryFile or write to cache_path.with_suffix('.tmp')), verify
checksum, then atomically replace with os.replace so corrupted/partial files
never become final, and log and abort (do not write cache) on timeout or
checksum mismatch.
In `@src/deriver/queue_manager.py`:
- Around line 120-123: The except block after calling
self.dream_scheduler.recover_overdue_dreams() should preserve the traceback and
send the error to Sentry: replace the simple logger.warning("Failed overdue
dream recovery at startup: %s", e) with exception-style logging that includes
the stack (e.g., logger.exception or logger.warning(..., exc_info=True)) and
call the same Sentry reporting helper used elsewhere (e.g.,
sentry_sdk.capture_exception or the project's capture/report helper) so
recover_overdue_dreams, the startup recovery block, and the exception are all
logged with full stack context for debugging.
In `@src/dreamer/dream_scheduler.py`:
- Around line 263-298: Wrap the per-row and per-dream-type work in try/except so
one bad recovery row or type doesn't abort the whole recovery: around the logic
that converts last_dream_document_count (int(last_dream_document_count or 0)),
constructs DreamType(dream_type), and calls await self.execute_dream(...), catch
exceptions, log an error with context (workspace_name, observer, observed,
dream_type, and the exception), increment no global counters that would stop
processing, and continue to the next dream_type or next row; ensure invalid
DreamType values are handled gracefully and that failures in execute_dream do
not break the outer loop so remaining rows in rows are still processed.
In `@src/models.py`:
- Around line 282-283: The model's embedding column (embedding:
MappedColumn[Any] = mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS),
nullable=True)) and the embedding client (output_dimensionality: 1536 in
src/embedding_client.py) must stay in sync with
settings.VECTOR_STORE.DIMENSIONS; update src/embedding_client.py to read and use
settings.VECTOR_STORE.DIMENSIONS instead of hardcoded 1536, and add a database
migration that alters the Vector(...) column type in the ORM to the new
dimension when VECTOR_STORE.DIMENSIONS changes so stored vectors and similarity
queries match the configured dimensionality.
In `@tests/dreamer/test_dream_scheduler.py`:
- Around line 505-515: The test is non-hermetic because recover_overdue_dreams()
iterates over settings.DREAM.ENABLED_TYPES; pin the enabled types in this test
so it only runs once. Update the test to patch settings.DREAM.ENABLED_TYPES
(alongside the existing patches) to a single-element list matching the dream
type the test expects, ensuring recover_overdue_dreams() yields one recovery and
execute_dream.assert_awaited_once_with(...) continues to be valid for the single
enabled type.
---
Nitpick comments:
In `@Dockerfile`:
- Line 42: Change the Dockerfile RUN that invokes python directly to use the
repo's virtual-env wrapper by prefixing with "uv run" (replace the existing "RUN
python - <<'PY'" invocation with "RUN uv run python - <<'PY'") so the container
uses the project's virtual environment consistently; ensure any subsequent
heredoc content remains unchanged and that the "uv" wrapper is
installed/available in the build environment before this RUN step.
In `@src/models.py`:
- Line 30: Replace the relative import "from .config import settings" with an
absolute import to follow project conventions; update the import statement where
models.py imports settings to use the module path starting from the package root
(e.g., import settings from src.config) so references to the settings symbol
remain the same and isort/linters will accept it.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3ac2ad82-69bb-4ffa-921d-d5590376c94c
📒 Files selected for processing (9)

- .dockerignore
- Dockerfile
- src/deriver/queue_manager.py
- src/dreamer/dream_scheduler.py
- src/dreamer/specialists.py
- src/models.py
- src/utils/agent_tools.py
- tests/dreamer/test_dream_scheduler.py
- tests/dreamer/test_specialists.py
✅ Files skipped from review due to trivial changes (1)
- .dockerignore
```dockerfile
ENV PATH="/app/.venv/bin:$PATH"

# Sync the project into a virtual environment, using the frozen lockfile
COPY . /app
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
set -euo pipefail
echo "== .dockerignore =="
if [ -f .dockerignore ]; then
  nl -ba .dockerignore
else
  echo "Missing .dockerignore"
fi
echo
echo "== Potentially sensitive or high-churn entries in repo root =="
fd -HI -d 1 '^(\.git|\.env.*|\.venv|node_modules|tests|docs|dist|build)$' .
```

Repository: plastic-labs/honcho
Length of output: 142
🏁 Script executed:
```bash
#!/bin/bash
set -euo pipefail
echo "== .dockerignore =="
if [ -f .dockerignore ]; then
  cat .dockerignore
else
  echo "Missing .dockerignore"
fi
echo ""
echo "== Potentially sensitive or high-churn entries in repo root =="
for pattern in '.git' '.env*' '.venv' 'node_modules' 'tests' 'docs' 'dist' 'build'; do
  if [ -e "$pattern" ] 2>/dev/null; then
    echo "Found: $pattern"
  fi
done
```

Repository: plastic-labs/honcho
Length of output: 345
Add .git and tests to .dockerignore to prevent unnecessary bloat and follow security best practices.

The .dockerignore file exists and covers most sensitive paths (.env, .env.template, __pycache__, .venv), but is missing explicit exclusions for:

- `.git/`: version control metadata can leak information and significantly increases image size
- `tests/`: test artifacts are unnecessary in production images

Add these two patterns to .dockerignore:

```
.git
tests
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Dockerfile` at line 34, The Dockerfile uses a broad COPY . /app which pulls
everything into the image; update the .dockerignore to add entries for ".git"
and "tests" so git metadata and test artifacts are excluded from the build
context and image. Locate the existing .dockerignore file and append the two
lines ".git" and "tests" (one per line), then rebuild to verify that COPY . /app
no longer includes those paths.
```dockerfile
RUN python - <<'PY'
import hashlib
import pathlib
import urllib.request
url = 'https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken'
cache_key = hashlib.sha1(url.encode()).hexdigest()
cache_path = pathlib.Path('/tmp/data-gym-cache') / cache_key
cache_path.parent.mkdir(parents=True, exist_ok=True)
if not cache_path.exists():
    with urllib.request.urlopen(url) as resp:
        cache_path.write_bytes(resp.read())
print(f'cached {url} -> {cache_path}')
PY
```
Harden the preseed download with timeout and checksum verification.
Lines 51-52 download and trust remote bytes without a timeout or integrity check. A transient hang or corrupted payload can poison cache used at startup by tokenizer initialization (src/utils/tokens.py, src/embedding_client.py, src/schemas/api.py).
🔧 Suggested hardening patch

```diff
 RUN python - <<'PY'
 import hashlib
 import pathlib
 import urllib.request
+import tempfile
+
 url = 'https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken'
+expected_sha256 = '<pin-from-trusted-source>'
 cache_key = hashlib.sha1(url.encode()).hexdigest()
 cache_path = pathlib.Path('/tmp/data-gym-cache') / cache_key
 cache_path.parent.mkdir(parents=True, exist_ok=True)
 if not cache_path.exists():
-    with urllib.request.urlopen(url) as resp:
-        cache_path.write_bytes(resp.read())
+    with urllib.request.urlopen(url, timeout=30) as resp:
+        data = resp.read()
+    digest = hashlib.sha256(data).hexdigest()
+    if digest != expected_sha256:
+        raise RuntimeError(f'Checksum mismatch for {url}: {digest}')
+    tmp = pathlib.Path(tempfile.mkstemp(dir=str(cache_path.parent))[1])
+    tmp.write_bytes(data)
+    tmp.replace(cache_path)
 print(f'cached {url} -> {cache_path}')
 PY
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Dockerfile` around lines 42 - 53, The download code uses url, cache_key and
cache_path and currently lacks a network timeout and integrity checks; modify
the inline Python to call urllib.request.urlopen(url, timeout=10) (or use
urllib.request.Request + context) and compute a cryptographic checksum (e.g.
sha256) of the downloaded bytes, comparing it against a hard-coded expected
value before writing to cache_path; perform the download into a temporary file
(use tempfile.NamedTemporaryFile or write to cache_path.with_suffix('.tmp')),
verify checksum, then atomically replace with os.replace so corrupted/partial
files never become final, and log and abort (do not write cache) on timeout or
checksum mismatch.
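Outside a Dockerfile heredoc, the same verify-then-atomically-install pattern can be sketched as a standalone helper (names hypothetical; the network fetch is left to the caller so the logic stays testable offline):

```python
import hashlib
import os
import tempfile
from pathlib import Path


def cache_verified(data: bytes, expected_sha256: str, cache_path: Path) -> None:
    """Verify a payload's sha256 and atomically install it at cache_path.

    Raises RuntimeError on checksum mismatch; never leaves a partial file
    at the final path.
    """
    digest = hashlib.sha256(data).hexdigest()
    if digest != expected_sha256:
        raise RuntimeError(f"Checksum mismatch: {digest}")
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=cache_path.parent)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_name, cache_path)  # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp_name)  # clean up partial temp file
        raise
```

A caller would pass `urllib.request.urlopen(url, timeout=30).read()` as `data`, with `expected_sha256` pinned from a trusted source.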
```python
try:
    recovered_dreams = await self.dream_scheduler.recover_overdue_dreams()
except Exception as e:
    logger.warning("Failed overdue dream recovery at startup: %s", e)
```
Keep the traceback for startup recovery failures.
This path currently logs only str(e), so if overdue-dream recovery partially fails you lose the stack needed to identify the bad row or dream type. Keeping startup non-fatal is fine, but this should use exception logging here, and ideally report to Sentry like the other init/polling error paths.
Suggested change

```diff
-        except Exception as e:
-            logger.warning("Failed overdue dream recovery at startup: %s", e)
+        except Exception as e:
+            logger.exception("Failed overdue dream recovery at startup")
+            if settings.SENTRY.ENABLED:
+                sentry_sdk.capture_exception(e)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
try:
    recovered_dreams = await self.dream_scheduler.recover_overdue_dreams()
except Exception as e:
    logger.exception("Failed overdue dream recovery at startup")
    if settings.SENTRY.ENABLED:
        sentry_sdk.capture_exception(e)
```
🧰 Tools
🪛 Ruff (0.15.10)
[warning] 122-122: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/deriver/queue_manager.py` around lines 120 - 123, The except block after
calling self.dream_scheduler.recover_overdue_dreams() should preserve the
traceback and send the error to Sentry: replace the simple
logger.warning("Failed overdue dream recovery at startup: %s", e) with
exception-style logging that includes the stack (e.g., logger.exception or
logger.warning(..., exc_info=True)) and call the same Sentry reporting helper
used elsewhere (e.g., sentry_sdk.capture_exception or the project's
capture/report helper) so recover_overdue_dreams, the startup recovery block,
and the exception are all logged with full stack context for debugging.
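The difference the prompt is after can be shown in a few lines (hypothetical logger name; the in-memory handler just makes the output inspectable):

```python
import io
import logging

# Minimal demo of why logger.exception matters here: warning("%s", e)
# records only str(e), while exception(...) records the full traceback
# needed to identify the bad row or dream type.
buf = io.StringIO()
log = logging.getLogger("recovery-demo")
log.addHandler(logging.StreamHandler(buf))
log.setLevel(logging.DEBUG)
log.propagate = False

try:
    raise ValueError("bad dream row")
except Exception as e:
    log.warning("Failed overdue dream recovery at startup: %s", e)  # no stack
    log.exception("Failed overdue dream recovery at startup")       # full stack

out = buf.getvalue()
```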
```python
for (
    workspace_name,
    observer,
    observed,
    doc_count,
    last_dream_document_count,
    last_dream_at,
    last_document_at,
) in rows:
    if doc_count < settings.DREAM.DOCUMENT_THRESHOLD:
        continue
    if last_document_at is None or last_document_at > idle_cutoff:
        continue

    last_dream_document_count_int = int(last_dream_document_count or 0)
    if doc_count - last_dream_document_count_int < settings.DREAM.DOCUMENT_THRESHOLD:
        continue

    if last_dream_at:
        try:
            last_dream_dt = datetime.fromisoformat(last_dream_at)
        except (TypeError, ValueError):
            last_dream_dt = None
        if last_dream_dt is not None:
            hours_since_last_dream = (now - last_dream_dt).total_seconds() / 3600
            if hours_since_last_dream < settings.DREAM.MIN_HOURS_BETWEEN_DREAMS:
                continue

    for dream_type in settings.DREAM.ENABLED_TYPES:
        await self.execute_dream(
            workspace_name,
            DreamType(dream_type),
            observer=observer,
            observed=observed,
        )
        recovered += 1
```
Don’t let one bad recovery row abort the rest of startup recovery.
Any failure in int(last_dream_document_count), DreamType(dream_type), or execute_dream(...) exits the whole method. Since QueueManager.initialize() catches that once and keeps booting, the remaining overdue collections are silently skipped. Recovery should handle failures per row/type, log them, and continue.
Suggested change

```diff
 for (
     workspace_name,
     observer,
     observed,
     doc_count,
     last_dream_document_count,
     last_dream_at,
     last_document_at,
 ) in rows:
-    if doc_count < settings.DREAM.DOCUMENT_THRESHOLD:
-        continue
-    if last_document_at is None or last_document_at > idle_cutoff:
-        continue
-
-    last_dream_document_count_int = int(last_dream_document_count or 0)
-    if doc_count - last_dream_document_count_int < settings.DREAM.DOCUMENT_THRESHOLD:
-        continue
-
-    if last_dream_at:
-        try:
-            last_dream_dt = datetime.fromisoformat(last_dream_at)
-        except (TypeError, ValueError):
-            last_dream_dt = None
-        if last_dream_dt is not None:
-            hours_since_last_dream = (now - last_dream_dt).total_seconds() / 3600
-            if hours_since_last_dream < settings.DREAM.MIN_HOURS_BETWEEN_DREAMS:
-                continue
-
-    for dream_type in settings.DREAM.ENABLED_TYPES:
-        await self.execute_dream(
-            workspace_name,
-            DreamType(dream_type),
-            observer=observer,
-            observed=observed,
-        )
-        recovered += 1
+    try:
+        if doc_count < settings.DREAM.DOCUMENT_THRESHOLD:
+            continue
+        if last_document_at is None or last_document_at > idle_cutoff:
+            continue
+
+        last_dream_document_count_int = int(last_dream_document_count or 0)
+        if (
+            doc_count - last_dream_document_count_int
+            < settings.DREAM.DOCUMENT_THRESHOLD
+        ):
+            continue
+
+        if last_dream_at:
+            try:
+                last_dream_dt = datetime.fromisoformat(last_dream_at)
+            except (TypeError, ValueError):
+                last_dream_dt = None
+            if last_dream_dt is not None:
+                hours_since_last_dream = (
+                    now - last_dream_dt
+                ).total_seconds() / 3600
+                if (
+                    hours_since_last_dream
+                    < settings.DREAM.MIN_HOURS_BETWEEN_DREAMS
+                ):
+                    continue
+
+        for dream_type in settings.DREAM.ENABLED_TYPES:
+            try:
+                await self.execute_dream(
+                    workspace_name,
+                    DreamType(dream_type),
+                    observer=observer,
+                    observed=observed,
+                )
+            except Exception:
+                logger.exception(
+                    "Failed recovering overdue dream",
+                    extra={
+                        "workspace_name": workspace_name,
+                        "observer": observer,
+                        "observed": observed,
+                        "dream_type": dream_type,
+                    },
+                )
+            else:
+                recovered += 1
+    except (TypeError, ValueError):
+        logger.exception(
+            "Invalid overdue dream metadata during recovery",
+            extra={
+                "workspace_name": workspace_name,
+                "observer": observer,
+                "observed": observed,
+            },
+        )
+        continue
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/dreamer/dream_scheduler.py` around lines 263 - 298, Wrap the per-row and
per-dream-type work in try/except so one bad recovery row or type doesn't abort
the whole recovery: around the logic that converts last_dream_document_count
(int(last_dream_document_count or 0)), constructs DreamType(dream_type), and
calls await self.execute_dream(...), catch exceptions, log an error with context
(workspace_name, observer, observed, dream_type, and the exception), increment
no global counters that would stop processing, and continue to the next
dream_type or next row; ensure invalid DreamType values are handled gracefully
and that failures in execute_dream do not break the outer loop so remaining rows
in rows are still processed.
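The per-row isolation the prompt describes boils down to a loop shape like this (a generic sketch, not the project's actual method; `process_row` stands in for the real recovery work):

```python
import logging

logger = logging.getLogger("recovery")


def recover_all(rows, process_row):
    """Process every row; one failure is logged and skipped, not fatal."""
    recovered, failures = 0, 0
    for row in rows:
        try:
            process_row(row)
        except Exception:
            failures += 1
            # logger.exception keeps the traceback for the bad row
            logger.exception("Recovery failed for row %r; continuing", row)
        else:
            recovered += 1
    return recovered, failures
```

The key design point is the `try/except` living *inside* the loop body, so the iteration itself can never be aborted by a single bad row.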
```python
embedding: MappedColumn[Any] = mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), nullable=True)
message_id: Mapped[str] = mapped_column(
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
set -euo pipefail
echo "== Model vector type definitions =="
rg -nP 'Vector\(\s*settings\.VECTOR_STORE\.DIMENSIONS|Vector\(\s*1536\s*\)' src/models.py
echo
echo "== Migration vector type definitions =="
rg -nP 'Vector\(\s*1536\s*\)|alter_column\([^)]*embedding' migrations/versions
echo
echo "== Embedding output dimensionality sources =="
rg -nP 'output_dimensionality|VECTOR_STORE\.DIMENSIONS|OPENAI_EMBEDDING_MODEL' src/embedding_client.py src/config.py
```

Repository: plastic-labs/honcho
Length of output: 1477
🏁 Script executed:
#!/bin/bash
echo "== Line 30 of src/models.py (import statement) =="
sed -n '28,32p' src/models.py
echo
echo "== VECTOR_STORE.DIMENSIONS in config =="
rg -nA 5 'VECTOR_STORE' src/config.py | head -20
echo
echo "== Check if absolute import path exists (src/config.py) =="
ls -la src/config.py 2>&1 | head -5Repository: plastic-labs/honcho
Length of output: 1025
🏁 Script executed:
#!/bin/bash
echo "== VECTOR_STORE.DIMENSIONS definition =="
rg -nA 30 'class VectorStoreSettings' src/config.py | grep -i dimensions
echo
echo "== Full VectorStoreSettings class (first 50 lines) =="
rg -nA 50 'class VectorStoreSettings' src/config.py | head -60Repository: plastic-labs/honcho
Length of output: 2092
Coordinate vector dimension changes across ORM, migrations, and embedding generation.
Models at lines 282 and 390 now reference settings.VECTOR_STORE.DIMENSIONS, which defaults to 1536 but can be overridden via environment variable. However, existing migrations hardcode Vector(1536) in column definitions, and the embedding client hardcodes output_dimensionality: 1536. If VECTOR_STORE.DIMENSIONS is changed at runtime, vector insertion and similarity search queries will fail due to dimension mismatch against the fixed vector(1536) schema.
Ensure that any change to VECTOR_STORE.DIMENSIONS is coordinated with:
- A migration to alter column types to the new dimension
- Updates to
src/embedding_client.pyto use the configured dimension
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/models.py` around lines 282 - 283, The model's embedding column
(embedding: MappedColumn[Any] =
mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), nullable=True)) and the
embedding client (output_dimensionality: 1536 in src/embedding_client.py) must
stay in sync with settings.VECTOR_STORE.DIMENSIONS; update
src/embedding_client.py to read and use settings.VECTOR_STORE.DIMENSIONS instead
of hardcoded 1536, and add a database migration that alters the Vector(...)
column type in the ORM to the new dimension when VECTOR_STORE.DIMENSIONS changes
so stored vectors and similarity queries match the configured dimensionality.
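One way to keep the two sides in sync, sketched with stand-in names (the real project reads `settings.VECTOR_STORE.DIMENSIONS` from its config module; `SETTINGS` here is an illustrative substitute):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorStoreSettings:
    # Single source of truth for the embedding dimension.
    DIMENSIONS: int = 1536


SETTINGS = VectorStoreSettings()


def embedding_request_params() -> dict:
    # Embedding client side: use the configured dimension, never a
    # hardcoded 1536, so a config change propagates automatically.
    return {"output_dimensionality": SETTINGS.DIMENSIONS}


def vector_column_type() -> str:
    # ORM/migration side: the pgvector column type for the same dimension.
    return f"vector({SETTINGS.DIMENSIONS})"
```

With both call sites derived from one value, a dimension change still needs a migration for existing rows, but new code paths cannot silently disagree.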
```python
with (
    patch("src.dreamer.dream_scheduler.tracked_db", mock_tracked_db),
    patch.object(dream_scheduler, "execute_dream", new_callable=AsyncMock) as execute_dream,
    patch("src.dreamer.dream_scheduler.settings.DREAM.DOCUMENT_THRESHOLD", 50),
    patch("src.dreamer.dream_scheduler.settings.DREAM.IDLE_TIMEOUT_MINUTES", 60),
    patch("src.dreamer.dream_scheduler.settings.DREAM.MIN_HOURS_BETWEEN_DREAMS", 0),
):
    recovered = await dream_scheduler.recover_overdue_dreams()

assert recovered == 1
execute_dream.assert_awaited_once_with(
```
Make this regression test hermetic by pinning ENABLED_TYPES.
recover_overdue_dreams() executes once per enabled dream type. The recovered == 1 and assert_awaited_once_with(...) assertions only hold while config enables exactly one type, so this test will start failing as soon as the default list grows.
Suggested change

```diff
 with (
     patch("src.dreamer.dream_scheduler.tracked_db", mock_tracked_db),
     patch.object(dream_scheduler, "execute_dream", new_callable=AsyncMock) as execute_dream,
     patch("src.dreamer.dream_scheduler.settings.DREAM.DOCUMENT_THRESHOLD", 50),
     patch("src.dreamer.dream_scheduler.settings.DREAM.IDLE_TIMEOUT_MINUTES", 60),
     patch("src.dreamer.dream_scheduler.settings.DREAM.MIN_HOURS_BETWEEN_DREAMS", 0),
+    patch("src.dreamer.dream_scheduler.settings.DREAM.ENABLED_TYPES", ["omni"]),
 ):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/dreamer/test_dream_scheduler.py` around lines 505 - 515, The test is
non-hermetic because recover_overdue_dreams() iterates over
settings.DREAM.ENABLED_TYPES; pin the enabled types in this test so it only runs
once. Update the test to patch settings.DREAM.ENABLED_TYPES (alongside the
existing patches) to a single-element list matching the dream type the test
expects, ensuring recover_overdue_dreams() yields one recovery and
execute_dream.assert_awaited_once_with(...) continues to be valid for the single
enabled type.
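The pinning technique itself reduces to a self-contained sketch (`SimpleNamespace` stands in for the real settings object, and "omni" is an assumed dream-type name):

```python
from types import SimpleNamespace
from unittest.mock import patch

# Stand-in settings object with two enabled types by default.
settings = SimpleNamespace(DREAM=SimpleNamespace(ENABLED_TYPES=["omni", "consolidate"]))


def count_recoveries() -> int:
    # Stand-in for recover_overdue_dreams(): one execution per enabled type.
    return len(settings.DREAM.ENABLED_TYPES)


with patch.object(settings.DREAM, "ENABLED_TYPES", ["omni"]):
    assert count_recoveries() == 1  # hermetic: independent of the default list
assert count_recoveries() == 2      # original config restored on exit
```

Because `patch.object` restores the attribute when the context exits, the test stays valid even if the project's default `ENABLED_TYPES` list later grows.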
83d4b1a to 169444f
Excludes version control metadata and test artifacts from Docker build context, reducing image size and following security best practices. Addresses CodeRabbit review on plastic-labs#562.
…ry budget enforcement
Addresses CodeRabbit pre-merge check failure (docstring coverage below 80% threshold). Files updated (docstrings only, no logic changes):

- src/schemas/api.py: 29% → 100% (42 Pydantic models + validators)
- src/utils/files.py: 20% → 100% (file processor classes + methods)
- src/telemetry/prometheus/metrics.py: 6% → 100% (metric enums + functions)
- src/crud/deriver.py: 83% → 100% (module docstring)
- src/models.py: 0% → 100% (12 ORM models + __repr__ methods)
- src/config.py: 58% → 100% (20 settings classes + validators)
Summary

- Harden the deriver prompt to match the `PromptRepresentation` schema: require `explicit[].content` and forbid drift to `fact`/`source` keys

Why

Local/open-weight models were drifting toward `fact`-shaped objects because the prompt semantics emphasized facts/conclusions while the schema required `content`. This change makes the textual contract reinforce the schema instead of fighting it.

Validation

- Drift is rejected (`explicit[].fact` -> validation failure -> empty representation)
- `explicit[].content` objects accepted in direct structured calls
- Prompt carries `content` instructions and an exact JSON example

Summary by CodeRabbit
New Features
Tests
Chores
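A minimal sketch of the strict contract described above, assuming the validate-or-empty behavior the PR mentions (the real implementation uses the `PromptRepresentation` Pydantic schema; this stdlib-only version just illustrates the rule):

```python
# Each item in "explicit" must be an object whose only key is "content"
# (a non-empty string). Any fact/source-shaped drift empties the whole
# representation rather than passing through partially wrong output.
def parse_explicit(items: list[dict]) -> list[dict]:
    parsed = []
    for item in items:
        if (
            set(item) != {"content"}
            or not isinstance(item["content"], str)
            or not item["content"]
        ):
            return []  # strict: reject the entire representation on drift
        parsed.append(item)
    return parsed
```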