feat: support ANTHROPIC_BASE_URL for custom Anthropic-compatible endpoints#567
cyijun wants to merge 1 commit into plastic-labs:main
Conversation
Walkthrough: Added a configurable Anthropic base URL and a configurable default embedding model. Anthropic client construction now conditionally uses the base URL, the embedding client reads the embedding model from settings, and DB vector column dimensions are made configurable via settings.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: 2 passed, 1 failed (warning).
cyijun force-pushed the branch from 9d8bc0e to ff4327d.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config.py`:
- Line 218: The EMBEDDING_MODEL default was changed to a non-null string which
breaks OpenRouter because provider-specific fallbacks in src/embedding_client.py
(referenced around the or expressions used when initializing models) are
bypassed; revert EMBEDDING_MODEL to be optional (EMBEDDING_MODEL: str | None =
None) or, alternatively, move the provider-specific default resolution into
_EmbeddingClient.__init__ so that when EMBEDDING_MODEL is None you set model =
"openai/text-embedding-3-small" for provider == "openrouter" (and other
provider-specific names as needed) instead of relying on or fallbacks; update
the logic where model is chosen (the expressions near lines around 55 and 64 in
embedding_client.py) to use the new None check or the resolved per-provider
default.
In `@src/embedding_client.py`:
- Line 55: The current fallback using "or" never runs because
settings.LLM.EMBEDDING_MODEL defaults to a non-empty string; change the
assignment in embedding_client.py so it explicitly handles None/empty and sets
the OpenRouter-prefixed default: read settings.LLM.EMBEDDING_MODEL (treat it as
Optional[str]), and if it is None or "" set self.model =
"openai/text-embedding-3-small", otherwise set self.model =
settings.LLM.EMBEDDING_MODEL (and optionally normalize a bare
"text-embedding-3-small" to "openai/text-embedding-3-small"); ensure
EMBEDDING_MODEL in config is made Optional if needed so the fallback can
trigger.
In `@src/models.py`:
- Line 30: models.py currently imports settings
(AppSettings/VECTOR_STORE_DIMENSIONS) at import time which couples ORM schema to
runtime config; remove the runtime config import and make the column definition
static: delete "from .config import settings" and replace any usage of
settings.VECTOR_STORE_DIMENSIONS in the Column definition (e.g., the dimensions
field on your Vector/embedding model) with a fixed, module-level constant (e.g.,
DEFAULT_VECTOR_DIMENSIONS) or omit the runtime default entirely (just
Column(Integer, nullable=False)). Handle any variations to dimension via
explicit Alembic migrations rather than reading settings at import time so
autogenerate remains deterministic.
- Line 282: The models currently set the pgvector column size from
settings.VECTOR_STORE.DIMENSIONS which mismatches migrations; revert the
embedding column to a static Vector(1536) by replacing the
mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), ...) with
mapped_column(Vector(1536), ...) (the symbol to change is the embedding:
MappedColumn[Any] field declaration), and avoid threading DIMENSIONS through the
model in this PR; leave the existing hardcoded output_dimensionality=1536 uses
in embedding_client.py as-is for now to keep runtime embedding size consistent
with the schema, and add a TODO comment near the embedding field and in
embedding_client.py pointing to a follow-up ticket to implement dynamic
migrations and configurable output_dimensionality.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bf0782bc-0c3f-4cb7-b294-1b83ae4d3af8
📒 Files selected for processing (4)
- src/config.py
- src/embedding_client.py
- src/models.py
- src/utils/clients.py
✅ Files skipped from review due to trivial changes (1)
- src/utils/clients.py
```python
VLLM_BASE_URL: str | None = None

EMBEDDING_PROVIDER: Literal["openai", "gemini", "openrouter"] = "openai"
EMBEDDING_MODEL: str = "text-embedding-3-small"
```
EMBEDDING_MODEL default breaks OpenRouter out‑of‑the‑box.
For `EMBEDDING_PROVIDER="openrouter"`, the prior hardcoded default was `"openai/text-embedding-3-small"` (the provider-prefixed form OpenRouter requires). After this change, the global default becomes `"text-embedding-3-small"`, and the `or` fallback in src/embedding_client.py (lines 55 and 64) is effectively dead code because the default is always truthy. Users on OpenRouter who don't explicitly override `LLM_EMBEDDING_MODEL` will now send an unprefixed model name and get a provider error.
Consider one of:
- Leaving `EMBEDDING_MODEL: str | None = None` and keeping per-provider fallbacks meaningful, or
- Resolving the default per-provider inside `_EmbeddingClient.__init__` rather than via `or`.
Proposed change:

```diff
- EMBEDDING_MODEL: str = "text-embedding-3-small"
+ EMBEDDING_MODEL: str | None = None
```
```diff
  )
  self.client = AsyncOpenAI(api_key=api_key, base_url=base_url)
- self.model = "openai/text-embedding-3-small"
+ self.model = settings.LLM.EMBEDDING_MODEL or "openai/text-embedding-3-small"
```
OpenRouter default regressed because the `or` fallback is unreachable.
Since settings.LLM.EMBEDDING_MODEL now defaults to the non-empty string `"text-embedding-3-small"` (see src/config.py line 218), the `or "openai/text-embedding-3-small"` branch can never trigger under default config. OpenRouter requires the `openai/` prefix for this model, so default-configured OpenRouter users will start getting model-not-found errors.
Suggested fix (pairs with making EMBEDDING_MODEL Optional):

```diff
- self.model = settings.LLM.EMBEDDING_MODEL or "openai/text-embedding-3-small"
+ self.model = (
+     settings.LLM.EMBEDDING_MODEL
+     if settings.LLM.EMBEDDING_MODEL
+     else "openai/text-embedding-3-small"
+ )
```

This only works correctly if EMBEDDING_MODEL is `str | None` (default None) rather than a non-empty string default.
```diff
  from src.utils.types import DocumentLevel, TaskType, VectorSyncState

+ from .config import settings
```
Importing settings into src/models.py creates a coupling to runtime config.
models.py is imported very early (and by Alembic's env.py typically); pulling in AppSettings() here means ORM schema now depends on environment/TOML at import time. This makes autogenerated migrations non‑deterministic across environments and can cause alembic revision --autogenerate to produce diffs driven by whichever VECTOR_STORE_DIMENSIONS was set on the maintainer's machine. Prefer keeping the column definition static and handling dimension via migrations, or isolating the settings read behind an explicit constant.
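To illustrate the non-determinism the comment describes, contrast a static module-level constant with an import-time environment read. The names here (`DEFAULT_VECTOR_DIMENSIONS`, the `VECTOR_STORE_DIMENSIONS` env var) are illustrative, not from the PR:

```python
import os

# Static: identical on every machine, so Alembic autogenerate is deterministic.
DEFAULT_VECTOR_DIMENSIONS = 1536


def dimensions_from_env() -> int:
    # What the review warns against: the value depends on whichever
    # environment happened to be set when models.py was imported.
    return int(os.environ.get("VECTOR_STORE_DIMENSIONS", "1536"))
```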
```diff
  )
  content: Mapped[str] = mapped_column(TEXT)
- embedding: MappedColumn[Any] = mapped_column(Vector(1536), nullable=True)
+ embedding: MappedColumn[Any] = mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), nullable=True)
```
🧩 Analysis chain (repository: plastic-labs/honcho)
🏁 Scripts executed:

```bash
#!/bin/bash
rg -nP 'Vector\(\s*\d+'
rg -nP 'output_dimensionality' -C2
fd -e py . migrations
```

```bash
# Check the actual content of src/models.py around lines 282 and 390
sed -n '275,295p' src/models.py
echo "---"
sed -n '385,395p' src/models.py
```

```bash
# Get more details on the recent migrations, particularly the latest one
head -50 migrations/versions/119a52b73c60_support_external_embeddings.py
tail -20 migrations/versions/119a52b73c60_support_external_embeddings.py
```

```bash
# Check if there's a newer migration after 119a52b73c60
git -C . log --oneline --follow -- migrations/versions/ 2>/dev/null | head -20 || echo "Git not available"
ls -lt migrations/versions/ | head -10
```
Schema/migration mismatch when VECTOR_STORE.DIMENSIONS != 1536.
The ORM column dimensionality is now driven by settings.VECTOR_STORE.DIMENSIONS, but the migrations still hardcode Vector(1536). The latest migration 119a52b73c60_support_external_embeddings.py only makes embeddings nullable; it preserves the hardcoded dimension. Any deployment with VECTOR_STORE_DIMENSIONS set to a non-1536 value will cause SQLAlchemy to insert vectors of a different dimensionality than the actual Postgres column, which pgvector will reject at write time.
Additionally, src/embedding_client.py hardcodes config={"output_dimensionality": 1536} for Gemini on lines 85, 119, and 255, so switching DIMENSIONS will not actually change the embedding size produced—only the declared column width.
At minimum:
- Keep `Vector(1536)` in models and drop the `DIMENSIONS` coupling in this PR, or
- Add an Alembic migration that alters both embedding columns to match `settings.VECTOR_STORE.DIMENSIONS`, and thread the dimension through the embedding client's `output_dimensionality` as well.
Given the PR's stated scope is Anthropic base URL support, introducing dynamic pgvector dimensions is out of scope—consider splitting it out.
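Until schema and runtime are threaded through one value, a cheap guard can fail fast before the INSERT rather than letting pgvector reject the write. A sketch, with the function name invented here:

```python
def check_embedding_dimension(embedding: list[float], column_dim: int = 1536) -> None:
    """Raise before the INSERT if the vector length differs from the
    declared Vector(n) column width, which pgvector enforces at write time."""
    if len(embedding) != column_dim:
        raise ValueError(
            f"embedding has {len(embedding)} dims, column declares {column_dim}"
        )
```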
Closed in favor of #459
This PR adds support for configuring a custom `base_url` for the Anthropic client, enabling the use of Anthropic-compatible endpoints (e.g., Kimi, OpenRouter, etc.).
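The conditional construction could take roughly this shape: build the constructor kwargs and add `base_url` only when the variable is set, so default behavior is unchanged. This is a sketch, not the PR's actual code; the function name is invented and the env var `ANTHROPIC_BASE_URL` is the one named in the PR title:

```python
import os


def anthropic_client_kwargs(api_key: str) -> dict[str, str]:
    """Build AsyncAnthropic(...) kwargs; include base_url only when
    ANTHROPIC_BASE_URL is set, leaving the default endpoint untouched."""
    kwargs = {"api_key": api_key}
    base_url = os.environ.get("ANTHROPIC_BASE_URL")
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs
```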