
feat(inference): add native Anthropic (Claude) provider #2890

Open
johnford2002 wants to merge 6 commits into karakeep-app:main from johnford2002:feat/anthropic-provider

Conversation

@johnford2002


Summary

Adds a first-class Anthropic (Claude) inference provider, selected by ANTHROPIC_API_KEY, using the official @anthropic-ai/sdk Messages API.

Today Claude can only be used by pointing OPENAI_BASE_URL at Anthropic's OpenAI-compatibility endpoint, which ignores strict/json_schema enforcement — so Karakeep's default structured tagging output isn't actually enforced, leading to occasional malformed JSON and failed tagging jobs. This native provider uses Anthropic's Structured Outputs for guaranteed schema conformance.

What it does

  • New AnthropicInferenceClient in packages/shared/inference.ts, selected in InferenceClientFactory.build() when ANTHROPIC_API_KEY is set (precedence: OpenAI → Anthropic → Ollama).
  • Text + image (vision) inference for auto-tagging and summarization.
  • Maps the existing structured / json / plain output modes onto Anthropic's output_config.format json_schema, reusing the same z.toJSONSchema the Ollama client already uses.
  • Defaults to claude-haiku-4-5 when no Claude model is configured (override via INFERENCE_TEXT_MODEL / INFERENCE_IMAGE_MODEL).
  • New env vars: ANTHROPIC_API_KEY, optional ANTHROPIC_BASE_URL. Docs updated under 03-configuration.

The change is purely additive — no existing OpenAI/Ollama logic is modified.
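
The provider precedence stated above (OpenAI → Anthropic → Ollama) can be sketched as a small selection function. This is only an illustration, not the PR's `InferenceClientFactory.build()`: the `InferenceEnvSketch` shape, the `selectProviderSketch` name, and the use of `OPENAI_API_KEY` and `OLLAMA_BASE_URL` as the OpenAI and Ollama triggers are assumptions made for the example.

```typescript
// Hypothetical, simplified view of the relevant settings.
// The real Karakeep config object has more fields and a different shape.
interface InferenceEnvSketch {
  OPENAI_API_KEY?: string; // assumed trigger for the OpenAI provider
  ANTHROPIC_API_KEY?: string; // trigger named in this PR
  OLLAMA_BASE_URL?: string; // assumed trigger for the Ollama provider
}

type ProviderSketch = "openai" | "anthropic" | "ollama";

// Returns the first configured provider in the stated order,
// or null when no provider is configured.
function selectProviderSketch(env: InferenceEnvSketch): ProviderSketch | null {
  if (env.OPENAI_API_KEY) return "openai";
  if (env.ANTHROPIC_API_KEY) return "anthropic";
  if (env.OLLAMA_BASE_URL) return "ollama";
  return null;
}
```

Under this ordering, a deployment that sets both an OpenAI key and `ANTHROPIC_API_KEY` would keep using OpenAI, which is consistent with the claim that existing setups are unaffected.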

Limitations

  • Anthropic has no embeddings API (they recommend third-party providers), so generateEmbeddingFromText throws a clear, documented error. Semantic search still requires a separate embedding provider (OpenAI/Ollama).

Test plan

  • New unit tests in packages/shared/inference.test.ts (8) cover text, image, structured-output mapping, model-default substitution, and the embeddings error.
  • pnpm format, pnpm lint, pnpm typecheck, pnpm build all pass.
  • @karakeep/shared test suite green.

🤖 Generated with Claude Code

@greptile-apps

greptile-apps Bot commented Jun 15, 2026

Contributor

Greptile Summary

This PR adds a first-class Anthropic (Claude) inference provider to Karakeep, selected when ANTHROPIC_API_KEY is set, using the official @anthropic-ai/sdk Messages API with native structured outputs (output_config.format) instead of routing through Anthropic's OpenAI-compatibility endpoint.

  • Adds AnthropicInferenceClient with text and image inference, maps Karakeep's structured/json/plain output modes to Anthropic's json_schema format, and defaults to claude-haiku-4-5 when OpenAI default model names are detected.
  • Registers ANTHROPIC_API_KEY / ANTHROPIC_BASE_URL in config, updates isConfigured, and adds 8 unit tests covering model substitution, output_config shape, token counting, and the embeddings error path.

Confidence Score: 3/5

The additive change is well-scoped, but the Anthropic client silently degrades to plain-text output in json mode when no schema is given — a behavioural difference from the OpenAI path that could cause parse failures for any existing deployment that switches providers with INFERENCE_OUTPUT_SCHEMA=json.

The buildAnthropicOutputConfig function returns undefined (no JSON enforcement) when outputSchema === json and schema === null, whereas OpenAI always sends { type: json_object } in that case. Any caller relying on json-mode JSON output without an explicit schema will silently receive unstructured text from Anthropic and likely break at the parsing step. The rest of the change is correct.

packages/shared/inference.ts — specifically the buildAnthropicOutputConfig helper and the absence of timeout wiring in the AnthropicInferenceClient constructor

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/shared/inference.ts | Adds AnthropicInferenceClient with text/image inference and structured output support; has a silent fallback to plain-text in json mode without a schema, and no configurable timeout |
| packages/shared/config.ts | Adds ANTHROPIC_API_KEY and ANTHROPIC_BASE_URL env vars and wires them into the inference config block; straightforward and correct |
| packages/shared/inference.test.ts | New test file covering model substitution, token counting, output_config shape, and embeddings error; good coverage but fixtures use OpenAI names without explanation |
| docs/docs/03-configuration/02-different-ai-providers.md | Adds Anthropic section documenting the new provider, precedence rules, default model, and embeddings limitation; accurate and clear |
| packages/shared/package.json | Adds @anthropic-ai/sdk ^0.104.1 dependency; correctly scoped to packages/shared |
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
packages/shared/inference.ts:191-205
**Silent no-op in `json` mode without a schema**

`buildAnthropicOutputConfig` returns `undefined` when `outputSchema === "json"` and `schema === null`. This silently drops JSON enforcement: Anthropic gets a plain-text request instead. The OpenAI path always sends `{ type: "json_object" }` in `json` mode regardless of whether a schema is supplied, so any caller that sets `INFERENCE_OUTPUT_SCHEMA=json` without a schema (which is valid OpenAI usage) will receive unstructured text from Anthropic and likely fail to parse it. At minimum a warning should be logged, or the condition should be tightened to make the gap explicit.

### Issue 2 of 4
packages/shared/inference.ts:393-399
**No configurable timeout for the Anthropic client**

`OpenAIInferenceClient` reads `OPENAI_TIMEOUT_SEC` and passes it as `timeout` to the `OpenAI` constructor. The `AnthropicInferenceClient` passes no timeout to `new Anthropic({...})`, so the SDK's built-in default (~10 minutes) is always used, ignoring any user-configured timeout expectation. Consider adding an `ANTHROPIC_TIMEOUT_SEC` env var (or re-using a generic `INFERENCE_TIMEOUT_SEC`) and wiring it through `AnthropicInferenceConfig`.

### Issue 3 of 4
packages/shared/inference.ts:179-187
**Ambiguous log message: text vs. image model substitution indistinguishable**

`resolveAnthropicModel` is called for both `textModel` and `imageModel`, but the log message always says "No Claude model set … Set INFERENCE_TEXT_MODEL/INFERENCE_IMAGE_MODEL to override." — it never identifies which model slot triggered the substitution. When both are defaulted, users see the same message twice with no way to tell them apart. Passing a `slot` identifier (e.g. `"text"` / `"image"`) would make the message actionable.

### Issue 4 of 4
packages/shared/inference.test.ts:17-25
**Test fixture uses OpenAI model names for an Anthropic client under test**

`makeClient()` defaults `textModel: "gpt-4.1-mini"` and `imageModel: "gpt-4o-mini"`. The tests exercise the substitution path (good), but the helper's name and intent could mislead a future contributor into thinking it simulates a misconfigured OpenAI client rather than an Anthropic client with OpenAI-defaulted models. A brief comment explaining why the OpenAI defaults are intentional here would improve clarity.

Reviews (1): Last reviewed commit: "docs: document the native Anthropic infe..."

Comment on lines +191 to +205
function buildAnthropicOutputConfig(
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
  schema: z.ZodSchema<any> | null,
  outputSchema: "structured" | "json" | "plain",
) {
  if (!schema || outputSchema === "plain") {
    return undefined;
  }
  return {
    format: {
      type: "json_schema" as const,
      schema: z.toJSONSchema(schema),
    },
  };
}


P1 Silent no-op in json mode without a schema

buildAnthropicOutputConfig returns undefined when outputSchema === "json" and schema === null. This silently drops JSON enforcement: Anthropic gets a plain-text request instead. The OpenAI path always sends { type: "json_object" } in json mode regardless of whether a schema is supplied, so any caller that sets INFERENCE_OUTPUT_SCHEMA=json without a schema (which is valid OpenAI usage) will receive unstructured text from Anthropic and likely fail to parse it. At minimum a warning should be logged, or the condition should be tightened to make the gap explicit.
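
One possible shape for the "log a warning" option is sketched below. It is not the PR's code: to stay self-contained it takes an already-converted JSON Schema object instead of a Zod schema, and the `buildOutputConfigSketch` name, the injected `warn` callback, and the warning text are all hypothetical.

```typescript
type OutputModeSketch = "structured" | "json" | "plain";
type JsonSchemaSketch = Record<string, unknown>;

interface OutputConfigSketch {
  format: { type: "json_schema"; schema: JsonSchemaSketch };
}

// `warn` is injected so the sketch has no logger dependency;
// in the PR this role would be played by the shared logger.
function buildOutputConfigSketch(
  jsonSchema: JsonSchemaSketch | null,
  mode: OutputModeSketch,
  warn: (msg: string) => void = console.warn,
): OutputConfigSketch | undefined {
  if (mode === "plain") {
    return undefined;
  }
  if (!jsonSchema) {
    if (mode === "json") {
      // Make the gap explicit instead of silently sending a plain-text request.
      warn(
        "[inference] json output mode requested without a schema; " +
          "the Anthropic request will not enforce JSON output.",
      );
    }
    return undefined;
  }
  return { format: { type: "json_schema", schema: jsonSchema } };
}
```

The return value is unchanged from the reviewed helper in every case; the only difference is that the schema-less `json` case now leaves a trace in the logs.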


Comment on lines +393 to +399
      OPENAI_DEFAULT_IMAGE_MODEL,
    );
    this.anthropic = new Anthropic({
      apiKey: config.apiKey,
      baseURL: config.baseURL,
    });
  }


P2 No configurable timeout for the Anthropic client

OpenAIInferenceClient reads OPENAI_TIMEOUT_SEC and passes it as timeout to the OpenAI constructor. The AnthropicInferenceClient passes no timeout to new Anthropic({...}), so the SDK's built-in default (~10 minutes) is always used, ignoring any user-configured timeout expectation. Consider adding an ANTHROPIC_TIMEOUT_SEC env var (or re-using a generic INFERENCE_TIMEOUT_SEC) and wiring it through AnthropicInferenceConfig.
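
A minimal sketch of the wiring, assuming a hypothetical `timeoutSec` field on the client config (for example populated from the suggested `ANTHROPIC_TIMEOUT_SEC`). The helper below only builds the options object; the `timeout` option of the Anthropic SDK constructor is expressed in milliseconds, so the seconds value is converted. The interface and function names are illustrative, not part of the PR.

```typescript
// Hypothetical config shape; `timeoutSec` does not exist in the PR as reviewed.
interface AnthropicSettingsSketch {
  apiKey: string;
  baseURL?: string;
  timeoutSec?: number;
}

// Subset of the SDK constructor options used here.
interface AnthropicClientOptionsSketch {
  apiKey: string;
  baseURL?: string;
  timeout?: number; // milliseconds
}

function toAnthropicClientOptionsSketch(
  settings: AnthropicSettingsSketch,
): AnthropicClientOptionsSketch {
  const options: AnthropicClientOptionsSketch = { apiKey: settings.apiKey };
  if (settings.baseURL) {
    options.baseURL = settings.baseURL;
  }
  if (settings.timeoutSec !== undefined) {
    // Convert seconds to the milliseconds the SDK expects.
    options.timeout = settings.timeoutSec * 1000;
  }
  return options;
}
```

The result would then be passed as `new Anthropic(toAnthropicClientOptionsSketch(config))`. Leaving `timeout` unset when no value is configured keeps the SDK default in effect.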


Comment on lines +179 to +187
function resolveAnthropicModel(model: string, openAIDefault: string): string {
  if (model === openAIDefault) {
    logger.info(
      `[inference] No Claude model set for the Anthropic provider; defaulting to ${ANTHROPIC_DEFAULT_MODEL}. Set INFERENCE_TEXT_MODEL/INFERENCE_IMAGE_MODEL to override.`,
    );
    return ANTHROPIC_DEFAULT_MODEL;
  }
  return model;
}


P2 Ambiguous log message: text vs. image model substitution indistinguishable

resolveAnthropicModel is called for both textModel and imageModel, but the log message always says "No Claude model set … Set INFERENCE_TEXT_MODEL/INFERENCE_IMAGE_MODEL to override." — it never identifies which model slot triggered the substitution. When both are defaulted, users see the same message twice with no way to tell them apart. Passing a slot identifier (e.g. "text" / "image") would make the message actionable.
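
A sketch of the slot-aware variant follows. The `slot` parameter, the `ModelSlotSketch` type, the injected `log` callback, and the reworded message are assumptions for illustration; only the default model name and the two env var names come from the PR.

```typescript
const ANTHROPIC_DEFAULT_MODEL_SKETCH = "claude-haiku-4-5";

type ModelSlotSketch = "text" | "image";

// `log` is injected so the sketch runs without the shared logger.
function resolveAnthropicModelSketch(
  model: string,
  openAIDefault: string,
  slot: ModelSlotSketch,
  log: (msg: string) => void = console.info,
): string {
  if (model === openAIDefault) {
    // Name only the env var that controls the slot being defaulted.
    const envVar =
      slot === "text" ? "INFERENCE_TEXT_MODEL" : "INFERENCE_IMAGE_MODEL";
    log(
      `[inference] No Claude ${slot} model set for the Anthropic provider; ` +
        `defaulting to ${ANTHROPIC_DEFAULT_MODEL_SKETCH}. Set ${envVar} to override.`,
    );
    return ANTHROPIC_DEFAULT_MODEL_SKETCH;
  }
  return model;
}
```

With this shape, two defaulted slots produce two distinct messages, each pointing at the single variable that would change the outcome.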


Comment on lines +17 to +25
  return new AnthropicInferenceClient({
    apiKey: "test-key",
    textModel: "gpt-4.1-mini",
    imageModel: "gpt-4o-mini",
    maxOutputTokens: 100,
    outputSchema: "structured",
    ...overrides,
  });
}


P2 Test fixture uses OpenAI model names for an Anthropic client under test

makeClient() defaults textModel: "gpt-4.1-mini" and imageModel: "gpt-4o-mini". The tests exercise the substitution path (good), but the helper's name and intent could mislead a future contributor into thinking it simulates a misconfigured OpenAI client rather than an Anthropic client with OpenAI-defaulted models. A brief comment explaining why the OpenAI defaults are intentional here would improve clarity.
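
The kind of comment being asked for might look like the sketch below. To keep the example self-contained it builds only the config object rather than an `AnthropicInferenceClient`; the `AnthropicConfigSketch` interface and the `makeClientConfigSketch` name are hypothetical, while the field values mirror the fixture under review.

```typescript
// Hypothetical stand-in for the client's config type.
interface AnthropicConfigSketch {
  apiKey: string;
  textModel: string;
  imageModel: string;
  maxOutputTokens: number;
  outputSchema: "structured" | "json" | "plain";
}

// The OpenAI model names below are intentional: they mimic a deployment that
// sets ANTHROPIC_API_KEY but leaves the model settings at their OpenAI
// defaults, so tests built on this fixture exercise the substitution of the
// default Claude model. Pass `overrides` to test an explicitly chosen model.
function makeClientConfigSketch(
  overrides: Partial<AnthropicConfigSketch> = {},
): AnthropicConfigSketch {
  return {
    apiKey: "test-key",
    textModel: "gpt-4.1-mini",
    imageModel: "gpt-4o-mini",
    maxOutputTokens: 100,
    outputSchema: "structured",
    ...overrides,
  };
}
```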

