
feat(api): forward loader/chunker/extractor/resolver from apply_changes to ingest/update #250

Merged
galshubeli merged 1 commit into main from feat/apply-changes-strategy-overrides on May 13, 2026


Conversation

@galshubeli
Collaborator

@galshubeli galshubeli commented May 13, 2026

Summary

apply_changes is the documented single entrypoint for CI-driven incremental ingestion (graph.apply_changes(**parse_git_diff(...)) in the v1.1.0 release notes), but the public signature accepts only added / modified / deleted plus concurrency knobs — there is no way to pass per-call strategy overrides. Internally it dispatches to ingest() and update() with SDK defaults, so any caller who wanted a custom chunker/extractor/resolver had to bypass apply_changes entirely and loop over the primitives themselves.

This PR adds loader=, chunker=, extractor=, resolver= kwargs to apply_changes (and the sync wrapper) and forwards them to the inner ingest() and update() dispatches. delete_document does not take strategies and is unaffected. Defaults are all None — fully backwards compatible.
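As a rough illustration of the forwarding described above, the sketch below uses a hypothetical stand-in class (`SketchGraph`, not the real SDK facade). Its `apply_changes` passes the four strategy kwargs to the inner `ingest()` and `update()` calls and leaves deletion untouched. The method bodies, the order in which the three branches run, and the string placeholders standing in for strategy objects are assumptions for illustration only; the real signatures in `graphrag_sdk` carry more parameters.

```python
import asyncio


class SketchGraph:
    """Hypothetical stand-in for the SDK facade; not the real implementation."""

    def __init__(self):
        self.calls = []  # records every inner dispatch for inspection

    async def ingest(self, sources, *, loader=None, chunker=None,
                     extractor=None, resolver=None):
        self.calls.append(("ingest", tuple(sources), loader, chunker, extractor, resolver))

    async def update(self, source, *, loader=None, chunker=None,
                     extractor=None, resolver=None, if_missing="ingest"):
        self.calls.append(("update", source, loader, chunker, extractor, resolver))

    async def delete_document(self, source):
        # Deletion takes no strategies, so nothing is forwarded here.
        self.calls.append(("delete", source))

    async def apply_changes(self, *, added=(), modified=(), deleted=(),
                            loader=None, chunker=None, extractor=None, resolver=None):
        strategies = dict(loader=loader, chunker=chunker,
                          extractor=extractor, resolver=resolver)
        # Branch order below is an assumption made for this sketch.
        for path in deleted:
            await self.delete_document(path)
        for path in modified:
            await self.update(path, if_missing="ingest", **strategies)
        if added:
            await self.ingest(list(added), **strategies)


graph = SketchGraph()
asyncio.run(graph.apply_changes(
    added=["a.md"], modified=["m.md"], deleted=["d.md"],
    chunker="custom-chunker", resolver="custom-resolver",
))
```

After the call, `graph.calls` holds one entry per dispatch, and the `update` and `ingest` entries carry the caller's chunker and resolver, while the unset loader and extractor remain `None`.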

Motivation

The downstream GraphRAG-UI docs-CI orchestrator builds a per-graph IngestionConfig (GLiNER threshold 0.75, SentenceTokenCapChunking(256, 2), LLMVerifiedResolution) and then calls apply_changes. Without this forwarding, our strategies are silently dropped at the SDK boundary — the orchestrator either has to call the primitives directly (losing apply_changes's batch-level concurrency control and uniform BatchEntry error surface) or accept SDK-default ingestion (silent retrieval quality drift).

Changes

  • apply_changes(*, ..., loader=None, chunker=None, extractor=None, resolver=None, ...) — four new kwargs
  • Inner self.update(path, loader=, chunker=, extractor=, resolver=, ...) in _update_one
  • Inner self.ingest(added, loader=, chunker=, extractor=, resolver=, ...) in the added branch
  • apply_changes_sync mirrors the new signature and forwards
  • Docstring updated under Args: to document the new kwargs
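The sync wrapper item can be pictured with a minimal sketch like the one below. Both functions are hypothetical module-level stand-ins for the real methods, and driving the coroutine with `asyncio.run` is an assumption; the SDK may manage its event loop differently. The point is only that the wrapper repeats the keyword-only signature and forwards each argument unchanged.

```python
import asyncio


async def apply_changes(*, added=(), loader=None, chunker=None,
                        extractor=None, resolver=None):
    """Hypothetical async entrypoint; returns what it received for inspection."""
    return {"added": list(added), "loader": loader, "chunker": chunker,
            "extractor": extractor, "resolver": resolver}


def apply_changes_sync(*, added=(), loader=None, chunker=None,
                       extractor=None, resolver=None):
    """Same keyword-only signature as the async version; forwards everything."""
    # asyncio.run is an assumption here, chosen to keep the sketch self-contained.
    return asyncio.run(apply_changes(
        added=added, loader=loader, chunker=chunker,
        extractor=extractor, resolver=resolver,
    ))


result = apply_changes_sync(added=["a.md"], extractor="custom-extractor")
```

Spelling out the four kwargs in the wrapper (rather than accepting `**kwargs`) keeps the sync signature discoverable and makes a typo in a strategy name fail at the call site.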

Tests

  • test_strategy_overrides_forward_to_ingest_and_update: caller-passed strategies reach both inner dispatch points
  • test_strategy_overrides_default_to_none: omitted strategies stay None (SDK defaults preserved)
  • All 10 existing TestApplyChanges tests still pass (12 with the 2 new ones); 114 tests pass across test_facade.py
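The capture pattern behind the two new tests can be sketched roughly as follows. `SketchFacade` and `run_with_fakes` are hypothetical names invented for this example, and the real tests in `test_facade.py` patch the actual facade and return real result objects; only the `captured[...] = dict(kwargs, source=source)` idiom is taken from the PR.

```python
import asyncio


class SketchFacade:
    """Hypothetical object under test; it only forwards to ingest/update."""

    async def ingest(self, source, **kwargs):
        raise AssertionError("replaced by a fake in the test")

    async def update(self, source, **kwargs):
        raise AssertionError("replaced by a fake in the test")

    async def apply_changes(self, *, added=(), modified=(), loader=None,
                            chunker=None, extractor=None, resolver=None):
        overrides = dict(loader=loader, chunker=chunker,
                         extractor=extractor, resolver=resolver)
        for path in modified:
            await self.update(path, **overrides)
        if added:
            await self.ingest(list(added), **overrides)


def run_with_fakes(**call_kwargs):
    """Swap in fakes that record their kwargs, run apply_changes, return the record."""
    captured = {}

    async def fake_ingest(source, **kwargs):
        captured["ingest"] = dict(kwargs, source=source)

    async def fake_update(source, **kwargs):
        captured["update"] = dict(kwargs, source=source)

    facade = SketchFacade()
    facade.ingest = fake_ingest  # instance attributes shadow the class methods
    facade.update = fake_update
    asyncio.run(facade.apply_changes(added=["a.md"], modified=["m.md"], **call_kwargs))
    return captured


sentinel = object()  # identity check proves the same object was forwarded
forwarded = run_with_fakes(chunker=sentinel)
defaults = run_with_fakes()
```

Using a unique sentinel object and comparing with `is` checks that the strategy arrives at both dispatch points untouched, and the second run checks that omitted strategies arrive as `None`.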

Test plan

  • pytest tests/test_facade.py — 114 passed locally
  • pytest tests/test_facade.py::TestApplyChanges — 12 passed (10 existing + 2 new)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Extended apply_changes() and apply_changes_sync() methods to accept optional customization parameters for greater control over data processing operations.
  • Tests

    • Added comprehensive test coverage verifying proper parameter handling in change application workflows and default behavior when parameters are omitted.


…es to ingest/update

``apply_changes`` is the documented single entrypoint for CI-driven
incremental ingestion, but in v1.1.0 it silently ignored per-call
strategy overrides — the inner ``ingest()`` and ``update()`` dispatches
ran with SDK defaults regardless. Callers who wanted custom chunker
boundaries, GLiNER thresholds, LLM-verified resolution, or a
non-auto-selected loader had no way to get them through ``apply_changes``;
their only escape was to bypass the convenience wrapper and loop over
``ingest``/``update``/``delete_document`` themselves.

Add ``loader=``, ``chunker=``, ``extractor=``, ``resolver=`` to
``apply_changes`` (and ``apply_changes_sync``) and forward them to the
inner ``ingest()`` and ``update()`` calls. ``delete_document`` doesn't
take strategies and is unaffected. Defaults remain ``None`` — backwards
compatible.

Tests:
- ``test_strategy_overrides_forward_to_ingest_and_update``: caller-passed
  strategies reach both inner dispatch points.
- ``test_strategy_overrides_default_to_none``: omitted strategies stay
  ``None`` (i.e. SDK defaults are preserved).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented May 13, 2026

📝 Walkthrough


This PR extends the apply_changes flow to accept optional strategy override parameters (loader, chunker, extractor, resolver) and forward them through to underlying ingest() and update() calls. Both async and sync variants are updated with matching signatures, documentation, and test coverage.

Changes

Strategy override parameters in apply_changes flow

| Layer / File(s) | Summary |
| --- | --- |
| Async apply_changes strategy overrides (graphrag_sdk/src/graphrag_sdk/api/main.py) | The apply_changes method signature adds loader, chunker, extractor, and resolver optional parameters. Docstring is updated to describe their behavior. The parameters are forwarded to update(if_missing="ingest") for modified paths and to ingest() for added paths; deleted paths remain unaffected. |
| Sync apply_changes_sync wrapper (graphrag_sdk/src/graphrag_sdk/api/main.py) | The apply_changes_sync synchronous wrapper method receives the same four override parameters in its signature and forwards them to the async apply_changes() call. |
| Test coverage for strategy parameter forwarding (graphrag_sdk/tests/test_facade.py) | Two test cases verify correct behavior: one confirms that explicitly provided strategy parameters are forwarded to both ingest() and update() calls, and another confirms that omitted parameters are forwarded as None to maintain default SDK behavior. |

Sequence Diagram

sequenceDiagram
  participant User
  participant ApplyChanges as apply_changes()
  participant Update
  participant Ingest
  
  User->>ApplyChanges: apply_changes(modified, added, loader, chunker, ...)
  ApplyChanges->>Update: update(if_missing="ingest", loader, chunker, extractor, resolver)
  Note over Update: Process modified paths
  ApplyChanges->>Ingest: ingest(loader, chunker, extractor, resolver)
  Note over Ingest: Process added paths
  Ingest-->>ApplyChanges: completed
  Update-->>ApplyChanges: completed
  ApplyChanges-->>User: changes applied

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 With strategy tweaks in the apply,
New parameters flow, oh my oh my!
To ingest and update they glide,
Tests verify nothing's denied. 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: forwarding loader/chunker/extractor/resolver parameters from apply_changes to ingest/update methods. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@graphrag_sdk/tests/test_facade.py`:
- Around line 1762-1768: The test fakes construct IngestionResult and
UpdateResult with non-canonical kwarg names (e.g., document_id, chunks,
entities, relations, action); update the fake return objects in the functions
that build these results (the IngestionResult call around the earlier fake
ingest and the fake_update function returning UpdateResult, and the similar
fakes at the other occurrences around lines 1795–1801) to use the true/canonical
dataclass field names used by the real API (inspect the IngestionResult and
UpdateResult definitions to find the correct attribute names such as the
document reference, chunk/entity/relation counts, and status/action field) and
set equivalent values so the tests reflect the real object shape.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ca205cde-8386-4c41-bbc3-40c3758faeba

📥 Commits

Reviewing files that changed from the base of the PR and between bfa6271 and ecb185c.

📒 Files selected for processing (2)
  • graphrag_sdk/src/graphrag_sdk/api/main.py
  • graphrag_sdk/tests/test_facade.py

Comment on lines +1762 to +1768
            return [IngestionResult(document_id="a.md", chunks=0, entities=0, relations=0)]

        async def fake_update(source, **kwargs):
            captured["update"] = dict(kwargs, source=source)
            return UpdateResult(
                document_id="m.md", action="updated", chunks=0, entities=0, relations=0,
            )

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use canonical IngestionResult/UpdateResult fields in test fakes (Line 1762, Line 1767, Line 1795, Line 1800).

These fake return objects are built with non-canonical kwargs (document_id, chunks, entities, relations, action), which can break the tests or make them misleading relative to the real API shape.

Proposed fix
-        from graphrag_sdk.core.models import IngestionResult, UpdateResult
+        from graphrag_sdk.core.models import DocumentInfo, IngestionResult, UpdateResult
@@
         async def fake_ingest(source, **kwargs):
             captured["ingest"] = dict(kwargs, source=source)
-            return [IngestionResult(document_id="a.md", chunks=0, entities=0, relations=0)]
+            return [
+                IngestionResult(
+                    document_info=DocumentInfo(uid="a.md", path="a.md"),
+                    nodes_created=0,
+                    relationships_created=0,
+                    chunks_indexed=0,
+                    metadata={},
+                )
+            ]
@@
         async def fake_update(source, **kwargs):
             captured["update"] = dict(kwargs, source=source)
             return UpdateResult(
-                document_id="m.md", action="updated", chunks=0, entities=0, relations=0,
+                document_info=DocumentInfo(uid="m.md", path="m.md"),
+                nodes_created=0,
+                relationships_created=0,
+                chunks_indexed=0,
+                metadata={},
+                replaced_existing=True,
+                no_op=False,
             )
@@
-        from graphrag_sdk.core.models import IngestionResult, UpdateResult
+        from graphrag_sdk.core.models import DocumentInfo, IngestionResult, UpdateResult
@@
         async def fake_ingest(source, **kwargs):
             captured["ingest"] = dict(kwargs)
-            return [IngestionResult(document_id="a.md", chunks=0, entities=0, relations=0)]
+            return [
+                IngestionResult(
+                    document_info=DocumentInfo(uid="a.md", path="a.md"),
+                    nodes_created=0,
+                    relationships_created=0,
+                    chunks_indexed=0,
+                    metadata={},
+                )
+            ]
@@
         async def fake_update(source, **kwargs):
             captured["update"] = dict(kwargs)
             return UpdateResult(
-                document_id="m.md", action="updated", chunks=0, entities=0, relations=0,
+                document_info=DocumentInfo(uid="m.md", path="m.md"),
+                nodes_created=0,
+                relationships_created=0,
+                chunks_indexed=0,
+                metadata={},
+                replaced_existing=True,
+                no_op=False,
             )

Also applies to: 1795-1801
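To make the concern concrete: if the result types are strict dataclasses, as the review comment assumes, an unknown keyword fails at construction time. The sketch below uses a hypothetical `SketchResult` whose field names are borrowed from the proposed fix; it is not the SDK's `IngestionResult`, and it does not establish which field names the real classes accept.

```python
from dataclasses import dataclass, field


@dataclass
class SketchResult:
    """Hypothetical result type; field names borrowed from the proposed fix."""
    nodes_created: int = 0
    relationships_created: int = 0
    chunks_indexed: int = 0
    metadata: dict = field(default_factory=dict)


def try_build(**kwargs):
    """Return the instance, or the TypeError raised by an unknown field name."""
    try:
        return SketchResult(**kwargs)
    except TypeError as exc:
        return exc


ok = try_build(nodes_created=0, chunks_indexed=0)    # known field names
bad = try_build(chunks=0, entities=0, relations=0)   # names the class does not define
```

Here `ok` is a `SketchResult` and `bad` is the `TypeError` raised by the generated `__init__`. A model type that tolerates extra keywords would behave differently, which is why the finding needs checking against the real definitions.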


@galshubeli galshubeli merged commit 9ba70a5 into main May 13, 2026
10 checks passed
@galshubeli galshubeli deleted the feat/apply-changes-strategy-overrides branch May 13, 2026 13:07
drr00t pushed a commit to drr00t/GraphRAG-SDK that referenced this pull request May 13, 2026
Cuts a release that includes the apply_changes strategy-forwarding
fix merged in FalkorDB#250. v1.1.0 silently dropped per-call ``loader``/
``chunker``/``extractor``/``resolver`` overrides at the apply_changes
boundary; v1.1.1 forwards them through to the inner ingest/update calls.
Default behaviour is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>