12 Apr 20:43

danielaskdd

5a2694e

v1.4.14 Latest

What's New

feat(setup): support Atlas Local Docker for Mongo vector storage by @danielaskdd in #2925
perf: batch graph operations in ainsert_custom_kg for large-scale imports by @nszhsl in #2910
examples: add AG2 multi-agent demo with LightRAG retrieval by @faridun-ag2 in #2867

What's Fixed

fix: remove redundant file_path_placeholder lookup in _merge_edges_then_upsert by @jwchmodx in #2877
chore: remove dead config.ini / configparser code by @jwchmodx in #2887
chore: remove dead OLLAMA_NUM_CTX / args.ollama_num_ctx assignment by @jwchmodx in #2888
fix(webui): resolve all bun run lint errors in lightrag_webui by @danielaskdd in #2891
chore(webui): migrate ESLint stylistic plugin by @danielaskdd in #2893
docs(readme): restructure documentation and consolidate core api by @danielaskdd in #2896
fix pipeline status history trimming by @danielaskdd in #2897
fix: Corrected exception handling for LLM API timeouts where the subclass incorrectly passes keyword args by @hillct in #2902
Improve LLM API failure diagnostics by @danielaskdd in #2903
docs: deprecate config.ini in documentation by @danielaskdd in #2905
fix(auth): prevent JWT algorithm confusion attack (GHSA-8ffj-4hx4-9pgf) by @danielaskdd in #2907
fix(api): add missing metadata field in /query/data error response by @lawrence3699 in #2923
fix(utils): prevent remove_think_tags from truncating responses with embedded tags by @sjhddh in #2900
fix(opensearch): ensure consistency via lazy index refresh and real-time edge lookups by @danielaskdd in #2926
fix(kg): correct omission of isolated nodes in get_knowledge_graph during full graph retrieval by @danielaskdd in #2928

New Contributors

@jwchmodx made their first contribution in #2877
@hillct made their first contribution in #2902
@faridun-ag2 made their first contribution in #2867
@lawrence3699 made their first contribution in #2923
@sjhddh made their first contribution in #2900
@nszhsl made their first contribution in #2910

Full Changelog: v1.4.13...v1.4.14

Contributors

hillct, nszhsl, and 5 other contributors

02 Apr 17:00

danielaskdd

v1.4.13

f096487

v1.4.13

What's New

Feat: add make dev bootstrap target by @danielaskdd in #2870
Feat: Add PostgreSQL performance timing instrumentation by @danielaskdd in #2855

What's Changed

perf(storage): add cooperative yielding to prevent event loop blocking by @danielaskdd in #2847
fix: sanitize entity_type in Memgraph upsert_node to prevent Cypher injection (CWE-89) by @sebastiondev in #2849
perf(doc-status): add get_docs_by_statuses to all backends and fix PG pool/pagination bugs by @danielaskdd in #2853
Fix setup .env regeneration for preserved custom variables by @danielaskdd in #2854
Fix missing file_path in entity merge upserts by @danielaskdd in #2857
fix(memgraph): preserve start node in knowledge graph query by @danielaskdd in #2868
fix(auth): reject default JWT secret when AUTH_ACCOUNTS is configured by @danielaskdd in #2869
fix(postgres): handle quoted AGE entity ids in edge retrieval by @danielaskdd in #2872
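
Cooperative yielding, mentioned in #2847 above, is a general asyncio technique: a long loop periodically awaits `asyncio.sleep(0)` so other tasks can run. The following is a minimal sketch of the idea, not LightRAG's storage code; `process_records` and the `yield_every` parameter are hypothetical names.

```python
import asyncio


async def process_records(records: list[int], yield_every: int = 100) -> int:
    """Sum records, handing control back to the event loop periodically."""
    total = 0
    for i, value in enumerate(records, start=1):
        total += value  # stand-in for per-record work that never awaits
        if i % yield_every == 0:
            # sleep(0) suspends this coroutine for one loop iteration so
            # other ready tasks get a turn.
            await asyncio.sleep(0)
    return total


async def main() -> tuple[int, int]:
    ticks = 0

    async def heartbeat() -> None:
        # A competing task; it only advances when the loop is not blocked.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0)

    hb = asyncio.create_task(heartbeat())
    total = await process_records(list(range(1000)), yield_every=100)
    hb.cancel()
    return total, ticks
```

Without the periodic `await`, the heartbeat task would not run at all until the loop over the records had finished.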

New Contributors

@sebastiondev made their first contribution in #2849

Full Changelog: v1.4.12...v1.4.13

Contributors

danielaskdd and sebastiondev

27 Mar 05:52

danielaskdd

v1.4.12

ffd9114

v1.4.12

Hot Fixes

Optimized Postgres vector DB upsert performance by increasing the batch size to 200 records per operation.
Resolved an issue opening PDF files protected by permission-only restrictions.
Fixed pipeline cancellation causing file names to be incorrectly changed to 'unknown_source'.
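
The batch-size change in the first hot fix can be pictured with a small, generic batching helper. This is a sketch under the assumption that records arrive as an iterable of dicts; `batched` is a hypothetical helper, not a LightRAG function, and only the figure of 200 comes from the note above.

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def batched(items: Iterable[dict], batch_size: int = 200) -> Iterator[list[dict]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(items)
    # islice pulls up to batch_size items; an empty list ends the loop.
    while batch := list(islice(it, batch_size)):
        yield batch


# 450 records are split into batches of 200, 200, and 50.
records = [{"id": i} for i in range(450)]
sizes = [len(batch) for batch in batched(records)]
```

Each yielded list would then be handed to a single upsert call, so fewer round-trips are needed than with smaller batches.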

What's Changed

feat(auth): add bcrypt-prefixed password hashing support by @danielaskdd in #2813
⚡ Bolt: Optimize text sanitization in utils.py by @danielaskdd in #2814
Fix loginToServer content type and missing grant_type by @danielaskdd in #2821
Postgres/pgvector backend now allows indexing large embeddings with HALFVEC by @Daggle24 in #2663
fix: make document deletion retry-safe by @danielaskdd in #2826
fix(pdf): handle permission-only encrypted PDFs without password by @danielaskdd in #2827
Add driver info to AsyncMongoClient instantiation by @alexbevi in #2834

New Contributors

@Daggle24 made their first contribution in #2663
@alexbevi made their first contribution in #2834

Full Changelog: v1.4.11...v1.4.12rc1

Contributors

alexbevi, danielaskdd, and Daggle24

20 Mar 08:41

danielaskdd

v1.4.11

6d120e7

v1.4.11

Important Notes

Integrated OpenSearch as a unified storage backend, providing comprehensive support for all four LightRAG storage types: KV, Vector, Graph, and DocStatus.
Introduced an interactive setup wizard to streamline configuration, replacing manual .env file editing. Local deployment of embedding, reranking, and storage backends via Docker Compose is now supported. For further details, please refer to the Interactive Setup Guide.

What's New

Add OpenSearch as unified storage backend by @LantaoJin in #2739
Add OpenSearchKVStorage support to LLM cache tools by @danielaskdd in #2790
Add Makefile for quick deployment by @mlimarenko in #2548
Refactor(Makefile): split monolithic wizard into modular env-base/storage/server targets by @danielaskdd in #2763
Add OpenSearch storage configuration support to the interactive setup wizard by @LantaoJin in #2797
perf(postgres): optimize KV storage upsert using executemany by @wkpark in #2742

What's Fixed

Fix Qdrant large upsert payload failures with bounded batching by @danielaskdd in #2740
build(deps): bump the github-actions group with 2 updates by @dependabot[bot] in #2737
perf: use deque for BFS queue in get_knowledge_subgraph() by @giulio-leone in #2725
perf: batch pre-compute query embeddings to eliminate sequential API round-trips by @errajibadr in #2729
fix: reduce FaissVectorDBStorage meta.json file size by excluding vectors by @Br1an67 in #2733
Enhance current MilvusVectorDBStorage with parameterized configuration by @hanlianlu in #2672
fix: preserve failed-doc chunk metadata for reliable deletion cleanup by @danielaskdd in #2749
fix: align zhipu adapter with official thinking and dimensions api by @ChenJiahao1 in #2775
fix(operate,utils): correct typos in log messages and remove dead code by @lailoo in #2781
perf(opensearch): Remove refresh="wait_for" from OpenSearch storage backends by @LantaoJin in #2786
fix(api): sanitize workspace from CLI args and HTTP headers to prevent injection by @danielaskdd in #2792
fix(api): normalize missing document file paths by @danielaskdd in #2793
fix: prevent None file_path from propagating as unknown_source by @he-yufeng in #2796
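
The deque change in #2725 above reflects a standard point about breadth-first search: `deque.popleft()` is O(1), while `list.pop(0)` shifts every remaining element. A minimal sketch follows, assuming the graph is a plain adjacency dict; `bfs_subgraph` is a hypothetical function, not `get_knowledge_subgraph()` itself.

```python
from collections import deque


def bfs_subgraph(
    adjacency: dict[str, list[str]], start: str, max_depth: int
) -> list[str]:
    """Return nodes reachable from start within max_depth hops, in BFS order."""
    visited = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()  # O(1), unlike list.pop(0)
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append((neighbor, depth + 1))
    return order


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
two_hops = bfs_subgraph(graph, "A", max_depth=2)
```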

New Contributors

@giulio-leone made their first contribution in #2725
@errajibadr made their first contribution in #2729
@Br1an67 made their first contribution in #2733
@hanlianlu made their first contribution in #2672
@LantaoJin made their first contribution in #2739
@ChenJiahao1 made their first contribution in #2775
@lailoo made their first contribution in #2781
@he-yufeng made their first contribution in #2796

Full Changelog: v1.4.10...v1.4.11

Contributors

wkpark, LantaoJin, and 10 other contributors

13 Mar 06:19

danielaskdd

v1.4.11rc2

7d53da4

v1.4.11rc2

What's New

Add Makefile for quick deployment by @mlimarenko in #2548
Refactor(Makefile): split monolithic wizard into modular env-base/storage/server targets by @danielaskdd in #2763

For detailed information about the setup wizard, please refer to InteractiveSetup.md.

What's Changed

Fix Qdrant large upsert payload failures with bounded batching by @danielaskdd in #2740
perf: use deque for BFS queue in get_knowledge_subgraph() by @giulio-leone in #2725
perf: batch pre-compute query embeddings to eliminate sequential API round-trips by @errajibadr in #2729
fix: reduce FaissVectorDBStorage meta.json file size by excluding vectors by @Br1an67 in #2733
Enhance current MilvusVectorDBStorage with parameterized configuration by @hanlianlu in #2672
fix: preserve failed-doc chunk metadata for reliable deletion cleanup by @danielaskdd in #2749
build(deps-dev): bump the ui-components group in /lightrag_webui with 2 updates by @dependabot[bot] in #2750
build(deps): bump the frontend-minor-patch group in /lightrag_webui with 3 updates by @dependabot[bot] in #2751
build(deps): bump the github-actions group with 4 updates by @dependabot[bot] in #2759

New Contributors

@giulio-leone made their first contribution in #2725
@errajibadr made their first contribution in #2729
@Br1an67 made their first contribution in #2733
@hanlianlu made their first contribution in #2672

Full Changelog: v1.4.10...v1.4.11rc2

Contributors

giulio-leone, hanlianlu, and 5 other contributors

27 Feb 07:39

danielaskdd

v1.4.10

6286eb7

v1.4.10

What's New

feat: Add POSTGRES_ENABLE_VECTOR option to conditionally disable pgvector extension by @StoreksFeed in #2683
Add i18n support for Vietnamese by @zAcherttp in #2708

What's Changed

Fix: Content Duplicate Detection for Document Upload Now Trackable by @danielaskdd in #2591
Add Claude Code GitHub Workflow by @danielaskdd in #2601
Fix/anthropic api compatibility by @skogsbaeck in #2603
add support for C/C++ header files by @Mjemec in #2614
Add LightRAG workspace management demo script by @vishvaRam in #2615
Enhance README with usage example for workspaces by @vishvaRam in #2618
feat(api): Add async streaming file upload with configurable size limit by @danielaskdd in #2622
Update installation instructions in README by @Krytos in #2624
Update Litewrite Link by @LarFii in #2628
fix: Add default value for max_file_paths to prevent TypeError by @danielaskdd in #2641
Fix: Add MAX_EXTRACT_INPUT_TOKENS to prevent gleaning context overflow (#2472) by @Odin233 in #2630
fix: correct typos 'seperator', 'descpriton', and 'seperate' by @thecaptain789 in #2685
Validate description fields in graph CRUD paths by @danielaskdd in #2706
fix: pass embedding_dim to Azure OpenAI embedding API by @danielaskdd in #2721
fix: use WindowsSelectorEventLoopPolicy on Windows to fix server port… by @Sampriti2803 in #2704
fix: sanitize comma-separated entity types to prevent Neo4j CypherSyntaxError by @danielaskdd in #2722
fix(postgres): make PGVectorStorage table/index creation idempotent (fixes #2702) by @danielaskdd in #2723
fix(webui): make build runtime-agnostic by fixing Bun-only imports in… by @Pranavh-2004 in #2703
fix(webui): wrap ReactMarkdown with div to fix className prop crash in dev mode by @danielaskdd in #2724
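
Several fixes above deal with entity types that are unsafe to place in a graph query, such as a comma-separated value like "person, organization". One possible way to normalize such a value is sketched below. The rules (keep the first value, allow only letters, digits, and underscores, fall back to a default) and the name `sanitize_entity_type` are assumptions made for illustration, not the logic of #2722.

```python
import re

# Anything outside this set is replaced before the value is used as a label.
_LABEL_UNSAFE = re.compile(r"[^A-Za-z0-9_]")


def sanitize_entity_type(raw: str, default: str = "UNKNOWN") -> str:
    """Reduce an LLM-produced entity type to a single conservative label."""
    # Keep only the first comma-separated value.
    first = raw.split(",", 1)[0].strip()
    # Replace unsafe characters, then trim the underscores left at the ends.
    cleaned = _LABEL_UNSAFE.sub("_", first).strip("_")
    if not cleaned:
        return default
    # Make sure the label starts with a letter; the "T_" prefix is arbitrary.
    if cleaned[0].isdigit():
        cleaned = f"T_{cleaned}"
    return cleaned
```

An allow-list like this is deliberately stricter than escaping: it also neutralizes backticks, quotes, and parentheses that could otherwise end up inside the query text.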

New Contributors

@skogsbaeck made their first contribution in #2603
@Mjemec made their first contribution in #2614
@Krytos made their first contribution in #2624
@Odin233 made their first contribution in #2630
@thecaptain789 made their first contribution in #2685
@StoreksFeed made their first contribution in #2683
@Sampriti2803 made their first contribution in #2704
@zAcherttp made their first contribution in #2708
@Pranavh-2004 made their first contribution in #2703

Full Changelog: v1.4.9.11...v1.4.10

Contributors

Krytos, skogsbaeck, and 10 other contributors

15 Jan 18:27

danielaskdd

v1.4.9.11

411b1ce

v1.4.9.11

Hot Fixes

Fix OpenAI LLM binding options not loaded from environment variables by @danielaskdd in #2585

What's New

feat(gemini): Add Vertex AI support for Gemini LLM binding by @danielaskdd in #2529
refact(gemini): Migrate Gemini LLM to native async Google GenAI client by @danielaskdd in #2531
Refact: Change DOCX extraction to use HTML tags for whitespace by @danielaskdd in #2550
feat: add Korean localization by @jhchoi1182 in #2571
Add support for mdx file type by @coldfire-x in #2566
Add i18n support for German, Ukrainian, Russian, and Japanese languages by @mlimarenko in #2547

What's Fixed

docs: fix the simple program rag init function return value in README.md by @Peefy in #2532
docs: fix the simple program rag init function return value in README-zh.md by @Peefy in #2534
feat: Implement WebUI Token Auto-Renewal (Sliding Window Expiration) by @danielaskdd in #2543
Fixes the Gemini integration example in the README by @vishvaRam in #2537
Add Gemini demo for LightRAG by @vishvaRam in #2538
Add LightRAG demo with PostgreSQL and Gemini integration by @vishvaRam in #2556
Update PostgreSQL demo script reference in README.md by @vishvaRam in #2557
Fix: Enhance PostgreSQL Reconnection Tolerance for HA Deployments by @danielaskdd in #2562
Add NEO4J_DATABASE variable to README by @vishvaRam in #2578
Bump the frontend-minor-patch group in /lightrag_webui with 2 updates by @dependabot[bot] in #2577
Add LightRAG demo script with vLLM integration by @vishvaRam in #2582
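
The token auto-renewal item above (#2543) describes sliding-window expiration. The general idea is that an active session has its expiry pushed forward once enough of its lifetime has elapsed. The sketch below is one way to express that; the `Session` type, the one-hour lifetime, and the 50% renewal threshold are all assumptions for illustration, not the WebUI's actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    user: str
    expires_at: float  # seconds since an arbitrary epoch


def renew_if_needed(
    session: Session,
    now: float,
    lifetime: float = 3600.0,
    renew_threshold: float = 0.5,
) -> Session:
    """Return the same session, or a renewed one if it is close to expiring."""
    if now >= session.expires_at:
        raise PermissionError("token expired")
    remaining = session.expires_at - now
    # Renew only when less than the threshold share of the lifetime is left,
    # so that not every request has to issue a new token.
    if remaining < lifetime * renew_threshold:
        return Session(session.user, now + lifetime)
    return session
```

An idle client still expires on schedule, because renewal only happens when a request arrives before `expires_at`.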

New Contributors

@Peefy made their first contribution in #2532
@vishvaRam made their first contribution in #2537
@mlimarenko made their first contribution in #2547
@jhchoi1182 made their first contribution in #2571
@coldfire-x made their first contribution in #2566

Full Changelog: v1.4.9.10...v1.4.9.11

Contributors

coldfire-x, Peefy, and 5 other contributors

23 Dec 01:10

danielaskdd

v1.4.9.10

8c8186a

v1.4.9.10

What's Changed

Hot Fix AttributeError in Neo4JStorage and MemgraphStorage when using storage specified workspace env var by @danielaskdd in #2526

Full Changelog: v1.4.9.9...v1.4.9.10

Contributors

danielaskdd

22 Dec 17:49

danielaskdd

v1.4.9.9

dccf1ef

v1.4.9.9

Release Note V1.4.9.9

Important Notes

Added workspace isolation for pipeline status and in-memory storage: multiple LightRAG instances with distinct workspaces can now be created simultaneously, a significant step toward seamless workspace switching within a single LightRAG server.
Added workspace vector data isolation by model name and dimension for PostgreSQL and Qdrant: previously, LightRAG used a single collection/table for different embedding models and dimensions, which caused dimension-mismatch crashes or data pollution in multi-workspace deployments.
Dimension selection is now supported for OpenAI and Gemini embedding models through a new environment variable: EMBEDDING_SEND_DIM.
Added LLM cache migration and LLM query cache cleanup tools that work across different KV storage backends, enabling users to switch storage backends without losing cached extraction and summary data.
Enhanced DOCX extraction with table content support.
Enhanced XLSX extraction with proper handling of tab and newline characters within cells.
Fixed a critical security vulnerability in React Server Components: #2494
Added automatic text truncation support for embedding functions: the OpenAI embedding function now respects the max_token_size value in EmbeddingFunc and automatically truncates input text to prevent API errors caused by texts exceeding model token limits.
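
The truncation behaviour in the last note can be pictured roughly as follows. This sketch assumes a whitespace tokenizer purely to stay self-contained; a real embedding function would count tokens with the model's own tokenizer, and `truncate_to_token_limit` is a hypothetical helper rather than part of EmbeddingFunc.

```python
def truncate_to_token_limit(text: str, max_token_size: int) -> str:
    """Cut text to at most max_token_size whitespace-separated tokens."""
    if max_token_size < 1:
        raise ValueError("max_token_size must be at least 1")
    tokens = text.split()
    if len(tokens) <= max_token_size:
        return text  # already within the limit; leave untouched
    return " ".join(tokens[:max_token_size])


def truncate_batch(texts: list[str], max_token_size: int) -> list[str]:
    # Applied to every input before it is sent to the embedding API.
    return [truncate_to_token_limit(t, max_token_size) for t in texts]
```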

What's Breaking (for LightRAG Core integration only)

Renamed parameters of the chunking function: if you call the chunking function from your own code and pass parameters by name, update your code to match the new signature:

def chunking_by_token_size(
    tokenizer: Tokenizer,
    content: str,
    split_by_character: str | None = None,
    split_by_character_only: bool = False,
    chunk_overlap_token_size: int = 100,
    chunk_token_size: int = 1200,
) -> list[dict[str, Any]]:
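
To make the roles of chunk_token_size and chunk_overlap_token_size concrete, here is a simplified sliding-window sketch. It works on a pre-tokenized list instead of a Tokenizer, ignores the split_by_character options, and the function name `chunk_tokens` and the result keys are illustrative assumptions, so it should not be read as LightRAG's implementation.

```python
def chunk_tokens(
    tokens: list[str],
    chunk_token_size: int = 1200,
    chunk_overlap_token_size: int = 100,
) -> list[dict]:
    """Split tokens into windows of chunk_token_size that overlap by
    chunk_overlap_token_size tokens."""
    if not 0 <= chunk_overlap_token_size < chunk_token_size:
        raise ValueError("overlap must be non-negative and smaller than chunk size")
    # Each new window starts this many tokens after the previous one.
    step = chunk_token_size - chunk_overlap_token_size
    chunks = []
    for index, start in enumerate(range(0, len(tokens), step)):
        window = tokens[start : start + chunk_token_size]
        chunks.append(
            {
                "tokens": len(window),
                "content": " ".join(window),
                "chunk_order_index": index,
            }
        )
    return chunks


# Ten tokens, windows of four, overlapping by one.
demo = chunk_tokens([str(i) for i in range(10)], chunk_token_size=4, chunk_overlap_token_size=1)
```

Passing both sizes by keyword, as in the demo call, is exactly the calling style affected by the rename.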

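Inject an embedding_func with model_name into LightRAG using wrap_embedding_func_with_attrs:

import numpy as np
from openai import APIConnectionError, APITimeoutError, AsyncAzureOpenAI, RateLimitError
from tenacity import (
    retry,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential,
)

from lightrag import LightRAG
from lightrag.utils import wrap_embedding_func_with_attrs

# The AZURE_* constants, WORKING_DIR, and llm_model_func are assumed to be
# defined elsewhere in your application.

@wrap_embedding_func_with_attrs(
    embedding_dim=1536, max_token_size=8192, model_name="text-embedding-3-small"
)
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=60),
    retry=(
        retry_if_exception_type(RateLimitError)
        | retry_if_exception_type(APIConnectionError)
        | retry_if_exception_type(APITimeoutError)
    ),
)
async def embedding_func(texts: list[str]) -> np.ndarray:
    # Use the async client so the request does not block the event loop.
    client = AsyncAzureOpenAI(
        api_key=AZURE_OPENAI_API_KEY,
        api_version=AZURE_EMBEDDING_API_VERSION,
        azure_endpoint=AZURE_OPENAI_ENDPOINT,
    )
    response = await client.embeddings.create(
        model=AZURE_EMBEDDING_DEPLOYMENT, input=texts
    )

    # One embedding vector per input text, in the same order.
    embeddings = [item.embedding for item in response.data]
    return np.array(embeddings)

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    embedding_func=embedding_func,
)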

To ensure a seamless transition, legacy code that injects an embedding_func without model_name will continue to use the original non-suffixed vector tables.

What's New

Feat: Add Chain of Thought Support for Gemini LLM by @danielaskdd in #2326
Feat: Add Optional Embedding Dimension Control with OpenAI API by @danielaskdd in #2328
Feat: Add Gemini Embedding Support to LightRAG by @danielaskdd in #2329
Feat: Add LLM Cache Migration Tool by @danielaskdd in #2330
Feat: Add LLM Query Cache Cleanup Tool by @danielaskdd in #2335
Support async chunking func to improve processing performance when a heavy chunking_func is passed in by user by @tongda in #2336
Add ollama cloud support by @LacombeLouis in #2348
Feat: Add Workspace Isolation for Pipeline Status and In-memory Storage by @danielaskdd in #2369
feat: add vchordrq vector index support for PostgreSQL by @wmsnp in #2378
Feat: Enhanced DOCX Extraction with Table Content Support by @danielaskdd in #2383
Feat: Enhance XLSX Extraction by Adding Separators and Escape Special Characters by @danielaskdd in #2386
Optimize for OpenAI Prompt Caching: Restructure entity extraction pro… by @Ghazi-raad in #2426
feat: Vector Storage Model Isolation with Automatic Migration by @BukeLy in #2391
feat: Implement Vector Database Model Isolation and Auto-Migration by @danielaskdd in #2513
feat: Add Automatic Text Truncation Support for Embedding Functions by @danielaskdd in #2523

What's Changed

Fix: Remove Duplicate Entity/Relation Tracking Deletion in adelete_by_doc_id by @danielaskdd in #2322
Fix spelling errors in the "使用PostgreSQL存储" section of README-zh.md by @huangbhan in #2327
Add dimensions parameter support to openai_embed() by @yrangana in #2323
Fix Gemini driver retry mechanism by @danielaskdd in #2331
HotFix: Restore OpenAI Streaming Response & Refactor keyword_extraction Parameter by @danielaskdd in #2334
Refactor: Migrate PDF processing dependency from pypdf2 to actively maintained pypdf by @danielaskdd in #2338
Fix: Prevent UnicodeEncodeError in JSON storage operations by @danielaskdd in #2344
Remove deprecated response_type parameter from query settings UI by @danielaskdd in #2345
Refactor: Optimize write_json for Memory Efficiency and Performance by @danielaskdd in #2346
Refact: Remove blocking dependency installation from document upload handlers by @danielaskdd in #2350
Refact: Implement Lazy Configuration Initialization for API Server by @danielaskdd in #2351
Refact: Enhance DOCLING integration with lazy loading and macOS safeguards by @danielaskdd in #2352
Fix: Robust error handling for async database operations in graph storage by @danielaskdd in #2356
Update the value corresponding to the extracted entity relationship keywords by @sleeepyin in #2358
Add macOS fork safety check for Gunicorn multi-worker mode by @danielaskdd in #2360
Refact: Add Embedding Token Limit Configuration and Improve Error Handling by @danielaskdd in #2359
Refact: Add Embedding Dimension Validation in EmbeddingFunc by @danielaskdd in #2368
test: Convert test_workspace_isolation.py to pytest style by @BukeLy in #2371
refactor(chunking): rename params and improve docstring for chunking by @EightyOliveira in #2379
Fix: Add chunk token limit validation with detailed error reporting by @danielaskdd in #2389
Fix: Remove redundant exception logging to eliminate pytest shutdown errors by @danielaskdd in #2390
issue-2394: use deployment variable instead of model for embeddings API call by @Amrit75 in #2395
Refactor: Centralize keyword_extraction parameter handling in OpenAI LLM implementations by @danielaskdd in #2401
Refact: Consolidate Azure OpenAI and OpenAI implementations by @danielaskdd in #2403
Update README.md by @chaohuang-ai in #2408
Update README.md by @chaohuang-ai in #2409
feat: create copilot-setup-steps.yml by @netbrah in #2410
Fix: Add Comprehensive Retry Mechanism for Neo4j Storage Operations by @danielaskdd in #2417
Refact: Allow API Server to Start Without Built WebUI Assets by @danielaskdd in #2418
fix:exception handling order error by @EightyOliveira in #2421
Doc: Update README examples to prevent double-wrapping of embedding functions by @danielaskdd in #2432
Fix: Add configurable model support for Jina embedding by @danielaskdd in #2433
Fix typos discovered by codespell by @cclauss in #2434
Update README.md by @chaohuang-ai in #2439
Fix KaTeX chemistry formula rendering (\ce command) not working by @danielaskdd in #2443
fix(postgres): Add CASCADE to AGE extension creation for automatic dependency resolution by @danielaskdd in #2446
Add Python 3.13 and 3.14 to the testing by @cclauss in #2436
Keep GitHub Actions up to date with GitHub's Dependabot by @cclauss in #2435
chore: optimize Dependabot configuration with dependency grouping and PR limits by @danielaskdd in #2447
...

Contributors

mccahill, tongda, and 15 other contributors

06 Nov 13:51

danielaskdd

v1.4.9.8

5bcd292

v1.4.9.8

What's New

Feat: Add PDF Decryption Support for Password-Protected Files by @danielaskdd in #2296
Feat: Add optional Langfuse observability integration by @anouar-bm in #2298
Feat: Add RAGAS evaluation framework for RAG quality assessment by @anouar-bm in #2297
Feat: Add native gemini LLM support by @Humphryshikunzi in #2305

What's Changed

Refact: Auto-refresh of Popular Labels When Pipeline Completes by @danielaskdd in #2291
Fix empty context validation bug and improve naming consistency in query context building by @danielaskdd in #2295
Refact: Enhanced RAG Evaluation CLI with Two-Stage Pipeline and Improved UX by @danielaskdd in #2311
Refact: Separate Configuration of RAGAS for LLM and Embeddings by @danielaskdd in #2314
Refactor: Remove Deprecated Chunk-Based Query Methods and Improve Graph Unit Test by @danielaskdd in #2319
Fix node retrieval fail with special characters in IDs for Postgres AGE GraphStorage by @danielaskdd in #2320
Fix performance bottleneck in document deletion by @danielaskdd in #2321

New Contributors

@anouar-bm made their first contribution in #2298
@Humphryshikunzi made their first contribution in #2305

Full Changelog: v1.4.9.7...v1.4.9.8

Contributors

danielaskdd, Humphryshikunzi, and anouar-bm

Releases: HKUDS/LightRAG
