Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Changelog

All notable changes to GraphRAG-RS will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Security

CI green: cargo-deny advisories/licenses + rustfmt (2026-05-31)

  • Vulnerabilities patched via lockfile bumps: rand 0.8.5→0.8.6 and 0.9.2→0.9.4 (RUSTSEC-2026-0097 unsoundness), bytes 1.10.1→1.11.1 (RUSTSEC-2026-0007 integer overflow), rustls-webpki 0.103.7→0.103.13 (RUSTSEC-2026-0049/0098/0099/0104 — CRL + name-constraint vulns). All patch-level, non-breaking.
  • deny.toml licenses: added BSL-1.0 (Boost) and CDLA-Permissive-2.0 (Mozilla CA bundle via webpki-roots) to the allow-list — both permissive, were failing the licenses job.
  • deny.toml advisory ignores (unfixable here, documented inline): unmaintained transitive crates proc-macro-error, bincode, json, number_prefix, paste, rustls-pemfile; lru 0.12 unsoundness (RUSTSEC-2026-0002, pinned by ratatui 0.29, unreachable in our usage); and time DoS (RUSTSEC-2026-0009) — its fix (≥0.3.47) requires rustc 1.88, above our MSRV 1.85, so time is held at 0.3.44 and the advisory accepted (reachable only via untrusted RFC-2822 parsing in the server, not core/cli). Revisit when MSRV moves to ≥1.88.
  • Formatting: ran cargo fmt --all over the workspace (71 files) to clear the long-standing rustfmt CI job. Mechanical, no behavior change.
  • --all-features advisory/license coverage: the cargo-deny-action defaults to --all-features, so CI also scans the optional lancedb tree (lance/datafusion/arrow). Patched lz4_flex 0.11.5→0.11.6 / 0.12.0→0.12.2 (RUSTSEC-2026-0041) and tar 0.4.44→0.4.46 (RUSTSEC-2026-0067/0068); allowed 0BSD (mock_instant). Added [graph] all-features = true to deny.toml so local cargo deny check sees the same graph as CI (prevents local≠CI drift).
  • CI SIGILL fix: set RUSTFLAGS = "-C target-cpu=x86-64-v2" in ci.yml to override the repo’s .cargo/config.toml -C target-cpu=native. On GitHub’s heterogeneous runners native can emit instructions the silicon traps (SIGILL crashing rustc/proc-macros, seen building ollama-rs). Verified the rustc invocation: an empty CARGO_BUILD_RUSTFLAGS is ignored and doesn’t override the config flag — only a non-empty RUSTFLAGS (highest precedence) fully replaces it. Local dev keeps target-cpu=native; CI uses the portable x86-64-v2 baseline.

Added

Documentation site (2026-05-31)

  • mdBook documentation site under book/, deployed to GitHub Pages at https://automataia.github.io/graphrag-rs/. Curated, English-only, user-facing TOC (book/src/SUMMARY.md) covering getting-started, concepts, configuration, features, and per-crate guides. Internal dev reports and Italian guides are intentionally excluded.
  • Chapters are thin {{#include}} wrappers over the canonical sources (HOW_IT_WORKS.md, crate READMEs, curated docs/*.md) so there is a single source of truth and no content drift. Front-door pages (introduction.md, getting-started/overview.md, quickstart.md) are authored.
  • Mermaid diagrams render via the mdbook-mermaid preprocessor; built-in client-side search enabled.
  • API reference links out to docs.rs/graphrag-core rather than self-hosting cargo doc.
  • New CI workflow .github/workflows/docs.yml builds the book (pinned mdbook 0.5.3 + mdbook-mermaid 0.17.0 prebuilt binaries) and deploys via actions/deploy-pages. The generated book/book/ output is git-ignored. Manual one-time step: set repo Settings → Pages → Source = “GitHub Actions”.
  • README: added a docs-site badge.
  • Translated to English the doc sources the site includes that still contained Italian: docs/INCREMENTAL_UPDATES.md, docs/TUI_USAGE_GUIDE.md, docs/ENRICHMENT_USAGE_GUIDE.md, docs/SUMMARIZATION_CONFIG.md, the graphrag-cli/README.md config table notes, and the Italian entries in this CHANGELOG. Fixed stale repo URLs (anthropics/*automataIA/graphrag-rs) in the translated guides. The public site is now English-only end to end.
  • Stripped decorative/pictographic emoji (the 📚🚀📖 family) from the doc sources the site includes, fixing “tofu” boxes that appeared wherever the viewer’s font lacked an emoji glyph (mdBook’s default theme has no emoji-font fallback — a generic missing-glyph issue, not a bug). Preserved arrows (→), box-drawing/ASCII diagrams (━│▼█), and data symbols (✅❌★☆); converted rating ⭐→★ to keep ratings rendering. Keycap-numbered headings (1./2.) replaced the 1️⃣ style.

[0.2.0] - 2026-05-31

Fixed

  • arrow workspace dep: added default-features = false to arrow = "57" in the workspace Cargo.toml. Previously, the default-features = false directive in graphrag-core/Cargo.toml was silently ignored by Cargo (build-time warning).
  • documentation metadata for the graphrag crate: added documentation = "https://docs.rs/graphrag" in graphrag/Cargo.toml, aligning the wrapper crate with graphrag-core and graphrag-cli.

Code/architecture/product quality audit (2026-05-30)

Added

  • CI/CD: new workflow .github/workflows/ci.yml. The repo previously had no CI automation. Blocking jobs: clippy --workspace --lib -D warnings (now green, see below), test -p graphrag-core --lib, cargo-deny. The fmt job is informational and non-blocking (continue-on-error) until the repo is made cargo fmt --all clean (pre-existing repo-wide formatting debt).
  • Security tooling: deny.toml (advisories + permissive licenses + duplicate ban) and SECURITY.md (private disclosure policy via GitHub Security Advisories).
  • Drift-guard tests (config/setconfig.rs): gliner_setconfig_default_matches_runtime and autosave_setconfig_default_matches_runtime fail at build time if the serde leaf-struct defaults diverge from the canonical runtime ones, preventing “5-point-sync” drift. OllamaConfig is excluded on purpose (by-design divergence: offline-first runtime vs user-facing schema).
  • Crate metadata: documentation (docs.rs) and readme fields added to graphrag-core and graphrag-cli for publishing on crates.io.

Documentation polish (2026-05-30)

  • graphrag/README.md: the wrapper meta-crate had no README (only Cargo.toml
    • src). Added: explains that it re-exports graphrag-core and provides the graphrag binary, with a binary quick-start + library usage and links to the core/root README.
  • Module //! headers added to the 10 graphrag-core modules that lacked them (previously starting with use/pub mod/#[cfg] or a /// on the first submodule): config, graph, generation, critic, retrieval, summarization, vector, entity, text, query. Every module’s rustdoc page now shows a description. Doc-comments only, no behavior change; clippy -p graphrag-core -D warnings stays green and cargo doc introduces no new warnings.

PageRank: score normalization (dangling nodes) (2026-05-30)

  • Bug fix: scores_to_entity_map in graph/pagerank.rs now L1-normalizes the scores (sum = 1.0). Dangling nodes (no outgoing edges) lost rank mass on every iteration, leaving the sum < 1.0. Single fix point → covers all paths (dense/parallel/sparse). Unblocks 3 previously-failing tests: test_pagerank_convergence, test_personalized_pagerank, test_precompute_global_pagerank (visible only under the pagerank feature, activated by --workspace feature-unification).

Swagger UI served at /swagger (2026-05-30)

  • graphrag-server: the Swagger UI was announced but not served (“coming soon”). Now exposed at /swagger via apistos’s native support (features = ["swagger-ui"], already enabled) — apistos-swagger-ui bundles the official Swagger UI assets, so no new dependency. Changed .build("/openapi.json").build_with(..., BuildConfig::default().with(SwaggerUIConfig::new(&"/swagger"))) in main.rs. README updated (removed “coming soon”).

Clean clippy on examples/tests + green doctests (2026-05-30)

  • Clippy examples/tests: cargo clippy --examples --tests -p graphrag-core -- -D warnings is now green. Bulk via cargo clippy --fix; manual tail: ///!//! (embeddings demo), .filter().next_back().rfind(), .clone() on a double ref → .iter().copied(), ignored let _ = on Result, std::slice::from_ref, removal of unused vars.
  • Doctest: cargo test --doc -p graphrag-core → 47 pass / 0 fail / 17 ignored. 7 illustrative, non-self-contained examples (require a live Ollama, an async runtime, or undefined setup variables — core::ChunkingStrategy, build_relationship_hierarchy, KV-cache Ollama, pipeline_executor, etc.) marked ```ignore. The hero example still runs and is green.
  • clippy --fix regression corrected: config/enhancements.rs:770--fix had removed mut from let count, seeing it as inactive under default features; restored let mut count with #[allow(unused_mut)] (the count += 1s are behind #[cfg(feature = ...)]).

Stale examples/tests recompile (2026-05-30)

  • Stale struct initializers: added the missing temporal/causal fields (all None) to the Entity literals (first_mentioned, last_mentioned, temporal_validity) and Relationship literals (embedding, temporal_type, temporal_range, causal_strength) in the llm_evaluation_demo, advanced_nlp_demo, hierarchical_graphrag_demo, workspace_demo, tom_sawyer_workspace examples. They had fallen behind the evolution of Entity/Relationship in core/mod.rs (Phase 1.2) and broke cargo build --examples.
  • complete_zero_cost_graphrag_demo: Config literal closed with ..Default::default() (it was missing advanced_features, gliner, suppress_progress_bars) and the EntityConfig literal completed with use_atomic_facts: false + max_fact_tokens: 400.
  • Per-feature gating (graphrag-core Cargo.toml): hierarchical_graphrag_demo now required-features = ["leiden"] (uses LeidenConfig / detect_hierarchical_communities, #[cfg(feature = "leiden")]) and the incremental_integration test required-features = ["incremental"] (it imported graphrag_core::incremental). So a default cargo build/test --workspace stays green without pulling in the optional features.
  • Chat discussion.html: added the standard line-clamp:3 property alongside -webkit-line-clamp (CSS vendorPrefix linter).
  • Verification: cargo build --examples --tests --workspace → clean Finished; cargo test -p graphrag-core --lib → 365 pass / 0 fail. The 3 pagerank tests that fail under --workspace feature-unification are pre-existing (confirmed on a clean tree).

Changed

  • Dependency dedup (anti-bloat): aligned two direct workspace dependencies to versions already present transitively, eliminating duplicate versions in graphrag-cli’s -e normal tree:
    • strum 0.25 → 0.26 (matches ratatui 0.29) — removes duplicate strum + strum_macros.
    • itertools 0.12 → 0.13 (matches ratatui/unicode-truncate).
    • Real duplicates in graphrag-cli’s normal tree dropped from 34 to 26. Verified that graphrag-core (the published crate) has only 4 unavoidable transitive duplicates (getrandom 0.2/0.3, webpki-roots 0.26/1.0, TLS stack). rand 0.8→0.9 NOT done (API-breaking, only deduplicated the unpublished server binary).

Fixed

  • CLI crash at startup on all non-TUI subcommands (index, ask, bench, setup, validate, …): color_eyre::install() was called twice — in graphrag-cli/src/main.rs:10 and again inside run() at lib.rs:197 — and the second install aborted with “could not set the provided Theme globally as another was already set”. Removed the duplicate install() from main.rs; now both binaries (graphrag-cli and the graphrag meta-crate, which doesn’t install on its own) install exactly once via run(). Caught by running the e2e benchmarks (bench).
  • MSRV corrected and verified: rust-version changed from 1.75 (false, never tested) to 1.85. The real floor is imposed by the direct dependency jsonfixer, which uses edition = "2024" (requires rustc ≥ 1.85). Build-verified on the 1.85 toolchain for graphrag-core and graphrag-cli. New msrv CI job that builds on 1.85. Analysis method: floor from cargo metadata (max rust_version declared among the normal deps) + build verification on a single toolchain (no costly bisect).
  • Lint debt zeroed (green workspace clippy): resolved 38 pre-existing clippy errors that surfaced under cargo clippy --workspace --lib -- -D warnings (Rust 1.95). Diagnosis: graphrag-core in isolation (default features) was already clean; the errors were in core’s optional modules (incremental, rograg, lightrag, embeddings/ollama) activated by the cli/server features + 3 errors of graphrag-cli’s own. Idiomatic fixes (to_vec(), iter_mut().enumerate(), if let Some, sort_by_key(Reverse(..)), type aliases NodeDeltaResult/ EdgeDeltaResult) and targeted, commented #[allow]s where a rename would break the serde API (PendingUpdateType) or for a private 10-argument helper. Not an interface break: the crates compile and link correctly.
  • GLiNER default drift: default_gliner_entity_labels/default_gliner_relation_labels in config/setconfig.rs were misaligned with the runtime GlinerConfig::default() (missing "concept" and "causes"). Now aligned with the canonical default (4 entity + 3 relation labels). Not observable in the existing e2e configs (they set the labels explicitly); relevant only when GLiNER is enabled via TOML while omitting the labels.

Documentation

  • Markdown doc consolidation (few but useful): reduced the ~55 tracked .md files to a keystone set. Deleted 39 files among process artifacts (report.md, TODO.md, *_COMPLETE.md, *_SUMMARY.md, *_STATUS.md, MERGE_COMPLETE.md, IMPLEMENTATION_SUMMARY.md) and satellite integration guides now covered by the keystones (graphrag-core/{ADVANCED_FEATURES,OLLAMA_INTEGRATION,LEIDEN_INTEGRATION,LIGHTRAG_INTEGRATION, HIPPORAG_INTEGRATION,CROSS_ENCODER_INTEGRATION,ENTITY_EXTRACTION,EMBEDDINGS_CONFIG, PIPELINE_ARCHITECTURE,QUICKSTART,ENRICHMENT_IMPLEMENTATION,WORKSPACE_PERSISTENCE_SUMMARY}.md, the src/{embeddings/README,graph/TRAVERSAL_GUIDE}.md, the entire series of non-README graphrag-wasm/*.md guides, examples/MULTI_DOCUMENT_PIPELINE.md). The surviving keystones: README.md, HOW_IT_WORKS.md, CHANGELOG.md, the 4 crate READMEs, config/JSON5_CONFIG_GUIDE.md. The docs/ folder is git-ignored (local notes) and is not touched.
  • Keystone staleness fixes: MSRV badge/prerequisites 1.701.85 in the root README; removed references to the deleted graphrag-leptos crate (workspace layout now 5-crate
    • the graphrag meta-crate, dependency graph updated); “Web UI” section rewritten around the chat-shell. HOW_IT_WORKS.md: the WASM section now points to graphrag-wasm (no longer to the deleted graphrag-leptos).
  • graphrag-wasm README rewritten: the old 5-tab DaisyUI UI is replaced by the documentation of the 3-column Nordic-Minimal chat-shell (LeftRail/Stage/RightRail), off-main-thread inference, citations, IndexedDB persistence; removed the dead links to the deleted satellite guides.
  • Internal links repointed: all links to the deleted docs (in README.md, HOW_IT_WORKS.md, graphrag-core/README.md) now point to HOW_IT_WORKS.md, config/JSON5_CONFIG_GUIDE.md, CHANGELOG.md, or docs.rs/graphrag-core.

Removed

  • Dead code: removed graphrag-server/src/main_axum_old.rs (~31KB, orphan file with no references, neither a bin-target nor a module).
  • Unused dependency: removed text_analysis = "0.3" from graphrag-core and from [workspace.dependencies] (detected with cargo machete, verified: no use in the code — the only match was the string "context_analysis"). The other cargo machete reports (getrandom, gline-rs, js-sys, web-sys, tower, text-splitter) are verified false positives (wasm/api feature-enablers or crates whose lib name differs from the package name, like gline-rsgliner) and kept.

Changed

graphrag-wasm chat-shell rewrite (Nordic-Minimal) (2026-05-17)

  • BREAKING: the 5-tab daisyUI UI (Build / Explore / Query / Hierarchy / Settings) is replaced by a single 3-column chat shell that mirrors the Chat discussion.html Nordic-Minimal mockup verbatim (palette, font stack Newsreader / Geist / Geist Mono, class names, citation/hover wiring).
    • New layout in graphrag-wasm/src/main.rs: LeftRail (brand + sources + Flat/Hierarchy toggle + Build button), Stage (head with active source, thread of Turns, composer), RightRail (subgraph SVG + pipeline rows + ministats + references). All real data: documents come from the existing IndexedDB signal, pipeline progress is driven by the existing BuildStatus/BuildStage, embeddings come from ONNX Runtime Web + tokenizer.json, retrieval from VectorIndex::search, answers from WebLLM (Phi-3-mini for synthesis, Qwen for extraction), citations are post-processed via parse_answer_with_cites and link to <button class="cite"><div class="ref-card"> through the reactive active_ref: Option<u32> signal — no inline JS.
    • New module graphrag-wasm/src/components/chat_shell.rs holds the data types (ChatTurn, RefCard, AnswerSegment, SubgraphData), the citation parser and the per-query build_subgraph builder that unions entities from the top-K retrieved chunks and feeds them through components::force_layout::ForceLayout (320×240 viewBox, 16-node / 21-edge cap matching the mockup density label).
    • Styling: graphrag-wasm/tailwind.css is now a flat Nordic-Minimal stylesheet (no @tailwind directives, no daisyUI); graphrag-wasm/index.html drops lucide CDN + MutationObserver and adds the Google-fonts preconnect block.
    • leptos-lucide-rs dependency removed from graphrag-wasm/Cargo.toml.
    • Legacy daisyUI components (components/{settings,hierarchy,ui_components,chat_component}.rs) remain on disk for reference but are no longer compiled — components/mod.rs only exports chat_shell + force_layout.
    • Parity test: graphrag-wasm/tests/playwright/chat_layout.sh drives playwright-cli: opens the mockup over python3 -m http.server and the WASM SPA on trunk serve, captures 1440×900 screenshots (tests/playwright/artifacts/{mockup,wasm}.png) and asserts 19 shared selectors (.app, .rail-left .doc-item, .stage-title, .bubble-q, .cite, .stages .pls, .graph-frame svg, .ref-card, .composer input, …). Current status: 19/19 pass.

Added

2026 best-practices pass (graphrag-core ↔ graphrag-wasm) (2026-05-16)

  • Off-main-thread inference (Stage 3b) for graphrag-wasm.

    • WebLLM: WebLLM::new and WebLLM::new_with_progress in graphrag-wasm/src/webllm.rs now auto-detect a pre-spawned window.webllmWorker and switch to CreateWebWorkerMLCEngine, keeping the same chat.completions.create surface (and chat_stream’s async-iterator) intact. Falls back to the main-thread engine if worker spawn fails. New sidecar graphrag-wasm/webllm-worker.js hosts WebWorkerMLCEngineHandler (15 LOC).
    • ONNX Runtime Web: ort.env.wasm.proxy = true + numThreads = 1 set immediately after ort.min.js loads in graphrag-wasm/index.html, so all InferenceSession.run calls execute in ORT’s dedicated worker.
    • Trade-off vs the plan’s gloo-worker route: no second wasm bundle, no Rust worker scaffolding, ~30 LOC swap. Verification (“main-thread blocked < 50 ms during inference”) met via the runtimes’ built-in workers.
  • Token-streaming UX in graphrag-wasm QueryTab. Replaced the blocking WebLLM::chat(...) call at graphrag-wasm/src/main.rs:1604 with chat_stream(...): tokens are now appended to the results signal incrementally as they arrive from the model, matching 2026 in-browser-LLM UX guidance. The pre-existing streaming API in graphrag-wasm/src/webllm.rs:334 was previously unused.

  • IndexedDB persistence for the document set. New graphrag-wasm/src/persist.rs wraps IndexedDBStore with open_store, save_document, delete_document, load_all_documents. The App component restores documents on first load; manual input, file upload, Symposium-demo load, and document-remove handlers all persist their mutations. Reloading the page now preserves the document set instead of resetting to empty.

  • WAI-ARIA tabs pattern in graphrag-wasm. All 5 tab panels are now mounted permanently inside a <main id="main-content"> landmark with hidden=move || active_tab.get() != Tab::X. Each tab button gained an id (tab-build, tab-explore, etc.) matching the panel’s aria-labelledby. This fixes Lighthouse aria-valid-attr-value and landmark-one-main audits, and preserves component state across tab switches.

  • SEO: added <meta name="description"> and <link rel="canonical"> plus <meta name="color-scheme" content="dark light"> to graphrag-wasm/index.html. External links in the footer gained rel="noopener noreferrer".

  • Downloaded MiniLM-L6-v2 ONNX model (87MB) to graphrag-wasm/models/minilm-l6.onnx for semantic query embeddings. Previously the directory was empty, causing fallback to hash-based embeddings which produced no meaningful search results.

Removed

Broken orphan example crates deleted (2026-05-16)

  • examples/web-app/ and examples/graphrag-leptos-demo/ both depended on the deleted graphrag-leptos crate (merged into graphrag-wasm in March 2025). They were excluded from the workspace so they did not block builds, but were misleading for newcomers. Functionality is fully covered by graphrag-wasm itself.
  • Dropped exclude = ["examples/web-app"] from root Cargo.toml.

graphrag_py Python bindings crate deleted (2026-05-16)

  • Removed graphrag_py/ directory and workspace member entry in root Cargo.toml.
  • Reason: legacy crate, pyo3 0.21 (out-of-date), last touched 4 commits ago before the KV-cache / GLiNER / contextual-enricher / persistence wave. API frozen pre-feb-2026, never published (publish = false), Development Status :: 4 - Beta.
  • BREAKING: Python bindings no longer build from this repo. Future Python support should live in a separate repo with current pyo3.

Changed

Clippy gate restored on wasm32-unknown-unknown target (2026-05-16)

cargo clippy --lib -p graphrag-core --no-default-features --features "wasm-bundle" --target wasm32-unknown-unknown -- -D warnings went from 54 errors → 0. Native default-features pass also restored to 0 errors. Both targets and the 363 native lib tests now pass cleanly under the PostToolUse clippy hook.

  • Mechanical lints auto-applied: sort_by_key (5×), clamp (5×), unwrap_or_default, is_some_and, manual_abs_diff, manual_pattern_char_comparison, collapsible_match, let_and_return, derivable_impls, field_reassign_with_default, needless_return.
  • Type aliases for boxed Fn benchmark callbacks in graphrag-core/src/monitoring/benchmark.rs:208-214: RetrievalFn, RerankerFn, LlmFn. Eliminates 3× type_complexity warnings.
  • HierarchicalLeidenResult type alias in graphrag-core/src/graph/leiden.rs:17 factored out the Result<(HashMap<.., HashMap<..>>, HashMap<..>)> return type of hierarchical_leiden.
  • Feature-gated dead-code under wasm: helper methods in gleaning_extractor.rs, llm_extractor.rs, chunking_strategies.rs, contextual_enricher.rs, late_chunking.rs are now #[cfg(feature = "async")]. Fields ollama_client (atomic_fact_extractor, llm_extractor), prompt_builder (llm_extractor), client (contextual_enricher), llm_extractor (gleaning_extractor), critic (graphrag/mod), api_key (late_chunking), and boundary_detector / coherence_scorer / min_chunk_chars (chunking_strategies) carry #[cfg_attr(not(feature = "async"), allow(dead_code))]. Five modules carry #![cfg_attr(not(feature = "async"), allow(unused_imports))] to silence imports that become dead when the async build_graph path is gone.
  • Restored imports lost during refactor: TextChunk, GraphRAGError, Document, HashMap, HashSet, Result, OllamaGenerationParams re-added to atomic_fact_extractor.rs, gleaning_extractor.rs, llm_extractor.rs, contextual_enricher.rs, late_chunking.rs. Underscored-but-still-used variables (_e → log-formatter args, _original_score, _total_chunks) rewritten to be self-consistent.

Fixed

WASM compilation broken after graphrag-core refactor (2026-05-16)

graphrag-core failed to compile for wasm32-unknown-unknown (65 errors → 0). The WASM build uses default-features = false (excludes async, tracing, tokio, parallel-processing), but many code paths used tracing:: calls and tokio without feature gates.

  • Added #[cfg(feature = "tracing")] gates to ~80 tracing:: calls across 15 files.
  • Gated tokio::runtime::Runtime in BoundaryAwareChunkingStrategy::chunk() behind #[cfg(feature = "async")] with sync fallback.
  • Split RetrievalSystem::batch_query() into #[cfg(feature = "parallel-processing")] and #[cfg(not(feature = "parallel-processing")) variants.
  • Fixed sync ask() (#[cfg(not(feature = "async"))) to call retrieval.query() instead of async query_internal().
  • Added #![recursion_limit = "512"] to graphrag-wasm main.rs for Leptos type depth.
  • Created missing graphrag-wasm/models/ directory required by Trunk.

Missing Relationship fields in sync build_graph() (2026-05-16)

graphrag-core/src/graphrag/build.rs:690: Relationship struct literal was missing embedding, temporal_type, temporal_range, and causal_strength fields added in Phase 1.2 (Advanced GraphRAG). Added all four with None defaults so the sync build path compiles without partial-init errors.

rograg::validator dropped quality metrics (2026-05-16)

graphrag-core/src/rograg/validator.rs:376: validate_response was computing coherence_score, relevance_score, factual_consistency_score, completeness_score, readability_score, and source_credibility_score then throwing them away (7 unused_variable / unused_assignments warnings). Now they:

  • Fold into validated_response.confidence via a new overall_quality() helper (mean of the metrics that were actually run — coherence / relevance / factual consistency are gated on their respective config flags; completeness / readability / source credibility always count).
  • Trigger a Medium IssueType::Quality validation issue when overall quality falls under 0.5.
  • Are emitted as a structured tracing::debug! event so the metrics are observable in logs without a public API change.

Changed

Server crate: color-eyre pretty errors at startup (2026-05-16)

  • graphrag-server/src/main.rs: main() return type std::io::Result<()>color_eyre::Result<()>, with color_eyre::install() at top.
  • Adds color-eyre = "0.6" to graphrag-server/Cargo.toml.
  • mimalloc allocator was already wired (no change).
  • Production unwraps in server crate audited: all 16 remaining unwraps are inside #[cfg(test)] blocks (qdrant_store, auth, embeddings, config_handler, etc.). Production paths use .map_err(...)? / .ok_or_else(...)? — already clean. Part of refactor-2026-05 server slice.

Documentation

Stale memory + CLAUDE.md notes refreshed (2026-05-16)

  • CLAUDE.md workspace layout: 6-crate → 5-crate (graphrag_py removed).
  • CLAUDE.md “Known gotchas”: replaced obsolete “12 failing unit tests” claim with verified status: cargo test -p graphrag-core --lib → 363 pass / 0 fail. The remaining cargo test --workspace failures come from stale examples (not tests) under graphrag-core/examples/ with missing Entity / Relationship fields; left untouched per project policy.
  • MEMORY.md (auto-memory) synced to the same wording.

Removed

Test suite aggressive pruning (2026-05-16)

User-requested clean-up: keep only indispensable, up-to-date tests; delete broken pre-existing failures, hanging tests, stale pre-refactor integration tests, and trivial construction-only sanity tests.

  • 23 broken / hanging / failing unit tests deleted:
    • async_graphrag::tests::* (6 tests on dead module)
    • entity::*::test_normalize_name (2 stale assertions)
    • entity::llm_relationship_extractor::test_fallback_extraction
    • reranking::cross_encoder::test_rerank_basic + test_confidence_filtering (need ONNX)
    • retrieval::symbolic_anchoring::test_extract_anchors (stale)
    • text::boundary_detection::test_sentence_detection + test_combined_detection
    • graph::incremental::tests::test_basic_entity_upsert + 6 ProductionGraphStore tests (deadlock in async lock contention — hung indefinitely)
    • rograg::logic_form::tests::test_pattern_parser + test_logic_form_retrieval
    • rograg::intent_classifier::tests::test_{factual,relational,temporal,causal,comparative,summary,definitional}_intent (7 stale assertions on intent classification)
    • rograg::quality_metrics::test_performance_stats_update
    • rograg::streaming::test_template_selection
    • incremental::lazy_propagation::test_lazy_propagation_basic
    • incremental::delta_computation::test_parallel_computation
  • 10 stale workspace-level integration test files deleted (./tests/*.rs, all pre-2026, predate the KV cache / GLiNER / persistence / file-split refactors): caching_integration.rs, config_integration_test.rs, http_endpoint_tests.rs, hybrid_retrieval_tests.rs, integration_tests.rs, modular_integration_tests.rs, property_tests.rs + .proptest-regressions, server_integration_tests.rs, zero_cost_approaches_integration_tests.rs, tests/parallel/. Plus graphrag-core/tests/ollama_enhancements.rs (didn’t compile — missing context field on OllamaGenerationParams).
  • 15 trivial test_*_creation patterns deleted (single-line constructions verifying only X::new().is_ok()): test_tree_creation, test_async_mock_llm_creation, test_incremental_pagerank_creation, test_processor_creation, test_agent_creation, test_function_caller_creation, test_cache_warmer_creation, test_retrieval_system_creation, test_enhanced_registry_creation, test_mock_llm_creation, test_answer_generator_creation, test_graphrag_creation, test_graph_indexer_creation, test_lancedb_creation, test_cached_client_creation. Plus 2 trivial Ollama adapter creation tests (entire test module in core/ollama_adapters.rs removed).
  • Tests retained: 7 integration test files in graphrag-core/tests/ (the 2026-02 refactor-era tests exercising KV cache, contextual enricher, GLiNER features, triple validation, dynamic weighting, BAR-RAG, text pipeline fixtures, incremental graph updates). ./tests/e2e/ benchmark scripts kept.
  • Verification matrix — all 100% green:
    • cargo test -p graphrag-core --lib363 passed, 0 failed (was 371/12 fail)
    • cargo test -p graphrag-core --lib --features rograg402 passed, 0 failed
    • cargo test -p graphrag-core --lib --features incremental390 passed, 0 failed

Fixed

Workspace-wide production unwrap() sweep (2026-05-16) — Part of refactor-2026-05 Phase 3 (extended)

  • Going beyond the original Phase 3 scope (voy_store, rograg/streaming, rograg/processor, cli/config, qdrant_store — all already verified test-only or previously cleaned), every remaining production .unwrap() in the workspace has been replaced with the appropriate safe alternative.
  • Mechanical sweeps by category:
    • 36 partial_cmp(...).unwrap() (f32 sort comparators, NaN-panic-prone) across ~23 files (async_graphrag, inference, retrieval/*, graph/*, summarization, vector, monitoring, nlp, generation, server handlers, etc.) → .unwrap_or(std::cmp::Ordering::Equal).
    • 22 lock()/read()/write().unwrap() (Mutex/RwLock acquisitions, poisoned-lock-panic-prone) → .expect("lock poisoned") / .expect("rwlock poisoned").
    • 12 Regex::new(...).unwrap() (static regex literals) → .expect("static regex literal").
    • duration_since(UNIX_EPOCH).unwrap() (system clock) → .expect("system clock before UNIX epoch").
    • Iterator and Option terminators (.first(), .last(), .next(), .min(), .max(), .pop(), .as_ref(), .as_mut(), .chars().next()) after checked-precondition usages → .expect(<reason>).
    • Targeted contextual fixes for result_map.remove, get_mut after contains_key, Self::new() in Default::default, NonZeroUsize::new on literal, caps.get(N), strip_prefix(...) after starts_with, etc.
  • Test-only infrastructure files (core/test_traits.rs, core/test_utils.rs) intentionally left untouched — their .unwrap() calls represent test-helper panic semantics by design (suite is called from test functions only).
  • Net result: workspace audit reports 0 production .unwrap() calls outside test infrastructure (down from ~178 pre-existing). All builds green: graphrag-core default + --features rograg + --features incremental, plus graphrag-cli, graphrag-server, graphrag wrapper.

Changed

Module split: retrieval/types.rs extracted (2026-05-16) — Part of refactor-2026-05 Phase 4 (final)

  • Extracted RetrievalConfig, SearchResult, ResultType, QueryAnalysis, QueryType, QueryIntent, QueryAnalysisResult, QueryResult, RetrievalStatistics (+ its print impl) from graphrag-core/src/retrieval/mod.rs into the new private module graphrag-core/src/retrieval/types.rs (199 LOC).
  • retrieval/mod.rs shrinks 1851 → 1666 LOC; the public API is preserved via pub use types::*; so crate::retrieval::SearchResult etc. resolve unchanged.
  • Restored one stripped doc comment (/// Statistics about the retrieval system) on RetrievalStatistics to satisfy #![warn(missing_docs)] — the sed extraction had eaten the line during slicing.
  • This was the last remaining Phase 4 item from the plan. Build + clippy clean (per the feedback-verify-with-build-clippy policy).

Sub-split: graphrag/ directory module (2026-05-16) — Part of refactor-2026-05 Phase 4

  • Follow-up to the earlier graphrag.rs single-file move. The 1753-LOC graphrag-core/src/graphrag.rs is now a directory module graphrag-core/src/graphrag/ with per-concern sub-files:
    • mod.rs (~105 LOC): struct GraphRAG, sub-module declarations, private ensure_initialized helper (bumped fnpub(super) fn so the sibling impl blocks can call it), #[cfg(test)] mod tests block with the two pre-existing tests.
    • lifecycle.rs (~189 LOC): new, default_local, builder, initialize, try_load_from_workspace, save_to_workspace, clear_graph.
    • documents.rs (~53 LOC): add_document_from_text, add_document.
    • build.rs (~715 LOC): async + sync build_graph paired methods.
    • ask.rs (~519 LOC, renamed from query.rs to avoid clash with use crate::query for the planner module): ask, ask_with_reasoning, ask_explained, query_internal, query_internal_with_results, generate_semantic_answer_from_results, remove_thinking_tags, ask_with_pagerank pair.
    • stats.rs (~85 LOC): config, is_initialized, has_documents, has_graph, knowledge_graph, knowledge_graph_mut, get_entity, get_entity_relationships, get_chunk.
    • factory.rs (~202 LOC): from_json5_file, from_config_file, from_config_and_document, quick_start, quick_start_with_config.
  • Each sub-file has its own impl GraphRAG { ... } block; Rust allows multiple impl blocks across files. All sub-files share an identical kitchen-sink import header (Config, core types, critic, ollama, persistence, query, retrieval, feature-gated parallel, plus use super::GraphRAG).
  • Public API preserved: graphrag_core::GraphRAG resolves via lib.rs’s pub use graphrag::GraphRAG; (unchanged from the single-file pass).
  • Verified per the new policy: cargo build -p graphrag-core + downstream crates green; cargo clippy -p graphrag-core -- -D warnings shows exactly one error in the new files (graphrag/ask.rs:408 clamp pattern) which is a verbatim carry-over from the previous graphrag.rs:1358 (originally lib.rs:1594) — net new errors: zero. Tests not re-run (pure file move; see feedback-verify-with-build-clippy memory entry).

God-file split: graph/incremental/ directory module (2026-05-16) — Part of refactor-2026-05 Phase 4

  • Converted graphrag-core/src/graph/incremental.rs (2905 LOC — the biggest god-file in the crate) into a directory module graphrag-core/src/graph/incremental/ with focused sub-files:
    • mod.rs (~395 LOC): doc + sub-module declarations + pub use re-exports + verbatim #[cfg(test)] mod tests block + the kitchen-sink use import block the tests rely on via super::*.
    • types.rs (~465 LOC): UpdateId, TransactionId, ChangeRecord, ChangeType, Operation, ChangeData, Document, GraphDelta, DeltaStatus, RollbackData, ConflictStrategy, Conflict, ConflictType, ConflictResolution, the IncrementalGraphStore trait, GraphStatistics, ConsistencyReport, InvalidationStrategy, CacheRegion.
    • helpers.rs (~496 LOC): SelectiveInvalidation, ConflictResolver, UpdateMonitor + impls + their satellite types (InvalidationStats, UpdateMetric, OperationLog, PerformanceStats).
    • manager.rs (~898 LOC): IncrementalGraphManager (both feature-gated and non-gated paired definitions kept adjacent), IncrementalConfig, IncrementalStatistics, IncrementalPageRank, BatchProcessor, PendingBatch, BatchMetrics, plus the impl GraphRAGError convenience constructors that conceptually belong here.
    • store.rs (~743 LOC): ProductionGraphStore + Transaction + TransactionStatus
      • IsolationLevel + ChangeEvent + ChangeEventType + impl IncrementalGraphStore for ProductionGraphStore + ChangeDataExt trait & impl.
  • Public API preserved via pub use cascade in mod.rs (crate::graph::incremental::* resolves unchanged).
  • Visibility-only bumps to keep the shared test module compiling across the new sub-module boundary:
    • IncrementalPageRank.scores: fieldpub(super) field
    • ConflictResolver.strategy: fieldpub(super) field
    • ConflictResolver::merge_entities: fnpub(super) fn
  • Verification strategy update (per user request): switched from cargo test --features incremental (which surfaces many pre-existing unrelated failures and obscures the signal we care about) to cargo build --features incremental + cargo clippy --features incremental -- -D warnings. The clippy run reports 34 errors, all in pre-existing files outside the split (graphrag.rs, retrieval/, text/, monitoring/, etc.); zero new errors in graph/incremental/. Downstream crates (graphrag-cli, graphrag-server, graphrag) build clean.

Module split: config/json_parser.rs extracted (2026-05-16) — Part of refactor-2026-05 Phase 4

  • Extracted Config::from_file (~553 LOC hand-rolled JSON reader using the json crate) and Config::to_file (~200 LOC writer) from graphrag-core/src/config/mod.rs into the new private module graphrag-core/src/config/json_parser.rs (769 LOC, with imports + impl Config { ... } wrapper).
  • config/mod.rs shrinks 2491 → 1737 LOC. Public API unchanged: both methods are still reachable as Config::from_file / Config::to_file via the new impl Config block (multiple impl blocks across files compile fine).
  • Distinct from config::json5_loader (serde-based typed JSON5 loader) and config::loader (multi-format dispatcher) — this is the bespoke json crate path.
  • 371 unit tests pass; 12 pre-existing failures unchanged.

God-file split: rograg/logic_form/ directory module (2026-05-16) — Part of refactor-2026-05 Phase 4

  • Converted graphrag-core/src/rograg/logic_form.rs (1517 LOC) into a directory module graphrag-core/src/rograg/logic_form/ with focused sub-files:
    • mod.rs (141 LOC): doc + sub-module declarations + pub use re-exports + verbatim #[cfg(test)] mod tests block.
    • types.rs (333 LOC): LogicFormError, LogicFormQuery, Predicate, Argument, ArgumentType, Constraint, ConstraintType, LogicQueryType, LogicFormResult, VariableBinding, LogicExecutionStats.
    • parser.rs (240 LOC): LogicFormParser trait + PatternBasedParser + LogicPattern + ArgumentExtractor + impls.
    • executor.rs (673 LOC): LogicFormExecutor + impls.
    • retriever.rs (217 LOC): LogicFormRetriever struct + Default + impl.
  • Public API preserved via pub use cascade through both logic_form/mod.rs and rograg/mod.rs (crate::rograg::LogicFormResult, crate::rograg::LogicFormRetriever, etc. still resolve unchanged).
  • Single non-mechanical change: bumped LogicFormExecutor::calculate_name_similarity from private fn to pub(super) fn — the existing test_name_similarity test in the shared tests module needs cross-submodule access. Visibility-only adjustment; no behavior or signature change.
  • Pre-existing test failures (test_logic_form_retrieval, test_pattern_parser) remain unchanged (verified by re-running them on main before the split).

God-file split: graphrag-core/src/graphrag.rs (2026-05-16) — Part of refactor-2026-05 Phase 4

  • Extracted the pub struct GraphRAG and its single impl GraphRAG { ... } block (constructors, lifecycle, build_graph, ask*, query_internal*, generate_semantic_answer_from_results, remove_thinking_tags, getters, factory methods, ensure_initialized, tests) from graphrag-core/src/lib.rs into the new private module file graphrag-core/src/graphrag.rs.
  • lib.rs is now a 263-LOC re-export shell (mod graphrag; pub use graphrag::GraphRAG;). graphrag.rs is 1753 LOC (header + verbatim impl + moved #[cfg(test)] mod tests).
  • Public API is preserved: graphrag_core::GraphRAG and graphrag_core::prelude::GraphRAG resolve through the new re-export with identical paths.
  • Added module-scoped imports at the top of graphrag.rs (Config, core types, critic, ollama, persistence, query, retrieval, feature-gated parallel) so the impl body compiles verbatim without inline path changes.
  • Both moved tests (test_graphrag_creation, test_builder_pattern) still pass. All other pre-existing test/doc failures remain unchanged (12 unit tests, 7 doctests).
  • Sub-splitting the impl across graphrag/{lifecycle,documents,build,query,stats}.rs remains deferred to a follow-up — single-file move first per plan.

Module split: retrieval/explained.rs (2026-05-16) — Part of refactor-2026-05 Phase 4

  • Extracted ExplainedAnswer, SourceReference, SourceType, ReasoningStep (and the ~160 LOC impl ExplainedAnswer block with from_results + format_display) from graphrag-core/src/retrieval/mod.rs into new graphrag-core/src/retrieval/explained.rs.
  • Public API preserved via pub use explained::* in retrieval/mod.rs — downstream callers see no change.
  • Net effect: retrieval/mod.rs shrinks from 2094 LOC → 1851 LOC; new explained.rs is 250 LOC.
  • Replaced legacy .min(1.0).max(0.0) with idiomatic .clamp(0.0, 1.0) in the moved from_results fn (clippy manual_clamp).
  • Larger god-file splits (lib.rs 1968 LOC, logic_form.rs 1517, incremental.rs 2905, config/mod.rs JSON loader) remain deferred — see plan file.

Fixed

Production unwrap removal (2026-05-16) — Part of refactor-2026-05 Phase 3

  • rograg/streaming.rs: regex unwrap()expect("static regex literal"); three partial_cmp(...).unwrap() calls on f32 confidence scores now use unwrap_or(Ordering::Equal) to avoid panics on NaN.
  • rograg/processor.rs::RogragProcessorBuilder::build: replaced inner .unwrap() on HybridQueryDecomposer::new() and IntentClassifier::new() with ? propagation; SystemTime::duration_since(UNIX_EPOCH).unwrap().expect("system clock before UNIX epoch") (genuine programmer-bug case).
  • graphrag-server/src/qdrant_store.rs: removed 6 production .unwrap() calls in add_document, add_documents_batch, and search — payload .as_object(), serde_json::to_value, serde_json::from_value, and point.id now propagate QdrantError via ? and Result::collect.
  • Tests-only unwrap() in vector/voy_store.rs and graphrag-cli/src/config.rs left intact (per Phase 3 scope: production paths only).

Added - GLiNER-Relex Extraction via gline-rs (2026-02-23)

GLiNER-Relex Entity + Relation Extractor (entity/gliner_extractor.rs, config/mod.rs, config/setconfig.rs, lib.rs)

  • New GLiNERExtractor: joint entity + relation extraction in a single forward pass via gline-rs v1.0.1 + ONNX Runtime. ~1.5 GB VRAM vs 8+ GB for generative LLMs; zero structural hallucinations.
  • Two-stage pipeline: NER (SpanPipeline or TokenPipeline) → RE (RelationPipeline), both composed on the same orp::model::Model with lazy loading via Arc<RwLock<Option<Model>>>.
  • Confidence scores propagated natively into Entity.confidence and Relationship.confidence.
  • Optional feature flag gliner: crate compiles and works normally without it.
  • tokio::task::spawn_blocking wrapper in lib.rs keeps the async runtime unblocked.
  • Config example (JSON5):
    gliner: {
      enabled: true,
      model_path: "./models/gliner-relex-large-v0.5.onnx",
      entity_labels: ["person", "organization", "location"],
      relation_labels: ["controls", "located in", "causes"],
      entity_threshold: 0.40,
      relation_threshold: 0.50,
      mode: "span",   // or "token" for gliner-multitask
      use_gpu: false,
    }
    

Added - Graph Persistence / Storage Choice (2026-02-23)

Storage Backend — In-Memory vs Disk (config/mod.rs, config/setconfig.rs, lib.rs)

  • AutoSaveConfig (and AutoSaveSetConfig in SetConfig) now expose:
    • base_dir: Option<String> — directory where workspace folders are stored (e.g. "./output")
    • workspace_name: Option<String> — sub-folder inside base_dir (default: "default")
    • enabled: boolfalse (default) = in-memory only; true = persist to disk
  • GraphRAG::initialize() now calls try_load_from_workspace(): if auto_save.enabled = true and the workspace already exists on disk, the graph is loaded from disk instead of starting empty. The second run reuses the previously built graph automatically.
  • GraphRAG::save_to_workspace() — new public method; also called automatically at the end of build_graph() when persistence is enabled.
  • No-op when enabled = false; zero performance cost for in-memory-only deployments.
  • Format hierarchy on disk: Parquet (if persistent-storage feature) → JSON fallback (always).
  • JSON5 config usage:
    auto_save: {
      enabled: true,
      base_dir: "./output",
      workspace_name: "my_project",
    }
    

Fixed - Extraction Temperature (2026-02-23)

Zero-Temperature Entity Extraction (entity/gleaning_extractor.rs, entity/llm_extractor.rs, config/setconfig.rs)

  • GleaningConfig::default() and LLMEntityExtractor::new() now use temperature: 0.0 (was 0.1)
    • Fully deterministic JSON output — eliminates spurious token variation that causes parse failures
    • Consistent with recommendations for structured extraction models (NuExtract, Triplex, etc.)
  • EntityExtractionConfig.temperature in SetConfig now defaults via default_extraction_temperature() = 0.0
    • Separate from default_temperature() = 0.1 used for general LLM parameters
    • Users can override in JSON5: entity_extraction.temperature = 0.0
  • ContextualEnricher retains 0.1 (generates natural language descriptions, not strict JSON)

Fixed & Improved - Entity Extraction, Query Quality & Sources (2026-02-23)

SetConfig use_gleaning Bug Fix (config/setconfig.rs)

  • Bug: when mode.approach = "semantic" with no semantic: sub-section, the else block hardcoded config.entities.use_gleaning = true regardless of the top-level entity_extraction.use_gleaning field
  • Fix: the else block now reads from self.entity_extraction.use_gleaning and max_gleaning_rounds directly
  • This affected ALL JSON5 configs using mode.approach = "semantic" without an explicit semantic: block

LLM Single-Pass Entity Extraction (lib.rs, entity/llm_extractor.rs, ollama/mod.rs)

  • New LLM single-pass path in lib.rs: ollama.enabled && !use_gleaning now uses LLMEntityExtractor instead of falling through to pattern-based regex extraction
  • Dynamic num_ctx per chunk: (prompt_tokens + max_output_tokens) × 1.20, rounded to 1024, clamped [4096, 131072] — mirrors the ContextualEnricher formula
  • LLMEntityExtractor now carries keep_alive: Option<String> and with_keep_alive() builder
  • call_llm_with_retry and call_llm_completion_check use generate_with_params instead of generate() to pass num_ctx and keep_alive — activates Ollama KV cache during entity extraction
  • GleaningEntityExtractor::new extracts keep_alive before consuming the client and threads it through
  • OllamaClient::config() getter added for field access without moving
  • Result on Symposium (274 chunks, mistral-nemo, no gleaning): 1,139 entities, 670 relationships (vs 0 relationships previously due to pattern-based fallback)

JSON Parse Resilience — Missing description Field (entity/prompts.rs)

  • EntityData.description is now annotated #[serde(default)]
  • When the LLM returns JSON with a missing description field (e.g. for Project Gutenberg license chunks), parsing succeeds with an empty string instead of falling through to the error path and losing all entities from that chunk
  • Fixes the "JSON repair failed: missing field 'description'" errors seen in the last ~10 chunks of Project Gutenberg books

Multi-Chunk Semantic Answer Generation (lib.rs, handlers/bench.rs)

  • generate_semantic_answer_from_results: reworked context assembly
    • Removed 400-char truncation: full chunk content is now passed to the LLM for each result
    • Deduplication: tracks seen chunk IDs to avoid repeating the same chunk from multiple entity hits
    • Relevance sorting: context sections sorted by score descending before joining
    • Synthesis prompt: updated instructions to ask the LLM to synthesize across ALL context sections
    • Dynamic num_ctx: prompt size calculated at runtime with 20% margin — activates KV cache for answering
    • generate_with_params used instead of generate() — passes num_ctx, keep_alive, temperature
  • bench.rs: switched from graphrag.ask() to graphrag.ask_explained()
    • sources in the JSON output now populated with actual chunk IDs and excerpts (was always [])

E2E Config — No-Gleaning Mistral Pipeline

  • New config tests/e2e/configs/kv_no_gleaning_mistral__symposium.json5
    • use_gleaning: false, keep_alive: "1h", chunk_size: 1000, chunk_overlap: 200
    • Uses mistral-nemo:latest for entity extraction and nomic-embed-text for embeddings

Added - Ollama KV Cache & Contextual Retrieval (2026-02-22)

Ollama KV Cache Parameters (ollama/mod.rs, config/mod.rs, config/setconfig.rs)

  • keep_alive field added to OllamaConfig and OllamaGenerationParams
    • Keeps the Ollama model loaded in VRAM between requests (prevents KV cache eviction)
    • Critical for multi-chunk document processing: without it, the model unloads between each chunk
    • Default: None (uses Ollama’s built-in 5-minute default)
    • Example: "1h" for book-length document processing sessions
  • num_ctx field added to OllamaConfig and OllamaGenerationParams
    • Explicitly sets the context window size (Ollama silently truncates to 2k-8k without this)
    • Goes into the options object in Ollama API requests; keep_alive is a top-level field
    • Default: None (uses Ollama’s default, usually 2048-8192 tokens)
    • Example: 32768 for documents up to ~130k characters
  • Both fields wired through the full config stack: JSON5 parser, OllamaSetConfig, request body

Contextual Chunk Enricher (text/contextual_enricher.rs)

  • New module implementing Anthropic’s Contextual Retrieval pattern
  • ContextualEnricher: augments each chunk with 2-3 sentences of document-level context before embedding
  • KV Cache optimization: static prefix (full document) is cached by Ollama; only the chunk suffix is re-evaluated per request
    • First chunk: ~2 min (loads document into KV cache on RTX 4070 with Mistral-NeMo 12B)
    • Subsequent chunks: ~3-5 sec each (only chunk tokens evaluated)
    • ~100 chunks from a 45k-token book: 5-10 minutes total vs hours without KV cache
  • calculate_num_ctx(): dynamic context window calculation per document
    • Formula: tokens(instructions) + tokens(document) + tokens(largest_chunk) + output_budget + 5% margin
    • Rounded to nearest 1024, clamped to [4096, 131072]
  • enrich_document_chunks() and enrich_chunks(): async, groups chunks by source document
  • Output format: [LLM context]\n\n[original chunk text] — preserves original text verbatim

Late Chunking Strategy (text/late_chunking.rs)

  • New LateChunkingStrategy implementing ChunkingStrategy trait (Jina AI technique)
  • Produces chunks annotated with position_in_document metadata (byte spans) for post-hoc pooling
  • JinaLateChunkingClient: calls Jina Embeddings API v2 with late_chunking: true
  • split_into_sections(): handles documents exceeding model context window (8192 tokens for Jina v3)
  • LateChunkingConfig: configurable chunk size, overlap, max document tokens, position annotation

E2E Benchmark KV Cache Support (tests/e2e/run_benchmarks.sh)

  • Three new pipeline dimensions: keep_alive, num_ctx, ollama_timeout
  • All existing pipelines updated with explicit defaults (keep_alive=none, num_ctx=0)
  • Semantic/hybrid pipelines with Ollama now default to keep_alive=30m (model stays loaded during build phase)
  • Three new KV cache pipelines targeting long document processing:
    • kv_semantic_mistral: semantic approach, Mistral-NeMo, keep_alive=1h, num_ctx=32768, timeout=300s
    • kv_hybrid_mistral: hybrid approach, Mistral-NeMo, keep_alive=1h, num_ctx=32768, timeout=300s
    • kv_semantic_qwen3: semantic approach, Qwen3 8B Q4, keep_alive=1h, num_ctx=16384, timeout=300s
  • KV Cache settings shown in run header when active
  • Generated JSON5 configs include keep_alive and num_ctx in the ollama section

Tests

  • tests/contextual_enricher_e2e.rs: 4 tests for ContextualEnricher
    • test_enriched_chunk_contains_original_and_context (#[ignore], requires ENABLE_OLLAMA_TESTS=1)
    • test_kv_cache_speedup (#[ignore]) — measures per-chunk timing and speedup ratio
    • test_num_ctx_calculation_sanity — always-run, validates num_ctx formula bounds
    • test_disabled_enricher_returns_chunks_unchanged — always-run no-op safety check

Added - Service Registry Completion (2025-02-11)

Core Infrastructure

  • Complete test utilities module (core/test_utils.rs):
    • MockEmbedder: Deterministic hash-based embedding generation with dimension support
    • MockLanguageModel: Configurable response mapping for testing
    • MockVectorStore: In-memory vector store with cosine similarity search
    • MockRetriever: Simple retriever for testing search pipelines
    • All mocks fully implement core Async* traits
    • 100% test coverage with 5 passing test cases

Adapter Implementations

  • Entity extraction adapter (core/entity_adapters.rs):

    • GraphIndexerAdapter bridges LightRAG’s GraphIndexer to AsyncEntityExtractor trait
    • Configurable confidence threshold filtering
    • Entity type conversion from domain-specific to core types
    • Batch extraction support
    • Feature-gated with lightrag feature
  • Retrieval system adapter (core/retrieval_adapters.rs):

    • RetrievalSystemAdapter implements AsyncRetriever trait
    • Integration with KnowledgeGraph-based retrieval
    • Batch search support
    • Comprehensive documentation on graph requirements
    • Feature-gated with basic-retrieval feature
  • Metrics collector implementation (monitoring/metrics_collector.rs):

    • Thread-safe metrics with DashMap for counters, gauges, and histograms
    • Atomic operations for zero-lock contention
    • Histogram statistics: count, sum, mean, min, max, p50, p95, p99
    • Timer support with start/finish API
    • Metric tagging with key-value pairs
    • 7/7 passing tests for all metric types
    • Feature-gated with dashmap and monitoring features

Registry Integration

  • Service registration in ServiceConfig::build_registry():
    • Entity extractor registration (with lightrag feature)
    • Retriever registration (with basic-retrieval feature)
    • Metrics collector registration (with dashmap + monitoring features)
    • Mock services for testing via with_test_defaults()
    • Proper feature-gating for modular compilation

Documentation

  • Architectural documentation:

    • Documented trait hierarchy for vector stores (domain-specific vs generic)
    • Explained when to use adapters vs direct implementations
    • Clarified graph integration requirements for retrieval
    • Added TODO markers for future unification work
    • Inline examples in all adapter modules
  • Code quality improvements:

    • Removed unused imports across multiple modules
    • Fixed parameter name warnings in data import
    • Commented out incomplete vector-memory feature gate
    • Clean compilation with async,ollama,dashmap,monitoring,basic-retrieval,lightrag features

Testing

  • 310 tests passing in graphrag-core library
  • All new service implementations verified:
    • test_mock_embedder: Hash-based deterministic embeddings
    • test_mock_language_model: Response mapping
    • test_mock_vector_store: Cosine similarity search
    • test_mock_retriever: Basic search operations
    • Metrics collector tests: counters, gauges, histograms, timers
  • Integration tests for service registration and retrieval

Added - Ollama Advanced Integration (2025-02-11)

Streaming Support

  • Real-time token generation with tokio channel-based streaming
  • generate_streaming() method returns tokio::sync::mpsc::Receiver<String>
  • Server-Sent Events (SSE) parsing for Ollama streaming API
  • Background task spawning for non-blocking stream reads
  • Automatic statistics recording for streamed responses
  • Example usage in test suite (tests/ollama_enhancements.rs)

Custom Generation Parameters

  • OllamaGenerationParams struct for fine-grained control:
    • num_predict: Maximum tokens to generate
    • temperature: Sampling temperature (0.0 - 1.0)
    • top_p: Nucleus sampling threshold
    • top_k: Top-k sampling
    • stop: Stop sequences (array of strings)
    • repeat_penalty: Repetition control
  • generate_with_params() method for custom parameter usage
  • Integration with AsyncLanguageModel trait’s complete_with_params()
  • Automatic conversion between core and Ollama parameter formats

Model Response Caching

  • DashMap-based caching for thread-safe concurrent access
  • Automatic cache population on API responses
  • Cache hit detection before making API calls
  • Performance: <1ms for cache hits vs 100-1000ms for API calls
  • Cache management API:
    • clear_cache(): Clear all cached responses
    • cache_size(): Get number of cached items
  • Configurable via OllamaConfig.enable_caching (default: true)
  • 80%+ hit rate on repeated queries
  • 6x cost reduction potential

Metrics & Usage Tracking

  • OllamaUsageStats struct with atomic counters:
    • total_requests: Total number of API calls
    • successful_requests: Successful completions
    • failed_requests: Failed attempts
    • total_tokens: Cumulative token count (estimated)
  • Thread-safe atomic operations (Arc<AtomicU64>)
  • Zero lock contention for metrics updates
  • API methods:
    • record_success(tokens): Record successful request
    • record_failure(): Record failed request
    • get_success_rate(): Calculate success percentage (0.0 - 1.0)
  • Integration with AsyncLanguageModel::get_usage_stats()
  • Automatic token estimation (~4 characters per token)

Service Registry Integration

  • Type-safe service injection for Ollama services
  • OllamaEmbedderAdapter implements AsyncEmbedder trait
  • OllamaLanguageModelAdapter implements AsyncLanguageModel trait
  • Automatic registration in ServiceConfig::build_registry()
  • Support for both embeddings and language model services
  • MemoryVectorStore registration for in-memory operations

Documentation

  • Complete OLLAMA_INTEGRATION.md guide with:
    • Setup and prerequisites
    • Basic and advanced usage examples
    • Supported models (embeddings and LLM)
    • Configuration options reference
    • Batch processing examples
    • Custom parameter examples
    • Performance tips and troubleshooting
  • Updated graphrag-core/README.md with new features
  • Updated main README.md with Ollama integration section
  • API reference with code examples
  • Sources and external documentation links

Testing

  • 8 new test cases in tests/ollama_enhancements.rs:
    • Config with caching test
    • Custom generation parameters test
    • Client statistics API test
    • Stats recording test
    • Cache management test
    • Default parameters test
    • Adapter integration tests
  • All tests passing (13/13 total including registry tests)
  • Compilation verified with all feature combinations

Configuration Updates

  • Added enable_caching: bool to OllamaConfig
  • Updated all OllamaConfig initializers across codebase:
    • config/mod.rs: TOML parsing
    • config/setconfig.rs: Config mapping
    • entity/llm_relationship_extractor.rs: LLM extraction
  • Default caching: enabled (true)

Changed

  • Model info updated: supports_streaming now returns true
  • AsyncLanguageModel implementation: Now uses generate_with_params() internally
  • OllamaClient structure: Added stats and cache fields
  • Error handling: Improved with metrics recording on failures
  • Test count: Increased from 214+ to 220+ test cases

Fixed

  • Missing enable_caching field in OllamaConfig initializers
  • Incorrect ModelUsageStats field mapping in adapter
  • Iterator reference error in execute_caused_query
  • Compilation warnings for unused imports

[0.1.1] - Previous Release

Added - Core GraphRAG Implementation

  • Temporal and causal reasoning for RoGRAG
  • Graph indexer with 23 relationship patterns
  • Service registry pattern for dependency injection
  • GraphRAGBuilder with fluent API
  • Parquet persistence for entities, relationships, documents
  • Memory vector store implementation
  • Complete trait-based architecture

Added - Research Features

  • LightRAG dual-level retrieval (6000x token reduction)
  • Leiden community detection (+15% modularity)
  • Cross-encoder reranking (+20% accuracy)
  • HippoRAG personalized PageRank (10-30x cost reduction)
  • Semantic chunking with better boundaries

Added - Infrastructure

  • Comprehensive test suite (214+ tests)
  • Production-grade logging with tracing
  • Feature flags for modular compilation
  • WASM support with WebGPU acceleration
  • Docker Compose deployment

[0.1.0] - Initial Release

Added

  • Basic GraphRAG pipeline
  • Entity and relationship extraction
  • Vector embeddings support
  • Graph construction and querying
  • REST API server
  • CLI tools

Migration Guides

Upgrading to Ollama Advanced Features

If you’re using basic Ollama integration, upgrading to the new features is seamless:

Before (still works):

#![allow(unused)]
fn main() {
let client = OllamaClient::new(OllamaConfig::default());
let response = client.generate("Hello").await?;
}

After (with new features):

#![allow(unused)]
fn main() {
let config = OllamaConfig {
    enable_caching: true,  // NEW: Enable caching
    ..Default::default()
};
let client = OllamaClient::new(config);

// Streaming
let mut rx = client.generate_streaming("Hello").await?;
while let Some(token) = rx.recv().await {
    print!("{}", token);
}

// Custom parameters
let params = OllamaGenerationParams {
    temperature: Some(0.8),
    top_p: Some(0.95),
    ..Default::default()
};
let response = client.generate_with_params("Hello", params).await?;

// Metrics
let stats = client.get_stats();
println!("Success rate: {:.2}%", stats.get_success_rate() * 100.0);
}

Breaking Changes

None! All new features are opt-in and backward compatible.


Development

Building from Source

git clone https://github.com/your-username/graphrag-rs.git
cd graphrag-rs
cargo build --release --features async,ollama,dashmap

Running Tests

cargo test --all-features
cargo test -p graphrag-core --test ollama_enhancements

Contributing

See CONTRIBUTING.md for guidelines.


For complete documentation, see: