Production-Ready RAG with Structure-Aware Reasoning, supporting pixel-precise citations and knowledge graph multi-hop reasoning.
ForgeRAG is an end-to-end RAG system open-sourced by the deeplethe organization, focused on structure-aware reasoning and production-grade usability. The backend is built on Python (FastAPI) with a Vue 3 frontend, released under the MIT license.
## Core Architecture
The system employs a dual-reasoning retrieval mechanism: BM25 and vector retrieval serve as pre-filtering, followed by LLM tree navigation and knowledge graph deep reasoning, with final ranking via RRF (Reciprocal Rank Fusion). The knowledge graph supports three levels of retrieval: local neighborhood traversal, global cross-lingual keyword matching, and relational semantic search. It achieved a 55.48% overall win rate against LightRAG on the UltraDomain benchmark, covering agriculture, CS, law, and mixed domains.
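The RRF step is a standard, well-documented algorithm; a minimal sketch of Reciprocal Rank Fusion (a generic implementation, not ForgeRAG's actual code; `k=60` is the commonly used default constant):

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse multiple ranked lists with Reciprocal Rank Fusion.

    rankings: list of ranked doc-id lists (best first).
    Each doc scores sum(1 / (k + rank)) over the lists that contain it,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: BM25 and vector retrieval disagree; RRF rewards consensus.
bm25 = ["d1", "d2", "d3"]
vec = ["d2", "d3", "d1"]
fused = rrf_fuse([bm25, vec])  # → ["d2", "d1", "d3"]
```

Because RRF only uses ranks, it fuses BM25 scores and cosine similarities without any score normalization, which is why it is a common choice for hybrid retrieval.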
## Data Flow
- Ingestion: Document upload → PDF parsing (pymupdf/minerU/minerU-vlm) → chunking + LLM tree structure construction → entity/relation extraction → vectorization → distributed persistence across relational DB, vector store, graph store, and blob storage.
- Retrieval: User query → BM25 + vector pre-filtering → LLM tree navigation + KG dual-level retrieval → RRF fusion ranking.
- Answering: Fused context + KG-synthesized entity/relation descriptions → LLM generates answers with precise `[c_N]` citation markers.
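Markers of the `[c_N]` form can be pulled out of a generated answer with a short regex; a sketch based only on the `[c_N]` notation above, not on ForgeRAG's actual citation parser:

```python
import re

# Matches markers like [c_1], [c_12]; pattern assumed from the [c_N] notation.
CITATION = re.compile(r"\[c_(\d+)\]")

def extract_citations(answer: str) -> list[int]:
    """Return citation indices in order of first appearance, deduplicated."""
    seen = []
    for m in CITATION.finditer(answer):
        n = int(m.group(1))
        if n not in seen:
            seen.append(n)
    return seen

answer = "Revenue grew 12% [c_1], driven by APAC [c_2]; margins held [c_1]."
extract_citations(answer)  # → [1, 2]
```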
## Key Features
- Pixel-precise citations: Each claim carries a citation marker; clicking navigates to the exact source document page with highlighted bounding boxes, suitable for compliance/legal scenarios.
- Full retrieval tracing: End-to-end inspection of path scores, expansion decisions, and merge logic.
- Multi-format document ingestion: Native support for PDF, DOCX, PPTX, XLSX, HTML, Markdown, TXT. PDF parsing supports fast mode (pymupdf), layout-aware mode (mineru), and VLM mode (mineru-vlm for scanned documents).
- Multi-turn conversation: Continuous follow-up questions based on conversational context.
- YAML-first configuration: A single `forgerag.yaml` controls all behavior with no hidden runtime state.
- Per-request overrides: Dynamic switching of retrieval paths, top-k, and reranking strategies via `QueryOverrides`, suitable for SDK integration and A/B testing.
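A per-request override might look like the payload below. Only the `/api/v1/query` endpoint and the existence of `QueryOverrides` come from this README; the field names inside `overrides` (`top_k`, `paths`, `rerank`) are illustrative assumptions, not ForgeRAG's documented schema:

```python
import json

# Hypothetical request body for POST /api/v1/query.
# Field names inside "overrides" are assumptions for illustration only.
request_body = {
    "query": "Which Apple suppliers also supply Samsung?",
    "overrides": {
        "top_k": 20,              # widen the candidate pool for this request
        "paths": ["kg", "tree"],  # restrict to KG + tree-navigation retrieval
        "rerank": False,          # disable reranking for an A/B control arm
    },
}
payload = json.dumps(request_body)
```

Per-request overrides like this let an A/B test vary retrieval behavior without touching the server-side YAML configuration.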
## Use Cases
- Enterprise-grade multi-format knowledge base QA
- Compliance/legal scenarios requiring page-level bounding-box traceability
- Cross-document supply-chain multi-hop correlation analysis (e.g., "Which Apple suppliers also supply Samsung?")
- RAG architecture baseline evaluation tool (built-in benchmark module)
## Deployment & Engineering
Supports local development and Docker deployment, each with an interactive configuration wizard. Multi-worker horizontal scaling is supported. Prerequisites: Python 3.10+, Node.js 18+ (frontend build only), an LLM API key (LiteLLM-compatible), and a recommended 4+ CPU cores with 8GB+ RAM (16GB+ for large documents with KG extraction).
## Backend Compatibility
- Relational databases: SQLite (default), PostgreSQL, MySQL
- Vector stores: ChromaDB (default), pgvector, Qdrant, Milvus, Weaviate
- Graph stores: NetworkX in-memory (default), Neo4j
- Blob storage: Local filesystem (default), Amazon S3, Alibaba Cloud OSS
- LLM / Embeddings: Full compatibility with all LiteLLM-supported providers (OpenAI, Azure, Anthropic, Ollama, DeepSeek, Cohere, etc.)
## Key API Endpoints
| Endpoint | Description |
|---|---|
| `POST /api/v1/query` | Query (SSE streaming or sync); supports `path_filter` + overrides |
| `POST /api/v1/documents/upload-and-ingest` | Upload and ingest documents |
| `GET /api/v1/documents` | List documents |
| `GET /api/v1/documents/{id}/tree` | Get document hierarchy |
| `GET /api/v1/graph` | Knowledge graph visualization |
| `GET /api/v1/settings` | Read-only configuration snapshot |
Interactive docs: Swagger UI at `/docs`, ReDoc at `/redoc`.
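When streaming is enabled, `POST /api/v1/query` returns Server-Sent Events. A minimal parser for the standard SSE wire format (the framing below follows the SSE spec; the JSON payload shape shown in the example is an assumption, not ForgeRAG's documented event schema):

```python
def parse_sse(raw: str) -> list[str]:
    """Split a raw SSE stream into the data payload of each event.

    Per the SSE spec, events are separated by a blank line, and an
    event may carry multiple `data:` lines, which are joined with
    newlines.
    """
    events = []
    for block in raw.split("\n\n"):
        data_lines = [line[5:].lstrip() for line in block.split("\n")
                      if line.startswith("data:")]
        if data_lines:
            events.append("\n".join(data_lines))
    return events

# Hypothetical two-token answer stream; payload keys are illustrative.
raw = 'data: {"token": "Hel"}\n\ndata: {"token": "lo"}\n\n'
parse_sse(raw)  # → ['{"token": "Hel"}', '{"token": "lo"}']
```

In production you would read the response incrementally (e.g. with an HTTP client's streaming iterator) rather than buffering the whole stream, but the framing logic is the same.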
## Unconfirmed Information
- Background of the `deeplethe` organization and DeepLethe Team is not detailed in the repository
- No independent academic paper link provided
- Roadmap has no specific timelines
- No public production use cases or customer references provided
Released under the MIT license, Copyright (c) 2025 DeepLethe Team and contributors.