Production-Ready RAG with Structure-Aware Reasoning, supporting pixel-precise citations and knowledge graph multi-hop reasoning.
ForgeRAG is an end-to-end RAG system open-sourced by the deeplethe organization, focused on structure-aware reasoning and production-grade usability. The backend is built on Python (FastAPI) with a Vue 3 frontend, released under the MIT license.
## Core Architecture
The system employs a dual-reasoning retrieval mechanism: BM25 and vector retrieval serve as pre-filtering, followed by LLM tree navigation and knowledge graph deep reasoning, with final ranking via RRF (Reciprocal Rank Fusion). The knowledge graph supports three levels of retrieval: local neighborhood traversal, global cross-lingual keyword matching, and relational semantic search. It achieved a 55.48% overall win rate against LightRAG on the UltraDomain benchmark, covering agriculture, CS, law, and mixed domains.
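The RRF step is a standard, well-documented algorithm; a minimal sketch of Reciprocal Rank Fusion (a generic implementation, not ForgeRAG's actual code; `k=60` is the commonly used default constant):

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse multiple ranked lists with Reciprocal Rank Fusion.

    rankings: list of ranked doc-id lists (best first).
    Each doc scores sum(1 / (k + rank)) over the lists that contain it,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: BM25 and vector retrieval disagree; RRF rewards consensus.
bm25 = ["d1", "d2", "d3"]
vec = ["d2", "d3", "d1"]
fused = rrf_fuse([bm25, vec])  # → ["d2", "d1", "d3"]
```

Because RRF only uses ranks, it fuses BM25 scores and cosine similarities without any score normalization, which is why it is a common choice for hybrid retrieval.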
## Data Flow
- Ingestion: Document upload → PDF parsing (pymupdf/minerU/minerU-vlm) → chunking + LLM tree structure construction → entity/relation extraction → vectorization → distributed persistence across relational DB, vector store, graph store, and blob storage.
- Retrieval: User query → BM25 + vector pre-filtering → LLM tree navigation + KG dual-level retrieval → RRF fusion ranking.
- Answering: Fused context + KG-synthesized entity/relation descriptions → LLM generates answers with precise `[c_N]` citation markers.
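Markers of the `[c_N]` form can be pulled out of a generated answer with a short regex; a sketch based only on the `[c_N]` notation above, not on ForgeRAG's actual citation parser:

```python
import re

# Matches markers like [c_1], [c_12]; pattern assumed from the [c_N] notation.
CITATION = re.compile(r"\[c_(\d+)\]")

def extract_citations(answer: str) -> list[int]:
    """Return citation indices in order of first appearance, deduplicated."""
    seen = []
    for m in CITATION.finditer(answer):
        n = int(m.group(1))
        if n not in seen:
            seen.append(n)
    return seen

answer = "Revenue grew 12% [c_1], driven by APAC [c_2]; margins held [c_1]."
extract_citations(answer)  # → [1, 2]
```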
## Key Features
- Pixel-precise citations: Each claim carries a citation marker; clicking navigates to the exact source document page with highlighted bounding boxes, suitable for compliance/legal scenarios.
- Full retrieval tracing: End-to-end inspection of path scores, expansion decisions, and merge logic.
- Multi-format document ingestion: Native support for PDF, DOCX, PPTX, XLSX, HTML, Markdown, TXT. PDF parsing supports fast mode (pymupdf), layout-aware mode (mineru), and VLM mode (mineru-vlm for scanned documents).
- Multi-turn conversation: Continuous follow-up questions based on conversational context.
- YAML-first configuration: A single `forgerag.yaml` controls all behavior with no hidden runtime state.
- Per-request overrides: Dynamic switching of retrieval paths, top-k, and reranking strategies via `QueryOverrides`, suitable for SDK integration and A/B testing.
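A per-request override might look like the payload below. Only the `/api/v1/query` endpoint and the existence of `QueryOverrides` come from this README; the field names inside `overrides` (`top_k`, `paths`, `rerank`) are illustrative assumptions, not ForgeRAG's documented schema:

```python
import json

# Hypothetical request body for POST /api/v1/query.
# Field names inside "overrides" are assumptions for illustration only.
request_body = {
    "query": "Which Apple suppliers also supply Samsung?",
    "overrides": {
        "top_k": 20,              # widen the candidate pool for this request
        "paths": ["kg", "tree"],  # restrict to KG + tree-navigation retrieval
        "rerank": False,          # disable reranking for an A/B control arm
    },
}
payload = json.dumps(request_body)
```

Per-request overrides like this let an A/B test vary retrieval behavior without touching the server-side YAML configuration.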
## Use Cases
- Enterprise-grade multi-format knowledge base QA
- Compliance/legal scenarios requiring page-level bounding-box traceability
- Cross-document supply-chain multi-hop correlation analysis (e.g., "Which Apple suppliers also supply Samsung?")
- RAG architecture baseline evaluation tool (built-in benchmark module)
## Deployment & Engineering
Supports local development and Docker deployment, each with an interactive configuration wizard. Multi-worker horizontal scaling is supported. Prerequisites: Python 3.10+, Node.js 18+ (frontend build only), an LLM API key (LiteLLM-compatible), and a recommended 4+ CPU cores with 8GB+ RAM (16GB+ for large documents with KG extraction).
## Backend Compatibility
- Relational databases: SQLite (default), PostgreSQL, MySQL
- Vector stores: ChromaDB (default), pgvector, Qdrant, Milvus, Weaviate
- Graph stores: NetworkX in-memory (default), Neo4j
- Blob storage: Local filesystem (default), Amazon S3, Alibaba Cloud OSS
- LLM / Embeddings: Full compatibility with all LiteLLM-supported providers (OpenAI, Azure, Anthropic, Ollama, DeepSeek, Cohere, etc.)
## Key API Endpoints
| Endpoint | Description |
|---|---|
| `POST /api/v1/query` | Query (SSE streaming or sync); supports `path_filter` + overrides |
| `POST /api/v1/documents/upload-and-ingest` | Upload and ingest documents |
| `GET /api/v1/documents` | List documents |
| `GET /api/v1/documents/{id}/tree` | Get document hierarchy |
| `GET /api/v1/graph` | Knowledge graph visualization |
| `GET /api/v1/settings` | Read-only configuration snapshot |
Interactive docs: Swagger UI at `/docs`, ReDoc at `/redoc`.
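When streaming is enabled, `POST /api/v1/query` returns Server-Sent Events. A minimal parser for the standard SSE wire format (the framing below follows the SSE spec; the JSON payload shape shown in the example is an assumption, not ForgeRAG's documented event schema):

```python
def parse_sse(raw: str) -> list[str]:
    """Split a raw SSE stream into the data payload of each event.

    Per the SSE spec, events are separated by a blank line, and an
    event may carry multiple `data:` lines, which are joined with
    newlines.
    """
    events = []
    for block in raw.split("\n\n"):
        data_lines = [line[5:].lstrip() for line in block.split("\n")
                      if line.startswith("data:")]
        if data_lines:
            events.append("\n".join(data_lines))
    return events

# Hypothetical two-token answer stream; payload keys are illustrative.
raw = 'data: {"token": "Hel"}\n\ndata: {"token": "lo"}\n\n'
parse_sse(raw)  # → ['{"token": "Hel"}', '{"token": "lo"}']
```

In production you would read the response incrementally (e.g. with an HTTP client's streaming iterator) rather than buffering the whole stream, but the framing logic is the same.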
## Unconfirmed Information
- Background of the `deeplethe` organization and DeepLethe Team is not detailed in the repository
- No independent academic paper link provided
- Roadmap has no specific timelines
- No public production use cases or customer references provided
Released under the MIT license, Copyright (c) 2025 DeepLethe Team and contributors.