A local-first knowledge compiler for AI agents, built on the LLM Wiki pattern. It compiles raw materials into persistent Markdown wikis, knowledge graphs, and hybrid search indexes.
Core Positioning#
SwarmVault addresses the lack of persistent, traceable knowledge bases for AI agents in long-cycle, multi-source research tasks. It is inspired by Andrej Karpathy's LLM Wiki gist and Vannevar Bush's Memex (1945).
Three-Layer Architecture (Karpathy Pattern)#
- raw/ — Immutable raw materials, supporting 30+ formats (code, PDF, audio/video, SRT, URLs, datasets, email exports, calendars, etc.)
- wiki/ — LLM-generated + human-authored persistent Markdown Wiki
- swarmvault.schema.md — Co-evolvable vault structure and domain conventions
Knowledge Graph#
Typed nodes (sources / concepts / entities / code modules) with provenance-tagged edges (extracted / inferred / ambiguous). Built-in interactive graph viewer (auto-switches to overview mode for large graphs). Supports Neo4j export. Graph stored locally as JSON in state/graph.json.
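To make the typed-node, provenance-tagged-edge model concrete, here is a minimal TypeScript sketch. The field names (id, kind, label, from, to, provenance) and the sample data are assumptions for illustration only; the actual schema of state/graph.json is not documented here and may differ.

```typescript
// Hypothetical shapes; the real state/graph.json schema may differ.
type NodeKind = "source" | "concept" | "entity" | "code_module";
type Provenance = "extracted" | "inferred" | "ambiguous";

interface GraphNode {
  id: string;
  kind: NodeKind;
  label: string;
}

interface GraphEdge {
  from: string; // id of the origin node
  to: string; // id of the target node
  provenance: Provenance;
}

interface Graph {
  nodes: GraphNode[];
  edges: GraphEdge[];
}

// Return the edges that carry one of the given provenance tags.
function edgesByProvenance(graph: Graph, tags: Provenance[]): GraphEdge[] {
  const wanted = new Set<Provenance>(tags);
  return graph.edges.filter((edge) => wanted.has(edge.provenance));
}

// Return edges whose endpoints are missing from the node list.
function danglingEdges(graph: Graph): GraphEdge[] {
  const ids = new Set(graph.nodes.map((node) => node.id));
  return graph.edges.filter((edge) => !ids.has(edge.from) || !ids.has(edge.to));
}

// Illustrative data only.
const sampleGraph: Graph = {
  nodes: [
    { id: "src:paper", kind: "source", label: "paper.pdf" },
    { id: "concept:memex", kind: "concept", label: "Memex" },
  ],
  edges: [
    { from: "src:paper", to: "concept:memex", provenance: "extracted" },
    { from: "concept:memex", to: "entity:bush", provenance: "inferred" },
  ],
};
```

Keeping provenance on the edge rather than on the node lets a reviewer filter out inferred or ambiguous links without discarding the nodes they connect.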
Hybrid Search#
SQLite full-text search (FTS) + semantic vector embeddings + optional rerank. Gracefully degrades to SQLite FTS + heuristic when no embedding provider is available.
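The source does not say how the full-text and vector results are combined. The sketch below uses reciprocal rank fusion (RRF) as one common choice, and falls back to the full-text ranking alone when no vector ranking exists. The function names are hypothetical, and the heuristic step and optional rerank mentioned above are omitted.

```typescript
// Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank).
// Assumption: this is NOT confirmed to be the fusion method SwarmVault uses.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Sort by descending score; break ties by id so the order is deterministic.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([docId]) => docId);
}

// Hybrid search with graceful degradation: a null vector ranking stands in
// for "no embedding provider available", in which case only FTS is used.
function hybridSearch(ftsRanking: string[], vectorRanking: string[] | null): string[] {
  if (vectorRanking === null) return [...ftsRanking];
  return reciprocalRankFusion([ftsRanking, vectorRanking]);
}
```

RRF only needs rank positions, so it avoids having to normalize FTS scores and cosine similarities onto a shared scale.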
Approval & Conflict Detection#
- Cross-source conflicts auto-tagged; lint --conflicts for on-demand auditing
- compile --approve stages changes as reviewable approval bundles; new concepts enter wiki/candidates/ first
Agent Integration#
- MCP Server: Compatible with Claude Code, Codex, OpenCode, OpenClaw
- Context Packs: Cited handoff documents within token budgets (context build)
- Task Ledger: task start/update/finish/resume for persistent local task history
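A context pack that stays within a token budget can be sketched as a greedy selection over scored, cited snippets. Everything here is an assumption for illustration: the Snippet shape, the four-characters-per-token estimate, and the greedy strategy are not taken from SwarmVault's implementation.

```typescript
// Hypothetical snippet shape; not SwarmVault's actual context-pack format.
interface Snippet {
  citation: string; // where the text came from, e.g. a wiki page path
  text: string;
  score: number; // relevance; higher is better
}

// Crude token estimate: roughly 4 characters per token (an assumption).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily take the highest-scoring snippets that still fit the budget.
function buildContextPack(snippets: Snippet[], tokenBudget: number): Snippet[] {
  const pack: Snippet[] = [];
  let used = 0;
  const ranked = [...snippets].sort((a, b) => b.score - a.score);
  for (const snippet of ranked) {
    const cost = estimateTokens(snippet.text);
    if (used + cost > tokenBudget) continue; // skip snippets that do not fit
    pack.push(snippet);
    used += cost;
  }
  return pack;
}
```

Because each selected snippet keeps its citation, the receiving agent can trace every statement in the handoff back to a source.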
Code Awareness#
tree-sitter AST parsing; scan ./your-repo scans a repository in one step and generates a knowledge graph and a searchable wiki from it.
Collaboration & Sharing#
- Git-backed workflows (Watch mode + Lefthook Git Hooks)
- Share Kit: Publishable share-card.md, share-card.svg, self-contained HTML previews, and share-kit/ package
- Optional Obsidian integration (--obsidian init; graph exportable to Obsidian vault)
Runtime Modes#
- Fully offline: Built-in heuristic provider, zero API keys
- Local LLM acceleration: Ollama + Gemma 4 + nomic-embed-text recommended
- Cloud APIs: OpenAI, Anthropic, Gemini, OpenRouter, Groq, Together, xAI, Cerebras, and any OpenAI-compatible endpoint
- Desktop app: macOS / Windows / Linux with built-in runtime, no separate Node.js installation needed
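One way to picture the choice between these modes is a small resolver that falls back to the built-in heuristic provider when nothing else is configured. The settings fields and the preference order below are assumptions; SwarmVault's real configuration keys and selection logic are not described in this document.

```typescript
// Hypothetical settings; the real configuration keys are not shown in the source.
interface RuntimeSettings {
  cloudApiKey?: string; // key for a cloud or OpenAI-compatible endpoint
  localEndpoint?: string; // URL of a local LLM server such as Ollama
}

type RuntimeMode = "cloud" | "local" | "offline-heuristic";

// Assumed preference order: cloud API, then local LLM, then the heuristic provider.
function resolveRuntimeMode(settings: RuntimeSettings): RuntimeMode {
  if (settings.cloudApiKey !== undefined && settings.cloudApiKey.trim() !== "") {
    return "cloud";
  }
  if (settings.localEndpoint !== undefined && settings.localEndpoint.trim() !== "") {
    return "local";
  }
  return "offline-heuristic"; // needs no API key and no network
}
```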
Operations Tools#
Vault Doctor provides graph health checks, retrieval status, review queues, migration, managed sources, and task status, with priority sorting and auto-repair (doctor --repair).
Architecture Overview#
pnpm monorepo with @swarmvaultai/cli (CLI entry), @swarmvaultai/engine (core compilation engine), @swarmvaultai/viewer (graph visualization frontend). Build toolchain: pnpm + TypeScript + Biome. Testing: Node.js built-in test runner + Playwright (E2E). Data storage: local filesystem (Markdown + JSON) + SQLite.
Compilation Pipeline#
raw/ raw materials → LLM/heuristic extraction → wiki/candidates/ pending review → approval queue → wiki/ persistent outputs + state/graph.json graph update + SQLite index update
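The staged flow above can be modeled as a small in-memory state transition: extraction output starts as a candidate, review marks it approved or rejected, and only approved pages are promoted into wiki/. The type and function names are hypothetical, and the graph and SQLite index updates are left out.

```typescript
// Hypothetical in-memory model of the staged flow; names are illustrative.
type Stage = "candidate" | "approved" | "rejected";

interface PageChange {
  name: string; // page file name, e.g. "memex.md"
  stage: Stage;
}

// Extraction output always enters the candidate stage first.
function stageCandidates(names: string[]): PageChange[] {
  return names.map((name): PageChange => ({ name, stage: "candidate" }));
}

// Review step: candidates named in the approved set are approved, the rest
// are rejected; changes that were already reviewed are left untouched.
function review(changes: PageChange[], approved: Set<string>): PageChange[] {
  return changes.map((change): PageChange =>
    change.stage === "candidate"
      ? { name: change.name, stage: approved.has(change.name) ? "approved" : "rejected" }
      : change,
  );
}

// Path of a page while it is pending review.
function candidatePath(change: PageChange): string {
  return `wiki/candidates/${change.name}`;
}

// Paths that would be written into wiki/ after approval.
function publishedPaths(changes: PageChange[]): string[] {
  return changes
    .filter((change) => change.stage === "approved")
    .map((change) => `wiki/${change.name}`);
}
```

Modeling review as a pure function over staged changes keeps the raw/ inputs immutable and makes each approval bundle easy to inspect before anything is written.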
Quick Start#
npm install -g @swarmvaultai/cli # Requires Node >= 24
swarmvault demo # Built-in demo vault, no API key needed
swarmvault scan ./your-repo # Scan a codebase
Unconfirmed Items#
- Official website https://www.swarmvault.ai referenced as authorUrl in manifest.json but not verified
- Obsidian plugin marketplace release status unconfirmed (manifest.json exists, minAppVersion 1.5.0)
- NPM package @swarmvaultai/cli publish details not verified
- Desktop app specific download URLs not verified
- MCP Server exposed tools/resources not detailed