A local-first personal AI assistant achieving full data sovereignty and privacy protection through a LangGraph ReAct agent and personal knowledge graph, with all inference running locally via Ollama by default.
## Core Positioning
Thoth is a desktop AI assistant for individual users, built on a purely local architecture: no accounts, no servers, no telemetry. All LLM inference runs locally via Ollama by default (including a CPU-only mode), addressing the privacy and data-sovereignty concerns of cloud AI services.
## Agent Core
- LangGraph-based ReAct agent with autonomous tool calling and streaming responses
- Intelligent context management: auto-summarization + hard trimming + dynamic tool budgets
- Destructive operations require user confirmation (interrupt mechanism); cancellation is checked between graph nodes so runs stop gracefully
- 30 core tool modules + plugins + external MCP tools + auto-generated channel tools
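The agent loop behind these bullets can be sketched as a minimal ReAct-style cycle. This is a pure-Python sketch, not Thoth's actual code: the real implementation uses LangGraph, and the `run_model`/`tools` names here are illustrative.

```python
from typing import Callable

def react_loop(run_model: Callable[[list], dict],
               tools: dict[str, Callable[[str], str]],
               question: str, max_steps: int = 5) -> str:
    """Minimal ReAct cycle: on each step the model either calls a tool
    (its result is appended as an observation) or returns a final answer."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        step = run_model(messages)  # {"tool": ..., "input": ...} or {"answer": ...}
        if "answer" in step:
            return step["answer"]
        observation = tools[step["tool"]](step["input"])
        messages.append({"role": "tool", "name": step["tool"], "content": observation})
    return "Step budget exhausted."
```

The `max_steps` cap plays the same role as the dynamic tool budgets mentioned above: it bounds how many tool calls a single turn may consume.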
## Personal Knowledge Graph & Memory
- 10 entity types (person, preference, fact, event, place, project, organisation, concept, skill, media)
- 67 valid relationship types + 60+ alias mappings
- FAISS semantic retrieval (Qwen3-Embedding-0.6B) + 1-hop graph expansion auto-recall
- Interactive knowledge graph visualization (full graph / ego-graph toggle)
- Obsidian-compatible Wiki Vault export (.md + YAML frontmatter + [[wiki-links]])
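The auto-recall step combines vector similarity with graph structure: the top semantic hits are expanded by their 1-hop neighbours. A minimal sketch, using plain-Python cosine similarity and an adjacency dict in place of FAISS and NetworkX:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recall(query_vec, entity_vecs, edges, top_k=2):
    """Return top-k semantic hits expanded by their 1-hop graph neighbours."""
    ranked = sorted(entity_vecs,
                    key=lambda e: cosine(query_vec, entity_vecs[e]),
                    reverse=True)
    hits = ranked[:top_k]
    expanded = set(hits)
    for h in hits:
        expanded.update(edges.get(h, ()))
    return expanded
```

The 1-hop expansion is what lets a query that matches "alice" also surface related entities like her preferences, even when those entities score poorly on raw similarity.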
## Dream Cycle (Nighttime Knowledge Graph Refinement)
5-stage background daemon: duplicate merging (≥0.93 similarity) → description enrichment → confidence decay → relationship inference → insight generation. Includes 3-layer anti-contamination mechanism, dream journal, Ollama busy detection, configurable silent window (default 1-5 AM).
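The first stage, duplicate merging, can be sketched as a greedy single pass: each entity is folded into an earlier survivor when their embedding similarity meets the documented 0.93 cutoff. Assuming unit-normalized embeddings (so cosine similarity reduces to a dot product); the merge logic is a sketch, not Thoth's exact algorithm:

```python
SIM_THRESHOLD = 0.93  # the documented merge cutoff

def merge_duplicates(entities):
    """entities: name -> unit-normalized embedding.
    Fold each entity into the first earlier survivor whose cosine similarity
    (a plain dot product for unit vectors) reaches the threshold."""
    canonical = []    # (name, vector) survivors
    merged_into = {}  # duplicate name -> canonical name
    for name, vec in entities.items():
        for cname, cvec in canonical:
            if sum(x * y for x, y in zip(vec, cvec)) >= SIM_THRESHOLD:
                merged_into[name] = cname
                break
        else:
            canonical.append((name, vec))
    return merged_into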
## Document Knowledge Extraction
Three-stage map-reduce LLM pipeline: Map (~6K char window summarization) → Reduce (300-600 word synthesis) → Extract (structured entity + relationship extraction). Supports PDF, DOCX, TXT, Markdown, HTML, EPUB with source provenance tracking, cross-window deduplication, and cross-source merge protection.
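The windowing and pipeline shape can be sketched as follows. The ~6K character window matches the description above; the overlap size and the `summarize`/`synthesize`/`extract` callables are illustrative stand-ins for the LLM calls:

```python
def windows(text, size=6000, overlap=200):
    """Yield overlapping character windows for the Map stage."""
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        yield text[start:start + size]

def map_reduce_extract(text, summarize, synthesize, extract):
    """Map each window to a summary, Reduce summaries to one synthesis,
    then Extract structured entities/relationships from the synthesis."""
    summaries = [summarize(w) for w in windows(text)]
    synthesis = synthesize(summaries)
    return extract(synthesis)
```

The overlap between adjacent windows is why cross-window deduplication matters: an entity straddling a window boundary will be seen twice and must be merged downstream.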
## Designer Studio
5 design modes: deck / document / landing / app_mockup / storyboard. Sandbox interactive runtime, critique-fix loop, editable export (PDF / HTML / PNG / PPTX), publishable interactive link sharing.
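The critique-fix loop reduces to a simple refinement cycle: generate a draft, have a critic list problems, apply fixes, and repeat until the critic passes or a round budget runs out. All three callables here are hypothetical stand-ins for the underlying LLM steps:

```python
def critique_fix(generate, critique, fix, max_rounds=3):
    """Iteratively refine a design artifact until the critic finds no issues
    or the round budget is exhausted."""
    draft = generate()
    for _ in range(max_rounds):
        issues = critique(draft)  # list of problems; empty means done
        if not issues:
            break
        draft = fix(draft, issues)
    return draft
```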
## Multimodal Capabilities
- Voice: Local faster-whisper STT + Kokoro TTS (10 voices), audio never leaves the device
- Vision: Camera capture, screen capture, workspace image analysis
- Image/Video Generation: Connected to OpenAI, Google (Imagen 4 / Veo), xAI (Grok Imagine / Grok Imagine Video)
## Automation & Connectivity
- Tools: Search (Tavily/DuckDuckGo/Wikipedia/Arxiv/YouTube/URL Reader), Gmail/Calendar, sandboxed filesystem, Shell (3-level security), Chromium browser automation, X (Twitter) OAuth 2.0 PKCE, health tracking (medication/symptoms/exercise/mood/sleep + Plotly charts)
- Workflow Engine: 7 schedule types (daily/weekly/weekdays/weekends/interval/cron/delay), Webhook triggers, conditional branches, approval steps, subtasks, concurrency groups, per-workflow model/tool overrides
- Message Channels: 5 built-in channels (Telegram, WhatsApp, Discord, Slack, SMS/Twilio) with streaming, emoji reactions, approval routing, Tunnel (ngrok) management
- MCP Client: Supports stdio / Streamable HTTP / SSE transport protocols for connecting external tool servers
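A scheduler over the listed schedule types ultimately reduces to "compute the next fire time". A sketch for three of the seven types (field names illustrative; cron/weekly/weekdays/weekends omitted for brevity):

```python
from datetime import datetime, timedelta

def next_run(schedule: dict, now: datetime) -> datetime:
    """Next fire time for interval, delay, and daily schedules."""
    kind = schedule["type"]
    if kind == "interval":
        return now + timedelta(seconds=schedule["seconds"])
    if kind == "delay":
        return now + timedelta(seconds=schedule["delay_seconds"])
    if kind == "daily":
        hour, minute = map(int, schedule["at"].split(":"))
        candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        # If today's slot already passed, fire tomorrow.
        return candidate if candidate > now else candidate + timedelta(days=1)
    raise ValueError(f"unsupported schedule type: {kind}")
```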
## Privacy & Security
- No accounts, no servers, no telemetry
- API keys stored in OS credential storage (Windows Credential Manager / macOS Keychain / Linux Secret Service)
- Filesystem sandbox (workspace folder only)
- 5-layer prompt injection defense: instruction override detection, role impersonation, data leakage, encoding evasion, social engineering
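One way to picture the layered defense is a set of pattern layers, one per category, each flagging suspicious inbound text. The patterns below are illustrative examples, not Thoth's actual rules (which would need far more than regexes):

```python
import re

# One pattern layer per defense category; every regex is a toy example.
LAYERS = {
    "instruction_override": [r"ignore (all )?previous instructions"],
    "role_impersonation": [r"\byou are now\b", r"pretend to be"],
    "data_leakage": [r"reveal your system prompt"],
    "encoding_evasion": [r"base64 decode", r"rot13"],
    "social_engineering": [r"this is your developer"],
}

def flag_injection(text: str) -> list[str]:
    """Return the defense layers whose patterns match the text."""
    lowered = text.lower()
    return [layer for layer, patterns in LAYERS.items()
            if any(re.search(p, lowered) for p in patterns)]
```

Returning the matched layers (rather than a bare boolean) lets the caller log which defense fired and tune layers independently.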
## Architecture Overview
Frontend: NiceGUI (web UI) + pywebview (native desktop window + system tray). Core agent: LangGraph ReAct agent. Data layer: SQLite (memory and conversation threads) + FAISS (semantic vectors) + NetworkX (graph structure). All persistent data is stored under ~/.thoth/.
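The data layer can be pictured as a small on-disk layout under one base directory. The specific file names below (`memory.db`, `faiss.index`, `graph.json`) are assumptions for illustration; only the SQLite/FAISS/NetworkX split and the single `~/.thoth/` root come from the description above:

```python
import sqlite3
from pathlib import Path

def open_data_layer(base: Path):
    """Open the stores under one base dir: SQLite for threads/memory,
    sibling files for the vector index and graph (names illustrative)."""
    base.mkdir(parents=True, exist_ok=True)
    db = sqlite3.connect(str(base / "memory.db"))
    db.execute("CREATE TABLE IF NOT EXISTS threads (id INTEGER PRIMARY KEY, title TEXT)")
    paths = {"vectors": base / "faiss.index", "graph": base / "graph.json"}
    return db, paths
```

Keeping everything under one directory is what makes the "all persistent data in ~/.thoth/" claim auditable: backing up or wiping the assistant's state is a single folder operation.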
## Model Support
- 39 pre-filtered Ollama tool-calling models, default brain model qwen3:14b (~9 GB)
- Optional providers: OpenAI, Anthropic, Google AI, xAI, OpenRouter, ChatGPT/Codex
- Per-thread, per-workflow dynamic model overrides
## Installation
- Windows: Download .exe installer—auto-installs Python, Ollama, and all dependencies
- macOS: Download DMG, drag to Applications, first run auto-configures Homebrew/Python/Ollama
- Source: `git clone`, `pip install -r requirements.txt`, start Ollama, then `python launcher.py`
## System Requirements
- Python 3.11+
- Minimum 8 GB RAM (8B model), recommended 16-32 GB (14B-30B models)
- GPU optional (CPU supported), recommended NVIDIA 8+ GB VRAM or Apple Silicon
## Known Limitations
- Image/video generation still requires external APIs, not fully local
- No standalone website or web deployment; positioned as a single-machine desktop application
- Linux desktop support is not explicitly verified
- ChatGPT/Codex integration is noted as potentially unstable when upstream APIs change