A local-first personal AI assistant achieving full data sovereignty and privacy protection through a LangGraph ReAct agent and personal knowledge graph, with all inference running locally via Ollama by default.
## Core Positioning
Thoth is a desktop AI assistant for individual users, built on a purely local architecture: no accounts, no servers, no telemetry. All LLM inference runs locally via Ollama by default (including a CPU-only mode), addressing the privacy and data-sovereignty concerns of cloud AI services.
## Agent Core
- LangGraph-based ReAct agent with autonomous tool calling and streaming responses
- Intelligent context management: auto-summarization + hard trimming + dynamic tool budgets
- Destructive operations require user confirmation (interrupt mechanism); cancellation is checked between graph nodes so runs stop gracefully
- 30 core tool modules + plugins + external MCP tools + auto-generated channel tools
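The agent loop behind these bullets can be sketched as a minimal ReAct-style cycle. This is a pure-Python sketch, not Thoth's actual code: the real implementation uses LangGraph, and the `run_model`/`tools` names here are illustrative.

```python
from typing import Callable

def react_loop(run_model: Callable[[list], dict],
               tools: dict[str, Callable[[str], str]],
               question: str, max_steps: int = 5) -> str:
    """Minimal ReAct cycle: on each step the model either calls a tool
    (its result is appended as an observation) or returns a final answer."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        step = run_model(messages)  # {"tool": ..., "input": ...} or {"answer": ...}
        if "answer" in step:
            return step["answer"]
        observation = tools[step["tool"]](step["input"])
        messages.append({"role": "tool", "name": step["tool"], "content": observation})
    return "Step budget exhausted."
```

The `max_steps` cap plays the same role as the dynamic tool budgets mentioned above: it bounds how many tool calls a single turn may consume.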
## Personal Knowledge Graph & Memory
- 10 entity types (person, preference, fact, event, place, project, organisation, concept, skill, media)
- 67 valid relationship types + 60+ alias mappings
- FAISS semantic retrieval (Qwen3-Embedding-0.6B) + 1-hop graph expansion auto-recall
- Interactive knowledge graph visualization (full graph / ego-graph toggle)
- Obsidian-compatible Wiki Vault export (.md + YAML frontmatter + [[wiki-links]])
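The auto-recall step combines vector similarity with graph structure: the top semantic hits are expanded by their 1-hop neighbours. A minimal sketch, using plain-Python cosine similarity and an adjacency dict in place of FAISS and NetworkX:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recall(query_vec, entity_vecs, edges, top_k=2):
    """Return top-k semantic hits expanded by their 1-hop graph neighbours."""
    ranked = sorted(entity_vecs,
                    key=lambda e: cosine(query_vec, entity_vecs[e]),
                    reverse=True)
    hits = ranked[:top_k]
    expanded = set(hits)
    for h in hits:
        expanded.update(edges.get(h, ()))
    return expanded
```

The 1-hop expansion is what lets a query that matches "alice" also surface related entities like her preferences, even when those entities score poorly on raw similarity.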
## Dream Cycle (Nighttime Knowledge Graph Refinement)
5-stage background daemon: duplicate merging (≥0.93 similarity) → description enrichment → confidence decay → relationship inference → insight generation. Includes 3-layer anti-contamination mechanism, dream journal, Ollama busy detection, configurable silent window (default 1-5 AM).
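The first stage, duplicate merging, can be sketched as a greedy single pass: each entity is folded into an earlier survivor when their embedding similarity meets the documented 0.93 cutoff. Assuming unit-normalized embeddings (so cosine similarity reduces to a dot product); the merge logic is a sketch, not Thoth's exact algorithm:

```python
SIM_THRESHOLD = 0.93  # the documented merge cutoff

def merge_duplicates(entities):
    """entities: name -> unit-normalized embedding.
    Fold each entity into the first earlier survivor whose cosine similarity
    (a plain dot product for unit vectors) reaches the threshold."""
    canonical = []    # (name, vector) survivors
    merged_into = {}  # duplicate name -> canonical name
    for name, vec in entities.items():
        for cname, cvec in canonical:
            if sum(x * y for x, y in zip(vec, cvec)) >= SIM_THRESHOLD:
                merged_into[name] = cname
                break
        else:
            canonical.append((name, vec))
    return merged_into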
## Document Knowledge Extraction
Three-stage map-reduce LLM pipeline: Map (~6K char window summarization) → Reduce (300-600 word synthesis) → Extract (structured entity + relationship extraction). Supports PDF, DOCX, TXT, Markdown, HTML, EPUB with source provenance tracking, cross-window deduplication, and cross-source merge protection.
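The windowing and pipeline shape can be sketched as follows. The ~6K character window matches the description above; the overlap size and the `summarize`/`synthesize`/`extract` callables are illustrative stand-ins for the LLM calls:

```python
def windows(text, size=6000, overlap=200):
    """Yield overlapping character windows for the Map stage."""
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        yield text[start:start + size]

def map_reduce_extract(text, summarize, synthesize, extract):
    """Map each window to a summary, Reduce summaries to one synthesis,
    then Extract structured entities/relationships from the synthesis."""
    summaries = [summarize(w) for w in windows(text)]
    synthesis = synthesize(summaries)
    return extract(synthesis)
```

The overlap between adjacent windows is why cross-window deduplication matters: an entity straddling a window boundary will be seen twice and must be merged downstream.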
## Designer Studio
5 design modes: deck / document / landing / app_mockup / storyboard. Sandbox interactive runtime, critique-fix loop, editable export (PDF / HTML / PNG / PPTX), publishable interactive link sharing.
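The critique-fix loop reduces to a simple refinement cycle: generate a draft, have a critic list problems, apply fixes, and repeat until the critic passes or a round budget runs out. All three callables here are hypothetical stand-ins for the underlying LLM steps:

```python
def critique_fix(generate, critique, fix, max_rounds=3):
    """Iteratively refine a design artifact until the critic finds no issues
    or the round budget is exhausted."""
    draft = generate()
    for _ in range(max_rounds):
        issues = critique(draft)  # list of problems; empty means done
        if not issues:
            break
        draft = fix(draft, issues)
    return draft
```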
## Multimodal Capabilities
- Voice: Local faster-whisper STT + Kokoro TTS (10 voices), audio never leaves the device
- Vision: Camera capture, screen capture, workspace image analysis
- Image/Video Generation: Connected to OpenAI, Google (Imagen 4 / Veo), xAI (Grok Imagine / Grok Imagine Video)
## Automation & Connectivity
- Tools: Search (Tavily/DuckDuckGo/Wikipedia/Arxiv/YouTube/URL Reader), Gmail/Calendar, sandboxed filesystem, Shell (3-level security), Chromium browser automation, X (Twitter) OAuth 2.0 PKCE, health tracking (medication/symptoms/exercise/mood/sleep + Plotly charts)
- Workflow Engine: 7 schedule types (daily/weekly/weekdays/weekends/interval/cron/delay), Webhook triggers, conditional branches, approval steps, subtasks, concurrency groups, per-workflow model/tool overrides
- Message Channels: 5 built-in channels (Telegram, WhatsApp, Discord, Slack, SMS/Twilio) with streaming, emoji reactions, approval routing, Tunnel (ngrok) management
- MCP Client: Supports stdio / Streamable HTTP / SSE transport protocols for connecting external tool servers
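A scheduler over the listed schedule types ultimately reduces to "compute the next fire time". A sketch for three of the seven types (field names illustrative; cron/weekly/weekdays/weekends omitted for brevity):

```python
from datetime import datetime, timedelta

def next_run(schedule: dict, now: datetime) -> datetime:
    """Next fire time for interval, delay, and daily schedules."""
    kind = schedule["type"]
    if kind == "interval":
        return now + timedelta(seconds=schedule["seconds"])
    if kind == "delay":
        return now + timedelta(seconds=schedule["delay_seconds"])
    if kind == "daily":
        hour, minute = map(int, schedule["at"].split(":"))
        candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        # If today's slot already passed, fire tomorrow.
        return candidate if candidate > now else candidate + timedelta(days=1)
    raise ValueError(f"unsupported schedule type: {kind}")
```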
## Privacy & Security
- No accounts, no servers, no telemetry
- API keys stored in OS credential storage (Windows Credential Manager / macOS Keychain / Linux Secret Service)
- Filesystem sandbox (workspace folder only)
- 5-layer prompt injection defense: instruction override detection, role impersonation, data leakage, encoding evasion, social engineering
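One way to picture the layered defense is a set of pattern layers, one per category, each flagging suspicious inbound text. The patterns below are illustrative examples, not Thoth's actual rules (which would need far more than regexes):

```python
import re

# One pattern layer per defense category; every regex is a toy example.
LAYERS = {
    "instruction_override": [r"ignore (all )?previous instructions"],
    "role_impersonation": [r"\byou are now\b", r"pretend to be"],
    "data_leakage": [r"reveal your system prompt"],
    "encoding_evasion": [r"base64 decode", r"rot13"],
    "social_engineering": [r"this is your developer"],
}

def flag_injection(text: str) -> list[str]:
    """Return the defense layers whose patterns match the text."""
    lowered = text.lower()
    return [layer for layer, patterns in LAYERS.items()
            if any(re.search(p, lowered) for p in patterns)]
```

Returning the matched layers (rather than a bare boolean) lets the caller log which defense fired and tune layers independently.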
## Architecture Overview
Frontend: NiceGUI (web UI) + pywebview (native desktop window + system tray). Core agent: LangGraph ReAct agent. Data layer: SQLite (memory and conversation threads) + FAISS (semantic vectors) + NetworkX (graph structure). All persistent data is stored under ~/.thoth/.
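The data layer can be pictured as a small on-disk layout under one base directory. The specific file names below (`memory.db`, `faiss.index`, `graph.json`) are assumptions for illustration; only the SQLite/FAISS/NetworkX split and the single `~/.thoth/` root come from the description above:

```python
import sqlite3
from pathlib import Path

def open_data_layer(base: Path):
    """Open the stores under one base dir: SQLite for threads/memory,
    sibling files for the vector index and graph (names illustrative)."""
    base.mkdir(parents=True, exist_ok=True)
    db = sqlite3.connect(str(base / "memory.db"))
    db.execute("CREATE TABLE IF NOT EXISTS threads (id INTEGER PRIMARY KEY, title TEXT)")
    paths = {"vectors": base / "faiss.index", "graph": base / "graph.json"}
    return db, paths
```

Keeping everything under one directory is what makes the "all persistent data in ~/.thoth/" claim auditable: backing up or wiping the assistant's state is a single folder operation.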
## Model Support
- 39 pre-filtered Ollama tool-calling models, default brain model qwen3:14b (~9 GB)
- Optional providers: OpenAI, Anthropic, Google AI, xAI, OpenRouter, ChatGPT/Codex
- Per-thread, per-workflow dynamic model overrides
## Installation
- Windows: Download .exe installer—auto-installs Python, Ollama, and all dependencies
- macOS: Download DMG, drag to Applications, first run auto-configures Homebrew/Python/Ollama
- Source: `git clone`, `pip install -r requirements.txt`, start Ollama, then `python launcher.py`
## System Requirements
- Python 3.11+
- Minimum 8 GB RAM (8B model), recommended 16-32 GB (14B-30B models)
- GPU optional (CPU supported), recommended NVIDIA 8+ GB VRAM or Apple Silicon
## Known Limitations
- Image/video generation still requires external APIs, not fully local
- No standalone website or web deployment; positioned as a single-machine desktop application
- Linux desktop support is not explicitly verified
- ChatGPT/Codex integration is noted as potentially unstable when upstream APIs change