Thoth

Added May 4, 2026
Agent & Tooling
Open Source
Desktop Apps · LangGraph · Model Context Protocol · Multimodal · AI Agents · Agent Framework · Agent & Tooling · Model & Inference Framework · Knowledge Management, Retrieval & RAG · Security & Privacy

A local-first personal AI assistant built around a LangGraph ReAct agent and a personal knowledge graph, with all inference running locally via Ollama by default for full data sovereignty and privacy.

Core Positioning#

Thoth is a desktop AI assistant for personal use built on a purely local architecture: no accounts, no servers, no telemetry. All LLM inference runs locally via Ollama by default (including a CPU-only mode), addressing the privacy and data sovereignty concerns raised by cloud AI services.

Agent Core#

  • LangGraph-based ReAct agent with autonomous tool calling and streaming responses
  • Intelligent context management: auto-summarization + hard trimming + dynamic tool budgets
  • Destructive operations require user confirmation (interrupt mechanism); cancellation is checked between nodes for graceful stops
  • 30 core tool modules + plugins + external MCP tools + auto-generated channel tools
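The context-management bullet above combines two mechanisms: older turns are summarized once the history grows past a budget, and anything still over the limit is hard-trimmed. A minimal sketch of that combination (function names, the 4-chars-per-token estimate, and the budgets are assumptions for illustration, not Thoth's internals):

```python
# Hypothetical sketch of summarize-then-trim context management.
def manage_context(messages, summarize, max_tokens=4000, keep_recent=6):
    """messages: list of (role, text); summarize: callable(text) -> str."""
    def cost(msgs):
        # Crude token estimate: ~4 characters per token.
        return sum(len(text) for _, text in msgs) // 4

    if cost(messages) <= max_tokens:
        return messages

    # Auto-summarization: collapse everything but the recent turns.
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    if old:
        summary = summarize("\n".join(text for _, text in old))
        messages = [("system", f"Summary of earlier conversation: {summary}")] + recent

    # Hard trimming: drop oldest remaining turns until under budget,
    # keeping the system summary in place if present.
    while cost(messages) > max_tokens and len(messages) > 1:
        messages.pop(1 if messages[0][0] == "system" else 0)
    return messages
```

A dynamic tool budget would slot in the same way: subtract the serialized tool schemas from `max_tokens` before trimming the conversation.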

Personal Knowledge Graph & Memory#

  • 10 entity types (person, preference, fact, event, place, project, organisation, concept, skill, media)
  • 67 valid relationship types + 60+ alias mappings
  • FAISS semantic retrieval (Qwen3-Embedding-0.6B) + 1-hop graph expansion auto-recall
  • Interactive knowledge graph visualization (full graph / ego-graph toggle)
  • Obsidian-compatible Wiki Vault export (.md + YAML frontmatter + [[wiki-links]])
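The auto-recall step above (semantic hit, then 1-hop graph expansion) can be sketched in a few lines. This is an illustrative model, not Thoth's actual code: entities returned by FAISS search are the seeds, and every edge touching a seed pulls in its neighbour plus the relationship as a context fact:

```python
# Illustrative 1-hop graph expansion over (source, relation, target) triples.
def one_hop_expand(seed_entities, edges):
    """seed_entities: set of names from semantic search; edges: triples."""
    recalled = set(seed_entities)
    facts = []
    for src, rel, dst in edges:
        if src in seed_entities or dst in seed_entities:
            recalled.update((src, dst))                 # pull in the neighbour
            facts.append(f"{src} -[{rel}]-> {dst}")     # fact for the agent prompt
    return recalled, facts

edges = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "located_in", "Berlin"),   # 2 hops from Alice: not recalled
    ("Alice", "likes", "jazz"),
]
entities, facts = one_hop_expand({"Alice"}, edges)
```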

Dream Cycle (Nighttime Knowledge Graph Refinement)#

5-stage background daemon: duplicate merging (≥0.93 similarity) → description enrichment → confidence decay → relationship inference → insight generation. Includes 3-layer anti-contamination mechanism, dream journal, Ollama busy detection, configurable silent window (default 1-5 AM).
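Two of the concrete parameters above, the configurable silent window (default 1-5 AM) and the ≥0.93 duplicate-merge threshold, are easy to sketch. Names and structure here are assumptions for illustration; note the window check must handle ranges that wrap past midnight:

```python
# Minimal sketch of Dream Cycle gating logic (illustrative, not Thoth's code).
from datetime import datetime

def in_silent_window(now: datetime, start_hour: int = 1, end_hour: int = 5) -> bool:
    """True if `now` falls inside the nightly window; supports wrap-around
    windows such as 23 -> 5."""
    h = now.hour
    if start_hour <= end_hour:
        return start_hour <= h < end_hour
    return h >= start_hour or h < end_hour

def should_merge(similarity: float, threshold: float = 0.93) -> bool:
    """Stage 1 (duplicate merging) fires only at very high similarity."""
    return similarity >= threshold
```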

Document Knowledge Extraction#

Three-stage map-reduce LLM pipeline: Map (~6K char window summarization) → Reduce (300-600 word synthesis) → Extract (structured entity + relationship extraction). Supports PDF, DOCX, TXT, Markdown, HTML, EPUB with source provenance tracking, cross-window deduplication, and cross-source merge protection.
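The Map stage depends on splitting each document into roughly 6K-character windows. A hedged sketch of that windowing, breaking on paragraph boundaries so no summarization call receives a truncated paragraph (the function name and exact splitting rule are assumptions):

```python
# Illustrative Map-stage windowing: pack paragraphs into ~6K-char windows.
def map_windows(text: str, window_chars: int = 6000):
    windows, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > window_chars:
            windows.append(current)     # window full: flush and start fresh
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        windows.append(current)
    return windows
```

Each window would then be summarized independently (Map), the summaries merged into a 300-600 word synthesis (Reduce), and entities/relationships extracted from the result (Extract), with window indices retained for provenance.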

Designer Studio#

5 design modes: deck / document / landing / app_mockup / storyboard. Sandboxed interactive runtime, critique-fix loop, editable export (PDF / HTML / PNG / PPTX), and publishing as shareable interactive links.

Multimodal Capabilities#

  • Voice: Local faster-whisper STT + Kokoro TTS (10 voices), audio never leaves the device
  • Vision: Camera capture, screen capture, workspace image analysis
  • Image/Video Generation: Connected to OpenAI, Google (Imagen 4 / Veo), xAI (Grok Imagine / Grok Imagine Video)

Automation & Connectivity#

  • Tools: Search (Tavily/DuckDuckGo/Wikipedia/Arxiv/YouTube/URL Reader), Gmail/Calendar, sandboxed filesystem, Shell (3-level security), Chromium browser automation, X (Twitter) OAuth 2.0 PKCE, health tracking (medication/symptoms/exercise/mood/sleep + Plotly charts)
  • Workflow Engine: 7 schedule types (daily/weekly/weekdays/weekends/interval/cron/delay), Webhook triggers, conditional branches, approval steps, subtasks, concurrency groups, per-workflow model/tool overrides
  • Message Channels: 5 built-in channels (Telegram, WhatsApp, Discord, Slack, SMS/Twilio) with streaming, emoji reactions, approval routing, Tunnel (ngrok) management
  • MCP Client: Supports stdio / Streamable HTTP / SSE transport protocols for connecting external tool servers
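A scheduler supporting the workflow engine's seven schedule types reduces to a next-run computation per type. A minimal sketch covering three of them (daily, interval, weekdays); the real engine also handles weekly, weekends, cron, and delay, and all names here are illustrative:

```python
# Illustrative next-run computation for a subset of schedule types.
from datetime import datetime, timedelta

def next_run(kind: str, now: datetime, *, at=(9, 0), every_minutes=30) -> datetime:
    if kind == "interval":
        return now + timedelta(minutes=every_minutes)
    # daily / weekdays: next occurrence of the configured time of day
    candidate = now.replace(hour=at[0], minute=at[1], second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    if kind == "weekdays":
        while candidate.weekday() >= 5:   # skip Saturday (5) and Sunday (6)
            candidate += timedelta(days=1)
    return candidate
```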

Privacy & Security#

  • No accounts, no servers, no telemetry
  • API keys stored in OS credential storage (Windows Credential Manager / macOS Keychain / Linux Secret Service)
  • Filesystem sandbox (workspace folder only)
  • 5-layer prompt injection defense: detection of instruction overrides, role impersonation, data leakage, encoding evasion, and social engineering
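The filesystem sandbox above is the standard path-containment pattern: resolve every requested path and refuse anything that escapes the workspace root. A minimal sketch under that assumption (Thoth's actual implementation may differ):

```python
# Illustrative workspace sandbox: reject paths that resolve outside the root.
import tempfile
from pathlib import Path

def resolve_in_sandbox(workspace: Path, user_path: str) -> Path:
    root = workspace.resolve()
    target = (root / user_path).resolve()
    # Path.is_relative_to (Python 3.9+) rejects ../ escapes and absolute jumps.
    if not target.is_relative_to(root):
        raise PermissionError(f"path escapes workspace: {user_path}")
    return target

def is_allowed(workspace: Path, user_path: str) -> bool:
    try:
        resolve_in_sandbox(workspace, user_path)
        return True
    except PermissionError:
        return False

ws = Path(tempfile.mkdtemp())
safe = resolve_in_sandbox(ws, "notes/todo.md")
```

Resolving before checking is the important detail: a naive string-prefix test would pass `../../etc/passwd` under a crafted workspace name.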

Architecture Overview#

Frontend: NiceGUI (Web UI) + pywebview (desktop native window + system tray). Core agent: LangGraph ReAct Agent. Data layer: SQLite (memory/conversation threads) + FAISS (semantic vectors) + NetworkX (graph structure). All persistent data stored in ~/.thoth/.
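The SQLite side of that data layer can be sketched with a plausible minimal schema for conversation threads, including the per-thread model override mentioned under Model Support. The table layout here is an assumption; the actual schema under ~/.thoth/ is not documented in this listing:

```python
# Hypothetical minimal schema for the SQLite conversation store.
import sqlite3

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS threads (
            id INTEGER PRIMARY KEY,
            title TEXT,
            model_override TEXT          -- NULL means use the default brain model
        );
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY,
            thread_id INTEGER REFERENCES threads(id),
            role TEXT,
            content TEXT
        );
    """)
    return db

db = open_store()
db.execute("INSERT INTO threads (title, model_override) VALUES (?, ?)",
           ("trip planning", "qwen3:14b"))
db.execute("INSERT INTO messages (thread_id, role, content) VALUES (1, 'user', 'hi')")
```

FAISS vectors and the NetworkX graph would live in sibling files under the same ~/.thoth/ directory, keyed back to these row IDs.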

Model Support#

  • 39 pre-filtered Ollama tool-calling models, default brain model qwen3:14b (~9 GB)
  • Optional providers: OpenAI, Anthropic, Google AI, xAI, OpenRouter, ChatGPT/Codex
  • Per-thread, per-workflow dynamic model overrides

Installation#

  • Windows: Download the .exe installer; it auto-installs Python, Ollama, and all dependencies
  • macOS: Download DMG, drag to Applications, first run auto-configures Homebrew/Python/Ollama
  • Source: git clone, pip install -r requirements.txt, start Ollama, then python launcher.py

System Requirements#

  • Python 3.11+
  • Minimum 8 GB RAM (for 8B models); 16-32 GB recommended for 14B-30B models
  • GPU optional (CPU supported), recommended NVIDIA 8+ GB VRAM or Apple Silicon

Known Limitations#

  • Image/video generation still requires external APIs, not fully local
  • No standalone website or web deployment; positioned as a single-machine desktop application
  • Linux desktop support not explicitly verified
  • ChatGPT/Codex integration noted as potentially unstable with upstream changes
