An open-source, end-to-end platform for evaluating, observing, and improving LLM & AI Agent applications, unifying Tracing, Evals, Simulations, Guardrails, Gateway, and Prompt Optimization.
Future AGI is an open-source platform addressing the full lifecycle of LLM and AI Agent applications, consolidating six pillar capabilities that traditionally require multiple disjointed tools:
Simulate: Run thousands of multi-turn conversation simulations covering realistic personas, adversarial inputs, and edge cases, with support for both text and voice modalities (LiveKit, VAPI, Retell, Pipecat).
Evaluate: 50+ built-in metrics (groundedness, hallucination, tool call correctness, PII, tone, etc.) with three evaluation paradigms—LLM-as-judge, heuristic, and ML—accessible via a single evaluate() call.
Protect: 18 built-in scanners and 15 vendor adapters (Lakera, Presidio, Llama Guard, etc.), embeddable in the Gateway pipeline or usable as a standalone SDK.
Monitor: Native OpenTelemetry tracing with zero-config auto-instrumentation for 50+ frameworks (LangChain, LlamaIndex, CrewAI, DSPy, etc.), featuring span graphs, latency analysis, token cost tracking, and real-time dashboards.
Agent Command Center: An OpenAI-compatible Gateway connecting 100+ LLM providers with 15 routing strategies, semantic caching, virtual keys, and support for MCP and A2A protocols. Built in Go, achieving ~29k req/s with P99 ≤ 21ms.
Optimize: 6 prompt optimization algorithms (GEPA, PromptWizard, ProTeGi, Bayesian, Meta-Prompt, Random); production traces can be fed back directly as training data.
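Of the three evaluation paradigms listed above, the heuristic one is the simplest to sketch in plain Python. The function below is an illustrative toy groundedness score based on token overlap; the function name and the 0–1 scale are assumptions for illustration, not the platform's built-in metric.

```python
# Toy sketch of the "heuristic" evaluation paradigm: groundedness as
# token overlap between an answer and its source context. Illustrative
# only -- not Future AGI's built-in groundedness implementation.
def groundedness_score(answer: str, context: str) -> float:
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    # Fraction of answer tokens that also appear in the context.
    return len(answer_tokens & context_tokens) / len(answer_tokens)

score = groundedness_score(
    "Paris is the capital of France",
    "France's capital city is Paris",
)
# 3 of the 6 answer tokens ("paris", "is", "capital") appear in the
# context, so the score is 0.5.
```

The LLM-as-judge and ML paradigms follow the same call shape but delegate scoring to a judge model or a trained classifier instead of a rule.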
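As a rough illustration of what a standalone guardrail scanner does, the snippet below flags email addresses as PII with a regex. The function name, return shape, and pattern are hypothetical and far simpler than the built-in scanners or vendor adapters described above.

```python
import re

# Minimal PII-style scanner sketch: detect email addresses in text.
# The API shape ({"flagged": ..., "entities": ...}) is an assumption
# for illustration, not the SDK's scanner interface.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def scan_pii(text: str) -> dict:
    matches = EMAIL_RE.findall(text)
    return {"flagged": bool(matches), "entities": matches}

result = scan_pii("contact me at a@b.com")
# result["flagged"] is True; result["entities"] is ["a@b.com"]
```

In a real pipeline a scanner like this would run inside the Gateway on inputs and outputs, blocking or redacting flagged content before it reaches the model or the user.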
The architecture employs Django + Go Gateway + React frontend, with PostgreSQL for metadata, ClickHouse for time-series trace data, Redis for caching, and RabbitMQ + Temporal for async orchestration. All components communicate via open interfaces (OTLP, OpenAI-compatible HTTP, SQL), allowing substitution at any layer. It supports self-hosted Docker, Kubernetes (Helm), and Cloud deployments, with SDKs published for both Python (PyPI) and JavaScript/TypeScript (npm).
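Because the Gateway exposes an OpenAI-compatible HTTP interface, existing OpenAI clients can target it by swapping the base URL. A minimal stdlib sketch of such a request body follows; the local URL, port, and model name are placeholder assumptions for a self-hosted deployment.

```python
import json

# Placeholder endpoint for a local Gateway deployment (URL and port
# are assumptions, not documented defaults).
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"

# Standard OpenAI-compatible chat completion payload; the Gateway
# routes the request to one of the configured providers.
payload = {
    "model": "gpt-4o-mini",  # example model name, routed by the Gateway
    "messages": [{"role": "user", "content": "Hello"}],
}
body = json.dumps(payload).encode()

# To send it, attach a Gateway virtual key, e.g.:
# urllib.request.Request(GATEWAY_URL, data=body,
#     headers={"Authorization": "Bearer <virtual-key>",
#              "Content-Type": "application/json"})
```

Since the wire format is unchanged, official OpenAI SDKs also work by setting their `base_url` to the Gateway address.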
Currently in the Nightly release phase (v0.5.4), with a stable version coming soon. The Cloud edition offers SOC 2 Type II and HIPAA compliance. Self-hosted versions send only anonymous usage counts—no trace data, prompts, or API keys—and telemetry can be fully disabled via environment variables.