A self-hosted alternative to LangSmith Deployments: an AI Agent backend service built on FastAPI and PostgreSQL, offering zero-code migration and full data sovereignty.
Positioning#
Aegra is a production-ready, self-hosted AI Agent backend engine released under the Apache License 2.0. Built on FastAPI and PostgreSQL, it serves as a zero-code, drop-in replacement for LangSmith Deployments (formerly LangGraph Platform), enabling seamless migration of LangGraph-based Agents to private infrastructure and eliminating both vendor lock-in and data-compliance risk.
Aegra does not provide LLM inference itself — it requires external LLM APIs (e.g., OpenAI) — and focuses on Agent runtime orchestration, state management, streaming communication, and observability.
Core Capabilities#
- Protocol Compatibility: Compatible with Agent Protocol for plug-and-play with Agent Chat UI, LangGraph Studio, AG-UI / CopilotKit; natively compatible with LangGraph SDK — migration requires zero client code changes
- Worker Architecture: Redis-based job queue with 30 concurrent runs per instance; lease-based crash recovery; horizontal scaling across multiple instances
- Real-time Streaming: 8 SSE stream modes; cross-instance Pub/Sub; automatic reconnection and event replay
- Human-in-the-loop: Built-in approval gates and user intervention points
- Persistent State: PostgreSQL Checkpoint persistence via LangGraph
- Semantic Storage: Vector embedding storage via pgvector extension
- Configurable Auth: Supports JWT, OAuth, Firebase, or no-auth mode; fully customizable via Python Handler
- Unified Observability: Fan-out tracing via OpenTelemetry, connectable to any OTLP backend (Langfuse, Phoenix, etc.)
- Custom Routing: Add custom FastAPI endpoints alongside the Agent Protocol API
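As an illustration of the last point, the sketch below defines a custom business route with a standard FastAPI APIRouter. Only the FastAPI side is standard; how the router is mounted onto Aegra's app is an assumption (left as a commented hint), since the exact hook is not documented here.

```python
from fastapi import APIRouter

# A plain FastAPI router carrying custom business endpoints.
router = APIRouter(prefix="/api")

@router.get("/health/llm")
async def llm_health() -> dict:
    # Example business check served next to the Agent Protocol routes.
    return {"status": "ok"}

# Hypothetical wiring: the exact hook Aegra exposes for mounting custom
# routers is an assumption; consult the Aegra docs for the real entry point.
# app.include_router(router)
```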
Architecture Overview#
Aegra uses a uv-managed monorepo workspace, with the core split into the server (libs/aegra-api) and the CLI tool (libs/aegra-cli).
- Access Layer: FastAPI-based, exposing Agent Protocol REST API with support for custom business route endpoints
- Orchestration & Execution Layer: Deep LangGraph integration for Agent state machine transitions, Subgraph support, and Checkpoint lifecycle management
- Data Persistence Layer: PostgreSQL for thread state, Checkpoint data, and Key-Value storage; pgvector extension for vector retrieval
- Async Scheduling & Communication Layer: Redis as core infrastructure for Worker task distribution, cross-instance SSE Pub/Sub, and lease-based crash recovery
- Observability Pipeline: OpenTelemetry standard protocol for internal metrics and trace collection, with fan-out to multiple external tracing backends
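For concreteness, the unit this stack orchestrates is an ordinary compiled LangGraph graph. The sketch below uses only the standard langgraph API; the model call is stubbed, since Aegra provides no inference of its own, and registering the graph with Aegra is a separate step (see Configuration).

```python
from langgraph.graph import StateGraph, MessagesState, START, END

def call_model(state: MessagesState) -> dict:
    # Stub: a real node would call an external LLM API (e.g., OpenAI),
    # because Aegra itself does not provide inference.
    return {"messages": [{"role": "assistant", "content": "Hello from the graph"}]}

builder = StateGraph(MessagesState)
builder.add_node("call_model", call_model)
builder.add_edge(START, "call_model")
builder.add_edge("call_model", END)

# At runtime, Aegra persists this graph's checkpoints to PostgreSQL.
graph = builder.compile()
```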
Deployment#
Supports Docker / Docker Compose native deployment, adaptable to PaaS and Kubernetes environments. In development mode, aegra dev automatically starts a built-in PostgreSQL instance.
Prerequisites: Python 3.12+, Docker.
CLI Installation (Recommended):
```bash
pip install aegra-cli
aegra init
cd <your-project>
cp .env.example .env
uv sync
uv run aegra dev
```
Note: install `aegra-cli`, not the `aegra` meta-package (the latter does not support version locking).
Run from Source:
```bash
git clone https://github.com/aegra/aegra.git
cd aegra
cp .env.example .env
docker compose up
```
After startup, open http://localhost:2026/docs for the auto-generated API documentation.
CLI Commands#
| Command | Description |
|---|---|
| `aegra init` | Interactively initialize a new project |
| `aegra init ./my-agent` | Create a project at the specified path |
| `aegra dev` | Start the dev server (hot reload + auto PostgreSQL) |
| `aegra serve` | Start the production server (no hot reload) |
| `aegra up` | Build and start all Docker services |
| `aegra down` | Stop Docker services |
| `aegra version` | Display version info |
Configuration#
- File Config: Core parameters via `aegra.json` in the project root
- Environment Variables: Sensitive info (e.g., LLM API keys) and runtime variables via the `.env` file
- API Interface: Follows the Agent Protocol standard, with FastAPI Swagger documentation
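The exact schema of `aegra.json` is not spelled out here. As a purely hypothetical sketch, a LangSmith-Deployments-compatible server would plausibly mirror the langgraph.json convention of mapping graph IDs to module paths; none of the keys below are confirmed Aegra fields:

```json
{
  "graphs": {
    "agent": "./graph.py:graph"
  },
  "env": ".env"
}
```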
Migration Example#
```python
import asyncio
from langgraph_sdk import get_client

async def main() -> None:
    # Point the unchanged LangGraph SDK client at a local Aegra server.
    client = get_client(url="http://localhost:2026")
    assistant = await client.assistants.create(graph_id="agent")
    thread = await client.threads.create()
    # Stream run events over SSE, exactly as against LangSmith Deployments.
    async for chunk in client.runs.stream(
        thread_id=thread["thread_id"],
        assistant_id=assistant["assistant_id"],
        input={"messages": [{"type": "human", "content": "Hello!"}]},
    ):
        print(chunk)

asyncio.run(main())
```
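This is unmodified LangGraph SDK client code; moving between LangSmith Deployments and Aegra changes only the URL passed to get_client, which is what the zero-code migration claim amounts to in practice.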
Use Cases#
- Teams migrating from LangSmith Deployments to self-hosted environments with zero code changes
- Enterprise Agent deployments with strict data residency compliance requirements
- Production-grade workloads requiring high availability, multi-tenancy, and horizontal scaling
- R&D teams needing deep integration of Agent runtime tracing with custom OpenTelemetry backends (Langfuse, Phoenix)
- Local rapid development and debugging of LangGraph Agents