A self-hosted alternative to LangSmith Deployments: an AI Agent backend service built on FastAPI and PostgreSQL, offering zero-code migration and full data sovereignty.
Positioning#
Aegra is a production-ready, self-hosted AI Agent backend engine released under the Apache License 2.0. Built on FastAPI and PostgreSQL, it serves as a zero-code, drop-in replacement for LangSmith Deployments (formerly LangGraph Platform), enabling seamless migration of LangGraph-based Agents to private infrastructure and eliminating both vendor lock-in and data-compliance risk.
Aegra does not provide LLM inference itself — it requires external LLM APIs (e.g., OpenAI) — and focuses on Agent runtime orchestration, state management, streaming communication, and observability.
Core Capabilities#
- Protocol Compatibility: Compatible with Agent Protocol for plug-and-play with Agent Chat UI, LangGraph Studio, AG-UI / CopilotKit; natively compatible with LangGraph SDK — migration requires zero client code changes
- Worker Architecture: Redis-based job queue with 30 concurrent runs per instance; lease-based crash recovery; horizontal scaling across multiple instances
- Real-time Streaming: 8 SSE stream modes; cross-instance Pub/Sub; automatic reconnection and event replay
- Human-in-the-loop: Built-in approval gates and user intervention points
- Persistent State: PostgreSQL Checkpoint persistence via LangGraph
- Semantic Storage: Vector embedding storage via pgvector extension
- Configurable Auth: Supports JWT, OAuth, Firebase, or no-auth mode; fully customizable via Python Handler
- Unified Observability: Fan-out tracing via OpenTelemetry, connectable to any OTLP backend (Langfuse, Phoenix, etc.)
- Custom Routing: Add custom FastAPI endpoints alongside the Agent Protocol API
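As an illustration of the last point, the sketch below defines a custom business route with a standard FastAPI APIRouter. Only the FastAPI side is standard; how the router is mounted onto Aegra's app is an assumption (left as a commented hint), since the exact hook is not documented here.

```python
from fastapi import APIRouter

# A plain FastAPI router carrying custom business endpoints.
router = APIRouter(prefix="/api")

@router.get("/health/llm")
async def llm_health() -> dict:
    # Example business check served next to the Agent Protocol routes.
    return {"status": "ok"}

# Hypothetical wiring: the exact hook Aegra exposes for mounting custom
# routers is an assumption; consult the Aegra docs for the real entry point.
# app.include_router(router)
```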
Architecture Overview#
Aegra uses a uv-managed monorepo workspace, with the core split into the server (libs/aegra-api) and the CLI tool (libs/aegra-cli).
- Access Layer: FastAPI-based, exposing Agent Protocol REST API with support for custom business route endpoints
- Orchestration & Execution Layer: Deep LangGraph integration for Agent state machine transitions, Subgraph support, and Checkpoint lifecycle management
- Data Persistence Layer: PostgreSQL for thread state, Checkpoint data, and Key-Value storage; pgvector extension for vector retrieval
- Async Scheduling & Communication Layer: Redis as core infrastructure for Worker task distribution, cross-instance SSE Pub/Sub, and lease-based crash recovery
- Observability Pipeline: OpenTelemetry standard protocol for internal metrics and trace collection, with fan-out to multiple external tracing backends
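For concreteness, the unit this stack orchestrates is an ordinary compiled LangGraph graph. The sketch below uses only the standard langgraph API; the model call is stubbed, since Aegra provides no inference of its own, and registering the graph with Aegra is a separate step (see Configuration).

```python
from langgraph.graph import StateGraph, MessagesState, START, END

def call_model(state: MessagesState) -> dict:
    # Stub: a real node would call an external LLM API (e.g., OpenAI),
    # because Aegra itself does not provide inference.
    return {"messages": [{"role": "assistant", "content": "Hello from the graph"}]}

builder = StateGraph(MessagesState)
builder.add_node("call_model", call_model)
builder.add_edge(START, "call_model")
builder.add_edge("call_model", END)

# At runtime, Aegra persists this graph's checkpoints to PostgreSQL.
graph = builder.compile()
```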
Deployment#
Supports Docker / Docker Compose native deployment, adaptable to PaaS and Kubernetes environments. In development mode, aegra dev automatically starts a built-in PostgreSQL instance.
Prerequisites: Python 3.12+, Docker.
CLI Installation (Recommended):
```bash
pip install aegra-cli
aegra init
cd <your-project>
cp .env.example .env
uv sync
uv run aegra dev
```
Note: install `aegra-cli`, not the `aegra` meta-package (the latter does not support version locking).
Run from Source:
```bash
git clone https://github.com/aegra/aegra.git
cd aegra
cp .env.example .env
docker compose up
```
After startup, open http://localhost:2026/docs for the auto-generated API documentation.
CLI Commands#
| Command | Description |
|---|---|
| `aegra init` | Interactively initialize a new project |
| `aegra init ./my-agent` | Create a project at the specified path |
| `aegra dev` | Start the dev server (hot reload + auto PostgreSQL) |
| `aegra serve` | Start the production server (no hot reload) |
| `aegra up` | Build and start all Docker services |
| `aegra down` | Stop Docker services |
| `aegra version` | Display version info |
Configuration#
- File Config: Core parameters via `aegra.json` in the project root
- Environment Variables: Sensitive info (e.g., LLM API keys) and runtime variables via the `.env` file
- API Interface: Follows the Agent Protocol standard, with FastAPI Swagger documentation
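The exact schema of `aegra.json` is not spelled out here. As a purely hypothetical sketch, a LangSmith-Deployments-compatible server would plausibly mirror the langgraph.json convention of mapping graph IDs to module paths; none of the keys below are confirmed Aegra fields:

```json
{
  "graphs": {
    "agent": "./graph.py:graph"
  },
  "env": ".env"
}
```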
Migration Example#
```python
import asyncio
from langgraph_sdk import get_client

async def main() -> None:
    # Point the unchanged LangGraph SDK client at a local Aegra server.
    client = get_client(url="http://localhost:2026")
    assistant = await client.assistants.create(graph_id="agent")
    thread = await client.threads.create()
    # Stream run events over SSE, exactly as against LangSmith Deployments.
    async for chunk in client.runs.stream(
        thread_id=thread["thread_id"],
        assistant_id=assistant["assistant_id"],
        input={"messages": [{"type": "human", "content": "Hello!"}]},
    ):
        print(chunk)

asyncio.run(main())
```
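This is unmodified LangGraph SDK client code; moving between LangSmith Deployments and Aegra changes only the URL passed to get_client, which is what the zero-code migration claim amounts to in practice.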
Use Cases#
- Teams migrating from LangSmith Deployments to self-hosted environments with zero code changes
- Enterprise Agent deployments with strict data residency compliance requirements
- Production-grade workloads requiring high availability, multi-tenancy, and horizontal scaling
- R&D teams needing deep integration of Agent runtime tracing with custom OpenTelemetry backends (Langfuse, Phoenix)
- Local rapid development and debugging of LangGraph Agents