exo

Added Apr 23, 2026
Model & Inference Framework
Open Source
Python · PyTorch · Large Language Models · Deep Learning · CLI · Model & Inference Framework · Model Training & Inference · Protocol, API & Integration

A distributed inference framework for running frontier LLMs across local device clusters, built on Apple MLX and libp2p, featuring automatic device discovery, topology-aware parallelism, and multi-API compatibility.

exo is a distributed LLM inference framework designed for local device clusters, enabling multiple consumer-grade devices to collaboratively run frontier models too large for a single machine (e.g., DeepSeek v3.1 671B, Qwen3-235B). Built on Apple MLX and MLX distributed for GPU-accelerated inference and cross-device communication, it leverages libp2p for zero-configuration automatic device discovery and cluster networking.
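To make "too large for a single machine" concrete, here is a back-of-envelope sizing sketch. The byte-per-parameter and overhead figures are illustrative assumptions (4-bit quantized weights, ~20% extra for KV cache and activations), not numbers from exo's documentation:

```python
import math

def min_devices(params_b: float, device_ram_gb: float,
                bytes_per_param: float = 0.5, overhead: float = 0.20) -> int:
    """Smallest number of devices whose pooled RAM fits the model.

    bytes_per_param=0.5 assumes 4-bit quantized weights; overhead covers
    KV cache and activations. Both are rough illustrative assumptions.
    """
    model_gb = params_b * 1e9 * bytes_per_param / 1e9  # weight memory in GB
    total_gb = model_gb * (1 + overhead)
    return math.ceil(total_gb / device_ram_gb)

# DeepSeek v3.1 (671B) at 4-bit: ~335 GB of weights alone,
# far beyond any single 128 GB machine.
print(min_devices(671, 128))  # → 4
print(min_devices(235, 128))  # → 2
```

Under these assumptions, a 671B model needs four 128 GB machines pooled together, which is exactly the gap a local cluster fills.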

For parallelism, exo supports both tensor parallelism and pipeline parallelism, with a topology-aware algorithm that evaluates device resources and network conditions in real time to automatically select the optimal sharding strategy. Benchmarks show up to a 1.8× speedup with 2 devices and 3.2× with 4 devices. On macOS 26.2+, exo offers day-0 support for RDMA over Thunderbolt 5, reducing inter-device latency by approximately 99%.
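The intuition behind topology-aware selection can be sketched with a toy heuristic (this is an illustration, not exo's actual algorithm, and the thresholds are made up): tensor parallelism generates frequent all-reduce traffic, so it only pays off on fast, low-latency links such as RDMA over Thunderbolt 5, while pipeline parallelism, which only passes activations point-to-point between stages, is the safer default on slower links.

```python
def choose_strategy(link_gbps: float, latency_us: float,
                    fast_link_gbps: float = 40.0,
                    low_latency_us: float = 50.0) -> str:
    """Pick a parallelism strategy from measured link properties.

    Thresholds are illustrative placeholders, not values from exo.
    """
    if link_gbps >= fast_link_gbps and latency_us <= low_latency_us:
        return "tensor-parallel"    # all-reduce traffic is cheap enough
    return "pipeline-parallel"      # point-to-point activations only

print(choose_strategy(80.0, 10.0))   # RDMA-class link → tensor-parallel
print(choose_strategy(1.0, 500.0))   # Wi-Fi-class link → pipeline-parallel
```

A real scheduler would also weigh per-device memory and compute, so that shards land proportionally to each machine's capacity, but the bandwidth/latency split above captures the core trade-off.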

For usability, exo simultaneously exposes the OpenAI Chat Completions, Claude Messages, OpenAI Responses, and Ollama API formats, allowing direct integration with existing toolchains. A built-in web dashboard provides management and chat interfaces. It supports offline mode, cluster namespace isolation, distributed tracing, and loading custom MLX models from the HuggingFace Hub.
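Because the exposed formats are standard, an existing client only needs its base URL pointed at the cluster. The sketch below builds an OpenAI-style Chat Completions request with the stdlib only; the host and port are placeholders, not values from exo's documentation:

```python
import json

def chat_request(model: str, prompt: str) -> tuple[str, str]:
    """Build the URL and JSON body for an OpenAI-style chat completion.

    The base address is a placeholder; substitute whatever host/port
    your exo cluster actually serves on.
    """
    url = "http://localhost:8000/v1/chat/completions"  # placeholder address
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, body

url, body = chat_request("qwen3-235b", "Hello from the cluster")
print(url)
print(body)
```

The same payload shape works with any OpenAI-compatible SDK by setting its `base_url` to the cluster address, which is what lets existing toolchains plug in without modification.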

macOS on Apple Silicon is the current Tier 1 platform; Linux supports CPU-only inference, with GPU support in development. Installation is available from a source build, via Nix, or as a macOS .dmg.
