UncommonRoute

A local proxy that automatically routes each LLM request to the cheapest still-capable model

UncommonRoute is a fully local LLM model routing proxy designed to significantly reduce LLM spend without sacrificing task completion rates. It routes at per-request / per-agent-step granularity using an ensemble voting mechanism across three signal types — Metadata (<1ms), Embedding (BGE-based classifier, ~25–35ms CPU warm), and Structural (<1ms) — automatically assigning each request to the cheapest capable model within economy / balanced / premium tiers.

Intelligent Routing Engine

Multi-signal ensemble voting: Metadata, Embedding, and Structural signals fused for decision-making
Embedding classifier: BGE-based, classifies user requests, recent agent state, and metadata with KNN fallback on uncertainty
Three routing modes: auto (balanced), fast (cost-first), best (quality-first)
Adaptive learning: local feedback adjusts signal weights; high-confidence consistency expands embedding index, low-confidence predictions escalate rather than silently downgrade

Proxy & Compatibility

Drop-in OpenAI proxy: compatible with /v1/chat/completions, directly replaceable OPENAI_BASE_URL
Drop-in Anthropic proxy: compatible with Anthropic API, directly replaceable ANTHROPIC_BASE_URL
Verified clients: Claude Code, Codex, Cursor, OpenAI SDK, OpenClaw (plugin mode, unconfirmed)

Operations & Governance

Spend limits: daily caps via uncommon-route spend set daily 20.00
Real-time dashboard: http://localhost:8403/dashboard/ with monitoring, playground, cost tracking, routing config
Diagnostic support bundle: uncommon-route support bundle exports logs/traces/config snapshots
Privacy-first: data stays local by default, telemetry is opt-in

Upstream Providers: 8 providers supported (commonstack, openai, anthropic, google, xai, minimax, moonshot, deepseek). Connection paths: Commonstack hosted, local/custom upstream, or BYOK.

Benchmark Performance (CommonRouterBench, 970 real agent trajectories, held-out 196):

Metric	Value
Task pass rate	91.8%
Tier match accuracy	74.0%
Cost-savings score	81.9
Overall score	76.7
Warm routing overhead	p50 25.6ms / p90 32.1ms (CPU)

Quick Start

pipx install uncommon-route
uncommon-route init
uncommon-route doctor
uncommon-route serve

Client config examples:

Claude Code: export ANTHROPIC_BASE_URL="http://localhost:8403"
Codex / Cursor / OpenAI SDK: export OPENAI_BASE_URL="http://localhost:8403/v1"

Architecture Highlights

Core package uncommon_route/ (Python 89%), frontend dashboard frontend/ (TypeScript 8.7%)
Routing flow: request arrives → parallel 3-signal extraction → ensemble vote determines tier → select cheapest model per mode → forward to upstream → collect feedback for weight adjustment
v1 achieved only 43% accuracy in real multi-turn agent conversations; v2 rebuilt from scratch, reaching 74% tier match accuracy
Package management: pyproject.toml + uv.lock, Docker containerization supported

Unconfirmed Information: Commonstack AI team background has no detailed public info; specific BGE embedding model version unspecified; "Commonstack" hosted upstream pricing/terms not public; dashboard UI has no screenshots for verification; OpenClaw integration details unconfirmed.

Related Projects

Genkit

Gobii Platform

Semble

STAY UPDATED