A local proxy that automatically routes each LLM request to the cheapest still-capable model
UncommonRoute is a fully local LLM model routing proxy designed to significantly reduce LLM spend without sacrificing task completion rates. It routes at per-request / per-agent-step granularity using an ensemble voting mechanism across three signal types — Metadata (<1ms), Embedding (BGE-based classifier, ~25–35ms CPU warm), and Structural (<1ms) — automatically assigning each request to the cheapest capable model within economy / balanced / premium tiers.
Intelligent Routing Engine
- Multi-signal ensemble voting: Metadata, Embedding, and Structural signals fused for decision-making
- Embedding classifier: BGE-based, classifies user requests, recent agent state, and metadata with KNN fallback on uncertainty
- Three routing modes:
auto(balanced),fast(cost-first),best(quality-first) - Adaptive learning: local feedback adjusts signal weights; high-confidence consistency expands embedding index, low-confidence predictions escalate rather than silently downgrade
Proxy & Compatibility
- Drop-in OpenAI proxy: compatible with
/v1/chat/completions, directly replaceableOPENAI_BASE_URL - Drop-in Anthropic proxy: compatible with Anthropic API, directly replaceable
ANTHROPIC_BASE_URL - Verified clients: Claude Code, Codex, Cursor, OpenAI SDK, OpenClaw (plugin mode, unconfirmed)
Operations & Governance
- Spend limits: daily caps via
uncommon-route spend set daily 20.00 - Real-time dashboard:
http://localhost:8403/dashboard/with monitoring, playground, cost tracking, routing config - Diagnostic support bundle:
uncommon-route support bundleexports logs/traces/config snapshots - Privacy-first: data stays local by default, telemetry is opt-in
Upstream Providers: 8 providers supported (commonstack, openai, anthropic, google, xai, minimax, moonshot, deepseek). Connection paths: Commonstack hosted, local/custom upstream, or BYOK.
Benchmark Performance (CommonRouterBench, 970 real agent trajectories, held-out 196):
| Metric | Value |
|---|---|
| Task pass rate | 91.8% |
| Tier match accuracy | 74.0% |
| Cost-savings score | 81.9 |
| Overall score | 76.7 |
| Warm routing overhead | p50 25.6ms / p90 32.1ms (CPU) |
Quick Start
pipx install uncommon-route
uncommon-route init
uncommon-route doctor
uncommon-route serve
Client config examples:
- Claude Code:
export ANTHROPIC_BASE_URL="http://localhost:8403" - Codex / Cursor / OpenAI SDK:
export OPENAI_BASE_URL="http://localhost:8403/v1"
Architecture Highlights
- Core package
uncommon_route/(Python 89%), frontend dashboardfrontend/(TypeScript 8.7%) - Routing flow: request arrives → parallel 3-signal extraction → ensemble vote determines tier → select cheapest model per mode → forward to upstream → collect feedback for weight adjustment
- v1 achieved only 43% accuracy in real multi-turn agent conversations; v2 rebuilt from scratch, reaching 74% tier match accuracy
- Package management:
pyproject.toml+uv.lock, Docker containerization supported
Unconfirmed Information: Commonstack AI team background has no detailed public info; specific BGE embedding model version unspecified; "Commonstack" hosted upstream pricing/terms not public; dashboard UI has no screenshots for verification; OpenClaw integration details unconfirmed.