DISCOVER THE FUTURE OF AI AGENTS

Token Optimizer

Added May 4, 2026
Agent & Tooling
Open Source
PythonNode.jsLarge Language ModelsCLIAgent & ToolingModel & Inference FrameworkDeveloper Tools & CodingData Analytics, BI & Visualization

Context token optimizer and visualizer for Claude Code / OpenClaw / Codex, featuring Active Compression, Smart Compaction, quality scoring, and cost dashboards with zero context overhead and zero network calls.

Problem Domain#

AI coding assistants face two categories of token waste in long sessions—"runtime waste" (redundant reads, unarchived large tool outputs, etc.) and "structural waste" (duplicate configs, unused skill front-matter, orphaned memory entries, dead MCP servers, etc., reportedly 75-85% of token consumption). Additionally, 60-70% of conversation content is lost to summarization during auto-compact; MRCR can degrade from 93% to 76% between 256K and 1M context, with no real-time degradation awareness.

Core Capabilities#

Active Compression (v5)#

7 independently toggleable compression features, all enabled by default:

FeatureDescriptionRiskEst. Savings
Quality NudgesQuality hintsNone
Loop DetectionLoop detectionNone
Delta ModeSmart re-readLow~20%
Structure MapLarge file re-read optimizationLowUp to 99%/file
Bash Compression16 processorsLow~10%
Activity Mode DetectionActivity mode detectionTBDTBD
Decision ExtractionDecision extractionTBD

Smart Compaction & Session Continuity#

  • Auto-establishes checkpoints before auto-compact triggers; restores content lost to summarization post-compression
  • Injects summaries of large tool outputs to avoid model re-reading
  • Progressive Checkpoints: auto-snapshots at 20%/35%/50%/65%/80% context fill and quality score drops to 80/70/50/40; selects richest available checkpoint on restore

Tool Result Management#

  • Tool Result Archive: tool outputs >4KB auto-archived to disk, replaced with inline preview + [expand <id>] hint
  • Model can call expand post-compression to recover full content

Quality Assessment#

  • 7-Signal Quality Scoring: context fill (20%), stale reads (20%), bloated results (20%), compaction depth (15%), duplication (10%), decision density (8%), agent efficiency (7%)
  • Outputs S/A/B/C/D/F efficiency grades

Visualization & Cost Tracking#

Single-file HTML dashboard auto-regenerates after each SessionEnd, covering:

  • Per-turn token breakdown (input/output/cache-read/cache-write + peak detection)
  • Cache analysis (TTL mix, hit rate)
  • 4 pricing tier costs (Anthropic API, Vertex Global, Vertex Regional, AWS Bedrock)
  • Quality score overlay (green/yellow/red)
  • Sub-agent cost breakdown
  • Skill adoption trends & model mix (Opus/Sonnet/Haiku)
  • CLAUDE.md / MEMORY.md health cards
  • Config drift detection & cumulative savings tracking

Structural Optimization#

Handles structural waste: duplicate configs, unused skill front-matter, orphaned memory entries, dead MCP servers.

Architecture Highlights#

  • External process model: no LLM context injection, no MCP overhead, no network calls (zero phone-home)
  • Core entry point: measure.py (pure Python stdlib); OpenClaw portion uses pure Node.js stdlib
  • Storage: local SQLite (~/.claude/_backups/token-optimizer/trends.db, etc.)
  • Hook mechanism: based on each target platform's hook system (hooks/ directory)
  • Checkpoints: no LLM calls, pure deterministic extraction + background daemon
  • Zero runtime dependencies: pure standard libraries, no pip/npm required

Installation & Quick Start#

Claude Code (recommended, all platforms):

/plugin marketplace add alexgreensh/token-optimizer
/plugin install token-optimizer@alexgreensh-token-optimizer
/token-optimizer

macOS / Linux alternative: use install.sh in repo root

Dashboard launch:

python3 measure.py setup-daemon           # daemon mode
python3 measure.py dashboard --serve      # one-shot HTTP server

Smart Compaction enable:

python3 measure.py setup-smart-compact

Use Cases#

  • Heavy Opus users with frequent long sessions (cited case: 30 days, 942 sessions, 6.13B input tokens, ~$590 monthly savings)
  • Teams or individuals needing per-turn cost visibility and sub-agent cost attribution
  • Scenarios where compaction causes context loss requiring recovery

Supported Platforms#

  • Claude Code (.claude-plugin)
  • OpenClaw (openclaw/ directory, Node.js)
  • Codex beta (.codex-plugin)
  • Windsurf / Cursor: planned, no timeline

Unconfirmed Information#

  • Specific version number requires git tag/commit check
  • Windows support details incomplete
  • Savings data and risk levels for Activity Mode Detection / Decision Extraction not provided
  • $590 monthly savings is a single-user snapshot, not a systematic benchmark
  • Minimum Claude Code version requirement for hook compatibility not specified

Related Projects

View All

STAY UPDATED

Get the latest AI tools and trends delivered straight to your inbox. No spam, just intelligence.