An MCP server that optimizes context windows for AI coding agents through sandboxed execution, session persistence, and output compression — reducing context consumption by up to 98%.
Context Mode is a context window optimization tool for AI coding agents, running as an MCP server. It addresses context overflow and state loss in long conversations.
## Core Mechanisms
Sandboxed Tool Execution (Context Saving): Isolates raw data outside the context window; only stdout enters the conversation. Measured 315 KB → 5.4 KB, ~98% context reduction. Supports 11 language runtimes: JavaScript, TypeScript, Python, Shell, Ruby, Go, Rust, PHP, Perl, R, Elixir.
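A minimal sketch of the context-saving idea, using Node's `child_process` (the function name `sandboxedExec` and the truncation limit are illustrative, not the project's API):

```typescript
import { spawnSync } from "node:child_process";

// Illustrative sketch: only captured stdout (optionally truncated)
// re-enters the conversation; the raw working data stays inside the
// child process and never touches the context window.
function sandboxedExec(cmd: string, args: string[], maxBytes = 5_000): string {
  const result = spawnSync(cmd, args, { encoding: "utf8", timeout: 30_000 });
  const out = result.stdout ?? "";
  return out.length > maxBytes
    ? out.slice(0, maxBytes) + `\n…[truncated, ${out.length} bytes total]`
    : out;
}

// The agent's script reduces a large in-process buffer to a one-line
// answer; only that line enters context.
const summary = sandboxedExec("node", ["-e",
  `const big = "x".repeat(315_000); console.log("lines:", big.length)`]);
console.log(summary.trim()); // prints "lines: 315000"
```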
Session Continuity: Every file edit, git operation, task, error, and user decision is persisted to a per-project SQLite database. After a context compaction, FTS5 full-text indexing with BM25 search retrieves only the relevant records to restore working state. Supports `--continue` for cross-session recovery; sessions started without it auto-delete their history. Relies on 4 hooks: PreToolUse, PostToolUse, PreCompact, SessionStart.
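The hook-driven persistence flow above can be sketched as follows. The real implementation writes to per-project SQLite with FTS5 + BM25; the in-memory log and naive keyword scoring here are stand-ins, and all names are hypothetical:

```typescript
// Hypothetical shapes for the hook-driven session log.
type Hook = "PreToolUse" | "PostToolUse" | "PreCompact" | "SessionStart";

interface SessionEvent {
  hook: Hook;
  kind: "file_edit" | "git" | "task" | "error" | "decision";
  text: string;
  ts: number;
}

const log: SessionEvent[] = [];

// PostToolUse-style hooks append events as work happens.
function record(hook: Hook, kind: SessionEvent["kind"], text: string): void {
  log.push({ hook, kind, text, ts: Date.now() });
}

// On SessionStart after a compaction, retrieve only the events relevant
// to the query instead of replaying the whole history. (The real system
// ranks with BM25; this keyword-overlap score is a stand-in.)
function restore(query: string, limit = 3): SessionEvent[] {
  const terms = query.toLowerCase().split(/\s+/);
  return log
    .map(e => ({
      e,
      score: terms.filter(t => e.text.toLowerCase().includes(t)).length,
    }))
    .filter(s => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(s => s.e);
}

record("PostToolUse", "file_edit", "edited src/session/db.ts to add snapshot table");
record("PostToolUse", "git", "committed: fix FTS5 trigram tokenizer config");
record("PostToolUse", "error", "vitest failure in extract module");
console.log(restore("FTS5 tokenizer")[0].kind); // "git"
```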
Output Compression: Prompt-level rules reduce LLM output redundancy (filler words, pleasantries, verbose explanations), achieving ~65-75% output token reduction while maintaining technical accuracy.
Think in Code: Forces a paradigm shift: the LLM writes scripts to process data instead of reading raw data into context. A single `ctx_execute("javascript", ...)` call replaces multiple `Read()` calls, cutting context use by roughly 100×.
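A hedged illustration of the pattern: the script below is the kind of payload an agent might pass to `ctx_execute`, so that only the filtered result, not the scanned files, reaches the conversation (the temp files exist only for the demo):

```typescript
import { mkdtempSync, writeFileSync, readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Demo fixture: two log files that would otherwise be Read() into context.
const dir = mkdtempSync(join(tmpdir(), "ctx-demo-"));
writeFileSync(join(dir, "a.log"), "ok\nERROR: disk full\nok\n");
writeFileSync(join(dir, "b.log"), "ok\nok\n");

// The agent's script scans files inside the sandbox and emits only the
// matching lines; everything else stays out of the context window.
const hits: string[] = [];
for (const name of readdirSync(dir)) {
  for (const line of readFileSync(join(dir, name), "utf8").split("\n")) {
    if (line.startsWith("ERROR")) hits.push(`${name}: ${line}`);
  }
}
console.log(hits.join("\n")); // "a.log: ERROR: disk full"
```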
## Knowledge Base
Built-in SQLite FTS5 full-text indexing with BM25 ranking. Dual strategy + RRF (Reciprocal Rank Fusion): Porter stemmer matching + Trigram substring matching. Supports Proximity Reranking (adjacent matches in multi-word queries get weighted boosts), Fuzzy Correction (Levenshtein distance auto-correction), and Smart Snippets (intelligent window extraction around match terms). URL indexes use 24-hour TTL cache with 14-day auto-cleanup.
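Reciprocal Rank Fusion itself is compact enough to sketch. In this illustration, `k = 60` follows the original RRF paper; the project's actual constant and tie-breaking behavior are not documented here:

```typescript
// RRF: each document's fused score is the sum of 1/(k + rank) over the
// rankings it appears in (rank is 1-based). Documents that both the
// Porter-stemmed search and the trigram substring search surface get
// boosted toward the top.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((doc, i) => {
      scores.set(doc, (scores.get(doc) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([doc]) => doc);
}

const porter = ["chunk-a", "chunk-b", "chunk-c"]; // stemmed keyword hits
const trigram = ["chunk-b", "chunk-d"];           // substring hits
console.log(rrfFuse([porter, trigram]));
// chunk-b ranks first: it appears in both lists.
```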
## Tool Set
6 Sandbox Tools:
- `ctx_batch_execute`: Run multiple commands + searches in one call (1-8 concurrency); 986 KB → 62 KB
- `ctx_execute`: 11-language sandbox execution; 56 KB → 299 B
- `ctx_execute_file`: File sandbox processing; 45 KB → 155 B
- `ctx_index`: Markdown chunking by heading + FTS5 indexing; 60 KB → 40 B
- `ctx_search`: Multi-query parallel search of indexed content
- `ctx_fetch_and_index`: Fetch URL → markdown → chunk + index; 24h TTL cache
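The bounded concurrency of `ctx_batch_execute` can be sketched as a small promise pool; the helper name and the 1-8 clamping details are assumptions:

```typescript
// Run tasks with at most `limit` in flight, clamped to the 1-8 range
// the tool documents. Results keep the original task order.
async function batchExecute<T>(
  tasks: (() => Promise<T>)[],
  limit: number,
): Promise<T[]> {
  const bounded = Math.min(8, Math.max(1, limit));
  const results: T[] = new Array(tasks.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++; // single-threaded: no race between read and increment
      results[i] = await tasks[i]();
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(bounded, tasks.length) }, worker),
  );
  return results;
}

const out = await batchExecute(
  [1, 2, 3, 4].map(n => async () => n * n),
  2,
);
console.log(out); // [1, 4, 9, 16]
```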
5 Meta Tools: `ctx_stats` (statistics), `ctx_doctor` (diagnostics), `ctx_upgrade` (upgrade from GitHub), `ctx_purge` (clear all indexes), `ctx_insight` (open the analysis dashboard)
Intent-Driven Filtering: When tool output exceeds 5 KB and an intent parameter is provided, the full text is auto-indexed, searched by intent, and only the relevant matches are returned.
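A simplified sketch of the threshold-plus-intent logic; the real path indexes with FTS5, so the line-level keyword filter here is a stand-in, and the function name is hypothetical:

```typescript
const THRESHOLD = 5 * 1024; // the documented 5 KB cutoff

// Small outputs (or calls without an intent) pass through untouched;
// large outputs are filtered down to the lines matching the intent.
function filterByIntent(output: string, intent?: string): string {
  if (!intent || output.length <= THRESHOLD) return output;
  const terms = intent.toLowerCase().split(/\s+/);
  return output
    .split("\n")
    .filter(line => {
      const lower = line.toLowerCase();
      return terms.some(t => lower.includes(t));
    })
    .join("\n");
}

// ~6 KB of noise plus one relevant line:
const noise = Array.from({ length: 500 }, (_, i) => `row ${i}: ok`);
noise.push("row 500: connection timeout in db pool");
const filtered = filterByIntent(noise.join("\n"), "timeout");
console.log(filtered); // "row 500: connection timeout in db pool"
```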
Progressive Throttling: Calls 1-3 return normally; calls 4-8 return reduced results with a warning; from call 9 onward, requests are redirected to ctx_batch_execute.
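The throttling tiers map naturally to a small policy function; the warning text and exact reduction behavior are assumptions:

```typescript
// The three documented tiers: normal (1-3), reduced + warning (4-8),
// redirect to ctx_batch_execute (9+).
type Throttle =
  | { mode: "normal" }
  | { mode: "reduced"; warning: string }
  | { mode: "redirect"; to: "ctx_batch_execute" };

function throttlePolicy(callCount: number): Throttle {
  if (callCount <= 3) return { mode: "normal" };
  if (callCount <= 8)
    return {
      mode: "reduced",
      warning: "High call volume: results trimmed; consider ctx_batch_execute.",
    };
  return { mode: "redirect", to: "ctx_batch_execute" };
}

console.log(throttlePolicy(2).mode); // "normal"
console.log(throttlePolicy(5).mode); // "reduced"
console.log(throttlePolicy(9).mode); // "redirect"
```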
## Insight Dashboard
Local Web UI analytics panel: 90 metrics, 37 insight patterns, 4 composite scores (productivity, quality, delegation, context health), 23 event categories.
## Platform Adaptation
Adapted for 14 mainstream AI coding agent platforms: Claude Code (plugin marketplace auto-install), Gemini CLI, VS Code Copilot, JetBrains Copilot, Cursor, Kiro, OpenCode, KiloCode, Codex CLI, Qwen Code, Antigravity, Zed, OpenClaw, Pi Agent. Platforms without hooks (e.g., Antigravity, Zed) fall back to MCP-only mode with session continuity unavailable.
## Architecture Overview
MCP server built on @modelcontextprotocol/sdk, written in TypeScript and bundled with esbuild into server.bundle.mjs. Entry points: src/server.ts (MCP server), src/cli.ts (CLI), src/session/ (session management with extract, snapshot, and db modules). Data is stored in per-project SQLite databases under ~/.context-mode/content/. SQLite backend auto-selection: Bun → bun:sqlite; Linux + Node.js ≥ 22.13 → node:sqlite; otherwise → better-sqlite3. The Insight Dashboard frontend uses Vite + Tailwind CSS. Auth CLI tools (gh, aws, gcloud, kubectl, docker) inherit credentials via environment variables and config paths without exposing them to the conversation. Testing uses vitest with benchmarks.
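The backend auto-selection reduces to a version check; this sketch assumes the detection inputs (Bun flag, platform string, version string) rather than the project's actual probing code:

```typescript
type SqliteBackend = "bun:sqlite" | "node:sqlite" | "better-sqlite3";

// Pick the SQLite driver per the documented rules: Bun gets its
// built-in, Linux on Node >= 22.13 gets node:sqlite, everything else
// falls back to better-sqlite3.
function pickBackend(
  isBun: boolean,
  platform: string,     // e.g. "linux", "darwin"
  nodeVersion: string,  // e.g. "22.14.1"
): SqliteBackend {
  if (isBun) return "bun:sqlite";
  const [major, minor] = nodeVersion.split(".").map(Number);
  if (platform === "linux" && (major > 22 || (major === 22 && minor >= 13))) {
    return "node:sqlite";
  }
  return "better-sqlite3";
}

console.log(pickBackend(false, "linux", "22.13.0")); // "node:sqlite"
console.log(pickBackend(false, "darwin", "23.0.0")); // "better-sqlite3"
console.log(pickBackend(true, "linux", "20.0.0"));   // "bun:sqlite"
```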
## Unconfirmed Items
- Author Mert Koseoğlu's organizational affiliation is unclear
- Production deployment scale unspecified (README shows logos without naming companies)
- Elastic License 2.0 prohibits use as a hosted service for third parties; SaaS scenarios need compliance review
- 98% context reduction baseline conditions documented in BENCHMARK.md but not independently verified
- No standalone documentation site; README is the sole documentation source