A local code knowledge graph tool built on Tree-sitter that provides precise context to AI coding assistants via MCP, achieving an average 8.2× token reduction with support for 24 programming languages.
code-review-graph parses code repositories into structured knowledge graphs (nodes for functions, classes, and imports; edges for call relationships, inheritance, and test coverage), all stored in a local SQLite database with zero external dependencies. The core parsing engine is built on Tree-sitter for multi-language AST parsing and supports 24 languages, including Python, TypeScript, Go, Rust, and Java, plus Jupyter/Databricks notebooks.
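The pipeline can be pictured roughly as follows. This is a minimal sketch, not the project's actual code: it substitutes Python's stdlib ast module for Tree-sitter and uses an illustrative two-table schema (the real node and edge tables are assumptions here).

```python
import ast
import sqlite3

def index_source(db: sqlite3.Connection, path: str, source: str) -> None:
    """Parse one file and store function/class nodes plus call edges.

    Illustrative schema only; the real project's tables differ.
    """
    db.executescript("""
        CREATE TABLE IF NOT EXISTS nodes(
            id INTEGER PRIMARY KEY, file TEXT, name TEXT, kind TEXT);
        CREATE TABLE IF NOT EXISTS edges(src TEXT, dst TEXT, kind TEXT);
    """)
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            db.execute("INSERT INTO nodes(file, name, kind) VALUES (?, ?, ?)",
                       (path, node.name, kind))
            # Record a call edge from this definition to each simple name it calls.
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    db.execute("INSERT INTO edges VALUES (?, ?, 'calls')",
                               (node.name, sub.func.id))

db = sqlite3.connect(":memory:")
index_source(db, "demo.py", "def helper():\n    pass\n\ndef main():\n    helper()\n")
print(db.execute("SELECT src, dst FROM edges").fetchall())  # → [('main', 'helper')]
```

Tree-sitter plays the role of ast.parse here, but works across all 24 supported languages from a single parsing engine.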
On top of the graph, the project implements rich analysis capabilities: blast-radius analysis traces all callers, dependencies, and affected tests reachable from changed files; Leiden community detection automatically clusters related code modules; betweenness centrality identifies hub nodes and architectural bottlenecks; surprise scoring flags unexpected cross-community and cross-language coupling; and knowledge-gap analysis locates isolated nodes and untested hotspots. Search is hybrid: FTS5 BM25 results are fused with vector-embedding results via Reciprocal Rank Fusion, with optional embedding backends ranging from local sentence-transformers models to remote APIs.
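Reciprocal Rank Fusion itself is a small formula: each document scores the sum of 1/(k + rank) over every ranked list it appears in. A minimal sketch (the function name and example symbols are illustrative; k = 60 is the conventional constant from the original RRF paper):

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists into one ordering.

    A document's score is sum(1 / (k + rank)) across all lists
    that contain it, so agreement between rankers is rewarded.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical BM25 and vector rankings over code symbols:
bm25 = ["parse_file", "build_graph", "load_config"]
vectors = ["build_graph", "render_svg", "parse_file"]
print(rrf_fuse([bm25, vectors]))
# → ['build_graph', 'parse_file', 'render_svg', 'load_config']
```

build_graph wins because it ranks highly in both lists, even though neither ranker placed it first in isolation; that robustness to disagreement is why RRF is a common fusion choice.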
The incremental update mechanism uses git diff and SHA-256 file hashing to re-parse only what changed; in benchmarks, a 2,900-file project re-indexes in under 2 seconds. Through the FastMCP framework, the project exposes 28 tools alongside 5 built-in prompt templates (review_changes, architecture_map, debug_issue, onboard_developer, pre_merge_check). Visualization uses D3.js force-directed graphs with search, community legends, and degree-scaled nodes; exports cover GraphML, Neo4j Cypher, Obsidian wikilinks, and SVG. The crg-daemon can monitor multiple repositories simultaneously.
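The hashing half of that change detection can be sketched as follows. Function and structure are illustrative assumptions, not the project's API; the real tool also consults git diff rather than rescanning every file:

```python
import hashlib
from pathlib import Path

def changed_files(root: Path, index: dict[str, str]) -> list[str]:
    """Return files whose SHA-256 differs from the stored index.

    The index (path -> hex digest) is updated in place, so the next
    run only reports files edited since this one. Illustrative only.
    """
    changed = []
    for path in sorted(root.rglob("*.py")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        rel = str(path.relative_to(root))
        if index.get(rel) != digest:
            changed.append(rel)
            index[rel] = digest
    return changed

# Demo in a throwaway directory:
import tempfile
root = Path(tempfile.mkdtemp())
(root / "a.py").write_text("x = 1\n")
index: dict[str, str] = {}
print(changed_files(root, index))  # first run: every file is "changed"
(root / "a.py").write_text("x = 2\n")
print(changed_files(root, index))  # only the edited file needs re-parsing
```

Only files in the returned list are handed back to the parser, which is what keeps re-indexing large projects fast.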
In benchmarks across 6 real open-source repositories, token reduction factors were: Next.js 10.8×, httpx 9.3×, FastAPI 8.0×, gin 6.7×, and Flask 6.0×, averaging 8.2×. Daily coding scenarios reached up to 49× reduction, and in a large monorepo case 27,700+ files were excluded from review context, with only ~15 files actually read. The codebase is 90.8% Python and 8.8% TypeScript, requires Python 3.10+, and is ready in three steps: pip install code-review-graph && code-review-graph install && code-review-graph build, where install auto-detects and configures MCP settings for every installed AI platform, including Claude Code, Cursor, and Copilot.
The project is MIT-licensed with 440+ commits, 80 contributors, 23 releases (latest v2.3.2), and 570+ tests.