An agent-powered vulnerability scanner for large codebases, featuring multi-stage pipelines, incremental recovery, and distributed execution.
deepsec is an agent-powered vulnerability scanner developed by Vercel Labs (led by Malte Ubl / cramforce), designed for on-demand deep security audits of large application and service codebases. It is licensed under Apache 2.0, written almost entirely in TypeScript (99.3%), and managed as a pnpm workspace monorepo.
Multi-Stage Scanning Pipeline
The project employs a "regex pre-scan → AI deep investigation → revalidation" architecture:
- scan: Pure regex matching of candidate security-sensitive sites with no AI calls; takes roughly 15 seconds per 2,000 files
- process: Invokes Claude (Opus 4.7 max effort) or Codex (GPT 5.5 xhigh reasoning) for data flow tracing and mitigation checks, producing severity-rated findings with fix suggestions
- revalidate: Second AI review that also checks git history for prior fixes, cutting false positives by roughly 50% or more
- enrich: Attaches git committer information and ownership data
- export: Outputs Markdown or JSON format reports
The baseline false positive rate is roughly 10-20% and drops further after revalidation. The Claude and Codex backends share the same prompt and JSON output schema, so they can be mixed.
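The regex pre-scan stage can be pictured with a minimal sketch. Everything named here (`Matcher`, `Candidate`, `scanSource`, the two example patterns) is a hypothetical illustration, not deepsec's actual matcher interface or candidate schema; the point is only that this stage flags sites cheaply and leaves judgment to the later AI stages.

```typescript
// Hypothetical types: NOT deepsec's real matcher or candidate schema.
interface Matcher {
  id: string;
  pattern: RegExp; // no global flag, so test() carries no lastIndex state
}

interface Candidate {
  file: string;
  line: number; // 1-based line number
  matcherId: string;
  snippet: string;
}

// Two illustrative patterns; a real rule set would be far larger.
const matchers: Matcher[] = [
  { id: "dynamic-eval", pattern: /\beval\s*\(/ },
  { id: "child-process-exec", pattern: /\bexec(?:Sync)?\s*\(/ },
];

// Flag every line that matches any rule. No AI call happens here.
function scanSource(file: string, source: string, rules: Matcher[]): Candidate[] {
  const candidates: Candidate[] = [];
  source.split("\n").forEach((text, index) => {
    for (const rule of rules) {
      if (rule.pattern.test(text)) {
        candidates.push({
          file,
          line: index + 1,
          matcherId: rule.id,
          snippet: text.trim(),
        });
      }
    }
  });
  return candidates;
}

const sample = [
  "import { exec } from 'node:child_process';",
  "export function run(cmd: string) {",
  "  exec(cmd);",
  "}",
].join("\n");

const found = scanSource("src/run.ts", sample, matchers);
console.log(found);
```

In this sketch only the call site `exec(cmd);` is flagged; the import line is not, because the pattern requires an opening parenthesis. A candidate like this would then be handed to the process stage for data flow tracing.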
Runtime Resilience & Distributed Execution
- Idempotent design: Each stage reads/writes a consistent disk representation; re-runs merge new information rather than overwriting, with a purely append-only merge model
- Incremental recovery: Re-running the same command after interruption automatically skips already-analyzed files and resumes from the checkpoint
- Concurrency safety: Atomic file claiming via `lockedByRunId`, enabling conflict-free parallel multi-worker execution
- Distributed execution: Fan-out to massive parallelism via Vercel Sandbox micro-VMs (Vercel itself has scaled to 1000+ concurrent sandboxes); in sandbox mode, API keys are injected externally and network egress is restricted to coding agent hosts
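The claiming, resume, and append-only merge behavior described above can be sketched in memory as follows. Only the field name `lockedByRunId` comes from the project description; the `FileRecord` shape, `tryClaim`, and `mergeFindings` are assumptions made for illustration. The sketch is single-process, so it does not demonstrate real atomicity: an on-disk version would need an atomic primitive (for example, an exclusive-create file open), and how deepsec achieves this is not stated here.

```typescript
// Hypothetical record shape; only `lockedByRunId` is named in the source text.
interface FileRecord {
  path: string;
  lockedByRunId: string | null;
  analyzed: boolean;
  findings: string[];
}

type Store = Map<string, FileRecord>;

// Claim a file for a run. Returns false if the file is unknown,
// already analyzed (resume case), or held by a different run.
function tryClaim(store: Store, path: string, runId: string): boolean {
  const record = store.get(path);
  if (!record || record.analyzed) return false;
  if (record.lockedByRunId !== null && record.lockedByRunId !== runId) return false;
  record.lockedByRunId = runId;
  return true;
}

// Append-only merge: existing findings are kept, new ones are added,
// and duplicates are ignored, so re-running is idempotent.
function mergeFindings(store: Store, path: string, incoming: string[]): void {
  const record = store.get(path);
  if (!record) return;
  for (const finding of incoming) {
    if (!record.findings.includes(finding)) record.findings.push(finding);
  }
  record.analyzed = true;
  record.lockedByRunId = null; // release the claim
}

const store: Store = new Map<string, FileRecord>([
  ["a.ts", { path: "a.ts", lockedByRunId: null, analyzed: false, findings: [] }],
  ["b.ts", { path: "b.ts", lockedByRunId: null, analyzed: true, findings: ["sqli"] }],
]);

const claimedA1 = tryClaim(store, "a.ts", "run-1"); // first worker wins
const claimedA2 = tryClaim(store, "a.ts", "run-2"); // second worker is refused
const claimedB = tryClaim(store, "b.ts", "run-1"); // already analyzed, skipped

mergeFindings(store, "a.ts", ["xss"]);
mergeFindings(store, "a.ts", ["xss", "ssrf"]); // re-run adds only the new finding

const findingsA = store.get("a.ts")?.findings ?? [];
console.log(claimedA1, claimedA2, claimedB, findingsA);
```

The design choice worth noting is that merging never deletes: a repeated or interrupted run can only add information, which is what makes re-running the same command safe.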
CLI Commands
| Command | Function |
|---|---|
| `scan` | Regex matching of candidate security-sensitive sites |
| `process` | AI deep investigation, producing findings + suggestions |
| `process --diff` | PR mode: scan only diff-changed files, suitable for CI gating |
| `triage` | Lightweight P0/P1/P2 classification (using cheaper models) |
| `revalidate` | Second review of existing findings |
| `enrich` | Add git committer info + ownership data |
| `report` | Single-project Markdown + JSON summary |
| `export` | Per-finding JSON or Markdown file directory |
| `metrics` | Cross-project statistics: severity, vulnerability types, true positive rate |
| `sandbox <cmd>` | Execute any command on Vercel Sandbox micro-VMs |
Plugin System
The plugin system exposes six extension points (matchers, notifiers, agents, ownership, people, executor), defined in `packages/core/src/plugin.ts`. These allow, for example, custom scan matchers and notifiers.
AI Provider Configuration
Priority from high to low:
- Explicit environment variables: `ANTHROPIC_AUTH_TOKEN` + `ANTHROPIC_BASE_URL` or OpenAI equivalents
- Vercel AI Gateway (recommended for production): `AI_GATEWAY_API_KEY=vck_...`, one key covering both Claude and Codex with default quotas tuned for high concurrency
- Local fallback: Automatically uses logged-in claude/codex subscriptions
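The priority order above amounts to a first-match lookup over the environment. The sketch below is an assumption about how such a lookup could be written, not deepsec's actual resolution code: `resolveProvider` and `ProviderSource` are hypothetical names, the requirement that both Anthropic variables be set together is a guess, and the OpenAI equivalents are omitted because their variable names are not given here.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical result type describing which credential source was chosen.
type ProviderSource =
  | { kind: "explicit"; baseUrl: string; token: string }
  | { kind: "gateway"; apiKey: string }
  | { kind: "local-subscription" };

// First match wins, from highest to lowest priority.
function resolveProvider(env: Env): ProviderSource {
  // 1. Explicit environment variables (assumed here to require both).
  const token = env["ANTHROPIC_AUTH_TOKEN"];
  const baseUrl = env["ANTHROPIC_BASE_URL"];
  if (token && baseUrl) return { kind: "explicit", baseUrl, token };

  // 2. Vercel AI Gateway key.
  const apiKey = env["AI_GATEWAY_API_KEY"];
  if (apiKey) return { kind: "gateway", apiKey };

  // 3. Fall back to a locally logged-in claude/codex subscription.
  return { kind: "local-subscription" };
}

const withEverything = resolveProvider({
  ANTHROPIC_AUTH_TOKEN: "example-token",
  ANTHROPIC_BASE_URL: "https://example.invalid",
  AI_GATEWAY_API_KEY: "vck_example",
});
const gatewayOnly = resolveProvider({ AI_GATEWAY_API_KEY: "vck_example" });
const nothingSet = resolveProvider({});

console.log(withEverything.kind, gatewayOnly.kind, nothingSet.kind);
```

In a real process the function would be called with `process.env`; passing a plain object keeps the sketch easy to test.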
Quick Start

    npx deepsec init # Create .deepsec/ directory
    cd .deepsec
    pnpm install
    pnpm deepsec scan
    pnpm deepsec process
    pnpm deepsec revalidate # Optional
    pnpm deepsec export --format md-dir --out ./findings

Use Cases & Known Limitations
- Large monorepo full security audits (used in production by Vercel itself, Unkey, and dub.co)
- PR-level security gating and CI integration
- Hidden vulnerability mining in legacy code
- Best suited for application and service codebases; library and framework projects require custom prompts and scanners, and applicability there is unverified
- A single large-scale scan may cost thousands to tens of thousands of dollars; no precise cost formula is available
- Distributed execution strongly depends on Vercel Sandbox; alternatives for non-Vercel infrastructure are not documented
- npm package publication status is unconfirmed (the GitHub Releases and Packages pages both appear empty, so the project may be pre-release)