deepsec

An agent-powered vulnerability scanner for large codebases, featuring multi-stage pipelines, incremental recovery, and distributed execution.

deepsec is an agent-powered vulnerability scanner developed by Vercel Labs (led by Malte Ubl / cramforce), designed for on-demand deep security audits of large application and service codebases. It is licensed under Apache 2.0, written almost entirely in TypeScript (99.3%), and organized as a pnpm workspace monorepo.

Multi-Stage Scanning Pipeline

The project employs a "regex pre-scan → AI deep investigation → revalidation" architecture:

scan: Pure regex matching of candidate security-sensitive sites, with no AI calls; takes roughly 15 seconds per 2,000 files
process: Invokes Claude (Opus 4.7 max effort) or Codex (GPT 5.5 xhigh reasoning) for data flow tracing and mitigation checks, producing severity-rated findings with fix suggestions
revalidate: A second AI review that also checks git history for prior fixes, reducing false positives by roughly 50% or more
enrich: Attaches git committer information and ownership data
export: Outputs reports in Markdown or JSON format

The baseline false positive rate is roughly 10-20% and drops further after revalidation. The Claude and Codex backends share the same prompt and JSON output schema, so the two can be mixed.
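
To make the first stage concrete, here is a minimal sketch of a regex-only pre-scan that emits candidate sites for later AI investigation. The `Matcher` and `Candidate` types, the `scanFile` function, and the two example patterns are all assumptions for illustration; the source does not show deepsec's actual matcher format or rule set.

```typescript
// Hypothetical types: deepsec's real matcher and candidate shapes are not
// shown in the source.
interface Matcher {
  id: string;
  pattern: RegExp; // no "g" flag, so RegExp.test() stays stateless
}

interface Candidate {
  file: string;
  line: number; // 1-based line number
  matcherId: string;
  snippet: string;
}

// Illustrative rules only, not deepsec's actual rule set.
const matchers: Matcher[] = [
  { id: "child-process-exec", pattern: /\bexec(Sync)?\s*\(/ },
  { id: "raw-sql-template", pattern: /\bquery\s*\(\s*`/ },
];

// Pure regex pass over one file: no AI calls, just candidate collection.
function scanFile(file: string, source: string, rules: Matcher[]): Candidate[] {
  const candidates: Candidate[] = [];
  source.split("\n").forEach((text, index) => {
    for (const rule of rules) {
      if (rule.pattern.test(text)) {
        candidates.push({
          file,
          line: index + 1,
          matcherId: rule.id,
          snippet: text.trim(),
        });
      }
    }
  });
  return candidates;
}
```

A pass like this is cheap because it never leaves the process; only the candidates it produces would be handed to the more expensive process stage.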

Runtime Resilience & Distributed Execution

Idempotent design: Each stage reads/writes a consistent disk representation; re-runs merge new information rather than overwriting, with a purely append-only merge model
Incremental recovery: Re-running the same command after interruption automatically skips already-analyzed files and resumes from the checkpoint
Concurrency safety: Atomic file claiming via lockedByRunId, enabling conflict-free parallel multi-worker execution
Distributed execution: Fan-out to massive parallelism via Vercel Sandbox micro-VMs (Vercel itself has scaled to 1000+ concurrent sandboxes); in sandbox mode, API keys are injected externally and network egress is restricted to coding agent hosts
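
The claiming, resume, and append-only merge behavior described above can be sketched in memory as follows. Only the `lockedByRunId` field name comes from the text; the `FileRecord` shape, `tryClaim`, `mergeFindings`, and `runWorker` are hypothetical names, and the sketch is single-threaded. A real multi-worker setup would need an atomic primitive on disk for the check-and-set step.

```typescript
// Hypothetical record shape; only lockedByRunId is named in the text above.
interface FileRecord {
  path: string;
  lockedByRunId?: string;
  analyzed: boolean;
  findings: string[];
}

// Claim a record for one run. Already-analyzed files and files locked by a
// different run are skipped; a resumed run may re-claim its own lock.
function tryClaim(rec: FileRecord, runId: string): boolean {
  if (rec.analyzed) return false;
  if (rec.lockedByRunId !== undefined && rec.lockedByRunId !== runId) return false;
  rec.lockedByRunId = runId;
  return true;
}

// Append-only merge: existing findings are kept, unseen ones are added.
function mergeFindings(rec: FileRecord, incoming: string[]): void {
  for (const finding of incoming) {
    if (!rec.findings.includes(finding)) rec.findings.push(finding);
  }
  rec.analyzed = true;
}

// One worker pass: claim what is available, analyze it, merge the results.
function runWorker(
  records: FileRecord[],
  runId: string,
  analyze: (path: string) => string[],
): string[] {
  const processed: string[] = [];
  for (const rec of records) {
    if (!tryClaim(rec, runId)) continue;
    mergeFindings(rec, analyze(rec.path));
    processed.push(rec.path);
  }
  return processed;
}
```

Calling `runWorker` a second time with the same records processes nothing new, which is the property that makes re-running after an interruption safe in this model.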

CLI Commands

Command	Function
`scan`	Regex matching of candidate security-sensitive sites
`process`	AI deep investigation, producing findings + suggestions
`process --diff`	PR mode: scan only diff-changed files, suitable for CI gating
`triage`	Lightweight P0/P1/P2 classification (using cheaper models)
`revalidate`	Second review of existing findings
`enrich`	Add git committer info + ownership data
`report`	Single-project Markdown + JSON summary
`export`	Per-finding JSON or Markdown file directory
`metrics`	Cross-project statistics: severity, vulnerability types, true positive rate
`sandbox <cmd>`	Execute any command on Vercel Sandbox micro-VMs

Plugin System

The plugin system exposes six extension points, defined in packages/core/src/plugin.ts: matchers, notifiers, agents, ownership, people, and executor. Among other things, these allow custom scan matchers and notifiers.
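
As a rough illustration of what a plugin contributing a custom matcher and notifier might look like, consider the sketch below. Every type and field name here (`Plugin`, `MatcherPlugin`, `NotifierPlugin`, `Finding`, `notify`) is an assumption made for the example; the actual definitions in packages/core/src/plugin.ts are not reproduced in the source and may differ substantially.

```typescript
// Hypothetical shapes for illustration; the real types live in
// packages/core/src/plugin.ts and may look quite different.
interface Finding {
  file: string;
  severity: "P0" | "P1" | "P2";
  title: string;
}

interface MatcherPlugin {
  id: string;
  pattern: RegExp;
}

interface NotifierPlugin {
  id: string;
  // Returns the messages it would send, so the sketch stays side-effect free.
  notify(findings: Finding[]): string[];
}

// Only two of the six extension points are modeled here.
interface Plugin {
  name: string;
  matchers?: MatcherPlugin[];
  notifiers?: NotifierPlugin[];
}

const examplePlugin: Plugin = {
  name: "example-org-rules",
  matchers: [{ id: "eval-call", pattern: /\beval\s*\(/ }],
  notifiers: [
    {
      id: "p0-only",
      notify: (findings) =>
        findings
          .filter((f) => f.severity === "P0")
          .map((f) => `[P0] ${f.file}: ${f.title}`),
    },
  ],
};
```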

AI Provider Configuration

Credentials are resolved in priority order, from highest to lowest:

Explicit environment variables: ANTHROPIC_AUTH_TOKEN + ANTHROPIC_BASE_URL or OpenAI equivalents
Vercel AI Gateway (recommended for production): AI_GATEWAY_API_KEY=vck_..., one key covering both Claude and Codex with default quotas tuned for high concurrency
Local fallback: Automatically uses logged-in claude/codex subscriptions
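
The priority order above can be expressed as a small resolution function. This is a sketch for the Claude side only, assuming that the explicit path requires both ANTHROPIC_AUTH_TOKEN and ANTHROPIC_BASE_URL to be set; `resolveClaudeSource` and the `ProviderSource` labels are hypothetical names, not part of deepsec.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical labels for the three credential sources listed above.
type ProviderSource = "explicit-env" | "ai-gateway" | "local-subscription";

// Walk the priority list from highest to lowest and return the first match.
function resolveClaudeSource(env: Env): ProviderSource {
  // Assumption: both variables must be present for the explicit path.
  if (env["ANTHROPIC_AUTH_TOKEN"] && env["ANTHROPIC_BASE_URL"]) {
    return "explicit-env";
  }
  if (env["AI_GATEWAY_API_KEY"]) {
    return "ai-gateway";
  }
  // Nothing configured: fall back to the locally logged-in subscription.
  return "local-subscription";
}
```

In this model, setting explicit variables always wins over a gateway key, which matches the stated order and keeps per-provider overrides possible even when a gateway key is present.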

Quick Start

npx deepsec init       # Create .deepsec/ directory
cd .deepsec
pnpm install
pnpm deepsec scan
pnpm deepsec process
pnpm deepsec revalidate   # Optional
pnpm deepsec export --format md-dir --out ./findings

Use Cases & Known Limitations

Large monorepo full security audits (used in production by Vercel itself, Unkey, and dub.co)
PR-level security gating and CI integration
Hidden vulnerability mining in legacy code
Best suited for application and service codebases; library and framework projects require custom prompts and scanners, and applicability there is unverified
A single large-scale scan may cost thousands to tens of thousands of dollars, and no precise cost formula is available
Distributed execution depends heavily on Vercel Sandbox; alternatives for non-Vercel infrastructure are not documented
npm package publication status is unconfirmed (the GitHub Releases and Packages pages both appear empty, so the project may be pre-release)
