
Sparrow

Added May 4, 2026
Agent & Tooling
Open Source
Python · Workflow Automation · Large Language Models · Multimodal · Agent & Tooling · Model & Inference Framework · Automation, Workflow & RPA · Computer Vision & Multimodal · Finance

Production-ready structured data extraction system supporting Vision LLMs and pluggable workflow orchestration for invoices, bank statements, financial tables, and more.

Overview

Sparrow is a production-ready structured data extraction system developed by Katana ML (lead developer Andrej Baranovskij). It focuses on converting unstructured documents — invoices, receipts, bank statements, financial tables, and more — into structured JSON data with high precision and automation.

Architecture

The project uses a monorepo multi-app architecture with three top-level directories:

  • sparrow-data/: Sample test data (bonds_table.png, bank_statement.pdf, invoice.pdf, etc.)
  • sparrow-ml/: Core computation engine, containing llm/ (main API server & CLI) and agents/ (workflow orchestration service)
  • sparrow-ui/: Gradio-based interactive web interface

Pipeline Engines

  • Sparrow Parse: Vision LLM-based visual structured extraction pipeline, supporting Mistral, Qwen 2.5-VL, DeepSeek OCR, dots.ocr, dots-mocr, and more
  • Sparrow Instructor: Text LLM-based instruction processing, validation, and decision pipeline (supporting GPT-OSS, Mistral, Qwen 3.5, etc.)
  • Sparrow Agents: Prefect-orchestrated multi-step processing pipeline that decomposes complex scenarios into subtasks like classification → extraction → validation
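The classification → extraction → validation decomposition described for Sparrow Agents can be pictured with a small plain-Python sketch. This is not Sparrow's or Prefect's code: the `StepResult` class, the three step functions, and the field names `invoice_number` and `total` are all hypothetical, and the toy keyword and "key: value" logic merely stands in for the LLM calls a real agent would make.

```python
from dataclasses import dataclass, field


@dataclass
class StepResult:
    """Accumulates what each stage learned about one document."""
    doc_type: str = "unknown"
    fields: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)


def classify(text: str, result: StepResult) -> StepResult:
    # Toy keyword classifier standing in for an LLM classification call.
    result.doc_type = "invoice" if "invoice" in text.lower() else "other"
    return result


def extract(text: str, result: StepResult) -> StepResult:
    # Toy "key: value" parser standing in for a Vision LLM extraction.
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            result.fields[key.strip().lower()] = value.strip()
    return result


def validate(text: str, result: StepResult) -> StepResult:
    # Record an error for every required field that is missing.
    required = ("invoice_number", "total")  # hypothetical field names
    for name in required:
        if name not in result.fields:
            result.errors.append(f"missing field: {name}")
    return result


def run_pipeline(text: str) -> StepResult:
    # Run the subtasks in order, passing the shared result along.
    result = StepResult()
    for step in (classify, extract, validate):
        result = step(text, result)
    return result


demo = run_pipeline("Invoice\ninvoice_number: A-17\ntotal: 120.50")
print(demo.doc_type, demo.fields, demo.errors)
```

In an orchestrated setup each step function would typically become a tracked task, which is what makes per-step retries and dashboards possible.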

Inference Backends

  • Apple Silicon: Deep MLX framework integration that uses unified memory to run large models efficiently (installed via the sparrow-parse[mlx] dependency package)
  • NVIDIA/AMD GPU: Compatible with vLLM and Ollama backends; NVIDIA GPUs require a CUDA environment
  • Cloud & CPU: Supports a Hugging Face cloud GPU backend, or local execution of small models (≤7B parameters)

Document Processing

  • Supports PNG, JPG images and multi-page PDF input
  • Natively adapts to invoices, receipts, forms, bank statements, financial tables, and other document types
  • The --crop-size parameter crops document borders so the Vision LLM focuses on the page content
  • JSON Schema-driven output constraints with automatic validation
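To make the idea of schema-constrained output with automatic validation concrete, here is a minimal standard-library sketch. It is an illustration only, not Sparrow's validator: the `SCHEMA` mapping format and the `validate_record` helper are hypothetical, and the field names are borrowed from the bond-table use case on this page.

```python
import json

# Hypothetical schema format: field name -> accepted Python type(s).
SCHEMA = {"instrument_name": str, "valuation": (int, float)}


def validate_record(record: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, expected in schema.items():
        if name not in record:
            problems.append(f"{name}: missing")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(record[name], bool) or not isinstance(record[name], expected):
            problems.append(f"{name}: wrong type {type(record[name]).__name__}")
    # Flag fields the schema does not know about, in a stable order.
    for name in sorted(record.keys() - schema.keys()):
        problems.append(f"{name}: unexpected field")
    return problems


raw = '{"instrument_name": "Bond A", "valuation": "n/a"}'
issues = validate_record(json.loads(raw), SCHEMA)
print(issues)
```

Returning a list of problems rather than raising on the first one lets a caller report every failing field in a single pass.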

Usage

CLI

./sparrow.sh "<JSON_SCHEMA>" --pipeline "<PIPELINE>" [OPTIONS] --file-path "<FILE>"

Key parameters: --pipeline (select pipeline), --options (backend & model), --instruction (text instruction), --validation (field validation), --crop-size (crop pixels), --page-type (page classification).
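When the CLI is driven from a script, it can help to assemble the argument list programmatically instead of concatenating strings. The sketch below only builds and prints the command and does not run it. The `build_sparrow_command` helper is hypothetical, as are the example values (the schema contents, the pipeline name `sparrow-parse`, the `mlx` option, and the crop size), and repeating `--options` once per value is an assumption rather than documented behavior.

```python
import json
import shlex


def build_sparrow_command(schema, pipeline, file_path, options=None, crop_size=None):
    """Assemble an argv list for ./sparrow.sh from the flags listed above."""
    argv = ["./sparrow.sh", json.dumps(schema), "--pipeline", pipeline]
    for opt in options or []:
        # Assumption: --options is passed once per value (backend, model, ...).
        argv += ["--options", opt]
    if crop_size is not None:
        argv += ["--crop-size", str(crop_size)]
    argv += ["--file-path", file_path]
    return argv


# All values below are illustrative placeholders.
argv = build_sparrow_command(
    {"instrument_name": "str"},
    "sparrow-parse",
    "bonds_table.png",
    options=["mlx"],
    crop_size=60,
)
print(shlex.join(argv))  # shell-quoted form, safe to paste into a terminal
```

Passing the list itself to `subprocess.run(argv)` would avoid shell quoting issues with the JSON schema argument.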

RESTful API

  • Document extraction: POST /api/v1/sparrow-llm/inference (multipart/form-data)
  • Text instruction: POST /api/v1/sparrow-llm/instruction-inference (form-urlencoded)
  • Access http://localhost:8002/api/v1/sparrow-llm/docs for Swagger interactive docs after startup
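As a rough illustration of calling the form-urlencoded text instruction endpoint, the sketch below prepares a request with the Python standard library without sending it. The URL path comes from the list above, but the form field names (`instruction`, `payload`) and the `build_instruction_request` helper are hypothetical; the Swagger docs are the place to confirm the real parameter names.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8002/api/v1/sparrow-llm"


def build_instruction_request(form_fields: dict) -> Request:
    """Prepare (but do not send) a form-urlencoded POST request."""
    body = urlencode(form_fields).encode("utf-8")
    return Request(
        f"{BASE_URL}/instruction-inference",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Field names here are hypothetical placeholders, not the documented API.
req = build_instruction_request(
    {"instruction": "Summarize the payload", "payload": "2 + 2"}
)
print(req.get_method(), req.full_url)
# To send it once the server is running: urllib.request.urlopen(req)
```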

Agent API

curl -X POST 'http://localhost:8001/api/v1/sparrow-agents/execute/file' \
  -F 'agent_name=medical_prescriptions' \
  -F 'extraction_params={"sparrow_key":"123456"}' \
  -F 'file=@prescription.pdf'
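For callers that prefer Python to curl, the same multipart request can be approximated with the standard library alone. This is a sketch under assumptions: the `build_agent_request` helper is hypothetical, the file part is sent as `application/octet-stream`, and the request is only prepared, not sent. The URL and the three form fields mirror the curl call above.

```python
import json
import uuid
from urllib.request import Request

AGENT_URL = "http://localhost:8001/api/v1/sparrow-agents/execute/file"


def build_agent_request(agent_name, extraction_params, filename, file_bytes):
    """Prepare (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields: agent_name and the JSON-encoded extraction_params.
    text_fields = (
        ("agent_name", agent_name),
        ("extraction_params", json.dumps(extraction_params)),
    )
    for name, value in text_fields:
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(part.encode("utf-8"))
    # File field: headers, raw bytes, then the trailing CRLF.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing boundary
    return Request(
        AGENT_URL,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Placeholder bytes stand in for the contents of prescription.pdf.
agent_req = build_agent_request(
    "medical_prescriptions", {"sparrow_key": "123456"}, "prescription.pdf", b"%PDF-demo"
)
print(agent_req.get_method(), agent_req.full_url, len(agent_req.data), "bytes")
```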

Use Cases

  • Financial document automation: Extract full structured data from invoices and bank statements in PDF/image format
  • Financial table extraction: Extract fields like instrument_name and valuation from bond tables
  • Medical prescription processing: Multi-step workflow via Agent orchestration (classification → extraction → validation)
  • Text instruction processing: Math operations, text analysis, summarization, Q&A
  • Function Calling integration: External data access such as stock data queries

Enterprise Features

  • API-First design with complete RESTful API and Swagger documentation
  • Prefect-based dashboard and agent workflow tracking (the dashboard requires a local Oracle Database 23ai Free instance)
  • Built-in rate limiting and usage analytics
  • Open-sourced under GPL-3.0 with commercial dual-licensing available

Installation Requirements

Python 3.12.10 or later; macOS, Linux, or Windows; and a GPU with enough VRAM for the chosen model. PDF processing requires poppler.
