DISCOVER THE FUTURE OF AI AGENTS

All Projects

25 projects

Skyvern

An AI Agent platform that automates browser workflows using Vision LLMs, extending Playwright with natural language commands for web automation, workflow orchestration, and structured data extraction.

Model & Inference Framework, Multimodal, AI Agents

Second Brain

A local-first agentic framework acting as a personal operating system, leveraging file intelligence, event-driven workflow automation, and LLMs for cross-modal task execution and multi-platform interaction.

Other, Multimodal, RAG

Peekaboo

A macOS GUI automation tool powered by AI vision: it captures screenshots, detects UI elements via multi-provider LLMs, and executes clicks, typing, and scrolls through natural language or scripted workflows.

Model & Inference Framework, Multimodal, Model Context Protocol

OpenMontage

The first open-source, agentic video production system with 12 structured pipelines and 52 production tools, enabling end-to-end video creation via natural language inside AI coding assistants.

Natural Language Processing, Multimodal, AI Agents

Sparrow

Production-ready structured data extraction system supporting Vision LLMs and pluggable workflow orchestration for invoices, bank statements, financial tables, and more.

Model & Inference Framework, Large Language Models, Multimodal

NodeTool

A node-based visual AI workflow and LLM Agent builder with local model support and multimodal orchestration across desktop, web, CLI, and mobile.

Model & Inference Framework, Large Language Models, Multimodal

OpenOmniBot

An on-device Android AI assistant powered by a vision-language model (VLM), supporting local model inference and screen-level automated interaction.

Model & Inference Framework, Multimodal, Model Context Protocol

Ghost OS

Full computer-use system for AI agents on macOS, exposing 29 MCP tools for structured perception, visual grounding, synthetic input, and self-learning Recipe workflows.

Docs, Tutorials & Resources; Multimodal; Model Context Protocol

FireRed-OpenStoryline

An MCP-based AI video editing agent that enables full-pipeline video creation, from material search and script generation to final rendering, through natural language interaction.

Multimodal, Natural Language Processing, Model Context Protocol

RLinf

Flexible and scalable reinforcement learning training infrastructure for embodied and agentic AI post-training, decoupling logical workflow composition from efficient physical execution via the M2Flow paradigm.

Multimodal, AI Agents, Reinforcement Learning

Page 1 / 3 · 25 total
