DISCOVER THE FUTURE OF AI AGENTS

All Projects

17 projects

Peekaboo

macOS GUI automation tool powered by AI vision. It captures screenshots, detects UI elements via multi-provider LLMs, and executes clicks, keystrokes, and scrolls through natural language or scripted workflows.

Model & Inference Framework · Multimodal · Model Context Protocol

vllm-mlx

A vLLM-style inference server for Apple Silicon with a native MLX backend, exposing both OpenAI- and Anthropic-compatible APIs in a single process, featuring unified multimodal serving, continuous batching, paged KV cache, and SSD-tiered caching.

Multimodal · Large Language Models · Python

OpenMontage

Described as the first open-source agentic video production system, with 12 structured pipelines and 52 production tools, enabling end-to-end video creation via natural language inside AI coding assistants.

Natural Language Processing · Multimodal · AI Agents

Rapid-MLX

A local AI inference engine for Apple Silicon with an OpenAI-compatible API, supporting multimodal inference, tool calling, and smart cloud routing.

AI Agents · Large Language Models · Model Context Protocol

npcpy

A Python library providing key functional primitives for research in multimodal language models, agentic AI, and knowledge graphs, featuring unified model invocation, multi-agent collaboration and debate, knowledge graph lifecycle management, and multimodal generation.

Model & Inference Framework · Large Language Models · Multimodal

NodeTool

A node-based visual builder for AI workflows and LLM agents, with local model support and multimodal orchestration across desktop, web, CLI, and mobile.

Model & Inference Framework · Large Language Models · Multimodal

AutoRound

An advanced post-training quantization toolkit for LLMs and VLMs by Intel, leveraging SignRound optimization to support 2–4 bit weight quantization and automatic mixed-precision scheme generation across Intel CPU/GPU, NVIDIA GPU, and Habana Gaudi.

Multimodal · Large Language Models · Transformers

mlx-openai-server

A high-performance OpenAI-compatible API server for MLX models on Apple Silicon, supporting text, vision, audio transcription, and image generation/editing.

Deep Learning · Large Language Models · Multimodal

RCLI

A fully on-device voice AI assistant for macOS on Apple Silicon, integrating STT, LLM, TTS, VLM, RAG, and system control with zero cloud dependency.

Model & Inference Framework · Large Language Models · Multimodal

Roboflow Trackers

A plug-and-play multi-object tracking (MOT) Python library offering modular implementations of classic algorithms such as SORT and ByteTrack. Its detector-agnostic design works with any object detection model (YOLO, DETR, etc.) and supports video files, cameras, RTSP streams, and more. It provides unified CLI tools and a Python API with built-in evaluation metrics (CLEAR, HOTA, Identity).

Multimodal · Deep Learning · SDK

Page 1 / 2 · 17 total
