All Projects

22 projects

vllm-mlx

A vLLM-style inference server for Apple Silicon with a native MLX backend. It exposes OpenAI- and Anthropic-compatible APIs from a single process and features unified multimodal serving, continuous batching, a paged KV cache, and SSD-tiered caching.

Multimodal · Large Language Models · Python
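Because the server speaks the standard OpenAI chat-completions format, a plain request body works against the local endpoint. A minimal sketch of building such a request — the base URL, endpoint paths, and model name are assumptions, not taken from the project:

```python
import json

# Hypothetical local deployment; adjust host, port, and model to taste.
BASE_URL = "http://localhost:8000"

def chat_request(model, prompt):
    """Build a standard OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

payload = chat_request("mlx-community/Llama-3.2-3B-Instruct-4bit", "Hello!")
body = json.dumps(payload)
# With the server running, POST `body` to f"{BASE_URL}/v1/chat/completions"
# (OpenAI style) or an Anthropic-style messages endpoint in the same process.
```

The single-process design means both API dialects hit the same batching engine and KV cache, so clients written for either SDK can share one local model.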

Rapid-MLX

A local AI inference engine for Apple Silicon with an OpenAI-compatible API, supporting multimodal inputs, tool calling, and smart cloud routing.

AI Agents · Large Language Models · Model Context Protocol

AutoRound

An advanced post-training quantization toolkit for LLMs and VLMs by Intel, leveraging SignRound optimization to support 2–4 bit weight quantization and automatic mixed-precision scheme generation across Intel CPU/GPU, NVIDIA GPU, and Habana Gaudi.

Multimodal · Large Language Models · Transformers
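To make the quantization part concrete: low-bit weight quantization maps each float weight onto a small integer grid via a scale factor. A toy sketch of 4-bit symmetric quantization in plain Python — this illustrates the underlying round-to-grid idea, not AutoRound's actual API, and SignRound's contribution is learning a better rounding than the naive `round()` used here:

```python
def quantize_4bit(weights):
    """Map floats to 4-bit signed integers [-8, 7] with a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # avoid zero scale
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integer grid."""
    return [x * scale for x in q]

w = [0.12, -0.55, 0.31, 0.02]
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
# w_hat approximates w to within half a quantization step.
```

Mixed-precision scheme generation, as described above, amounts to choosing a different bit width per layer when a uniform low-bit grid would lose too much accuracy.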

vLLM-Omni

A fully disaggregated multimodal model inference and serving framework that extends vLLM to support any-to-any modality unified inference and high-performance deployment.

Deep Learning · Multimodal · FastAPI

mlx-openai-server

A high-performance OpenAI-compatible API server for MLX models on Apple Silicon, supporting text, vision, audio transcription, and image generation/editing.

Deep Learning · Large Language Models · Multimodal

RLinf

A flexible, scalable reinforcement-learning training infrastructure for embodied and agentic AI post-training that decouples logical workflow composition from efficient physical execution via the M2Flow paradigm.

Multimodal · AI Agents · Reinforcement Learning

Roboflow Trackers

A plug-and-play multi-object tracking (MOT) Python library offering modular implementations of classic algorithms like SORT and ByteTrack. Features a detector-agnostic design compatible with any object detection model (YOLO, DETR, etc.), supporting video files, cameras, RTSP streams, and more. Provides unified CLI tools and Python API with built-in evaluation metrics (CLEAR, HOTA, Identity).

Multimodal · Deep Learning · SDK
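The detector-agnostic design works because SORT-style trackers only need box overlap to associate new detections with existing tracks. A minimal sketch of the IoU scoring at the heart of that association (boxes as `(x1, y1, x2, y2)`; this is standard MOT math, not the library's own API):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# An existing track box and a fresh detection from any detector (YOLO, DETR, ...):
track = (10, 10, 50, 50)
det = (20, 20, 60, 60)
score = iou(track, det)  # pairs above a threshold are matched to the same ID
```

Since the association step consumes only boxes, any detector that emits them can feed the tracker, which is what makes the library plug-and-play.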

MiniCPM-o

An end-side (on-device) omnimodal LLM by Tsinghua THUNLP supporting vision, speech, and full-duplex multimodal live streaming, optimized for mobile deployment with performance rivaling Gemini 2.5 Flash.

Large Language Models · Multimodal · Transformers

Vision-Agents

An open-source framework by Stream for building vision AI agents that work with any model or video provider, leveraging Stream's edge network for ultra-low latency video experiences.

Agent & Tooling · Python · PyTorch

my-neuro

A customizable AI desktop companion project with character settings, voice conversations, long-term memory capabilities, and sub-1-second response times. Integrates with Live2D models for visual presentation.

Agent & Tooling · Python · Electron

Page 1 / 3 · 22 total
