Agent Park - Agent Project Navigator

All Projects

12 projects

verl

A flexible, efficient, and production-ready post-training reinforcement learning framework for LLMs

Other · Deep Learning · Multimodal

vLLM-Omni

A fully disaggregated multimodal model inference and serving framework that extends vLLM to support any-to-any modality unified inference and high-performance deployment.

Deep Learning · Multimodal · FastAPI

mlx-openai-server

A high-performance OpenAI-compatible API server for MLX models on Apple Silicon, supporting text, vision, audio transcription, and image generation/editing.

Deep Learning · Large Language Models · Multimodal

Roboflow Trackers

A plug-and-play multi-object tracking (MOT) Python library offering modular implementations of classic algorithms like SORT and ByteTrack. Features a detector-agnostic design compatible with any object detection model (YOLO, DETR, etc.), supporting video files, cameras, RTSP streams, and more. Provides unified CLI tools and Python API with built-in evaluation metrics (CLEAR, HOTA, Identity).

Multimodal · Deep Learning · SDK

WiFi DensePose

A production-ready implementation of InvisPose that enables real-time, camera-free full-body tracking through walls using commodity WiFi mesh routers and CSI signals, with advanced analytics like fall detection and multi-person tracking.

Multimodal · Deep Learning · Docker

VibeVoice

Microsoft's family of open-source frontier voice AI models including both Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) models, designed for long-form audio processing with multilingual support.

Model & Inference Framework · PyTorch · Python

Speech-AI-Forge

A project focused on TTS generation models, providing an API server and Gradio-based WebUI with support for multiple voice synthesis, voice cloning, and audio enhancement capabilities.

Model & Inference Framework · Python · Gradio

Embodied_AI_Paper_List

A curated list of embodied AI research papers maintained by the Human Communication and Perception Laboratory at SYSU, providing researchers with the latest academic findings in the embodied intelligence field.

Docs, Tutorials & Resources · Python · Multimodal

DeepVideoDiscovery

A video content discovery tool developed by Microsoft that uses deep learning technology to automatically identify and extract key content from videos, helping users efficiently browse and understand video information.

Agent & Tooling · Python · PyTorch

LLaVA-Plus

LLaVA-Plus is a multimodal assistant system that learns to use tools, combining large language models with visual capabilities to enable AI agents to perform general vision tasks.

Model & Inference Framework · Python · PyTorch

Page 1 / 2 · 12 total
