PromptHub
✨An open-source, free all-in-one workspace for AI Prompt and Skill management, featuring versioned Prompt editing, multi-platform Skill distribution, multi-model parallel testing, and local-first data synchronization.
A full-stack LLM development toolkit from NVIDIA covering synthetic data generation, multi-backend inference, model training, and 11-category benchmark evaluation, scaling from single GPU to tens-of-thousands-GPU Slurm clusters.
A library for building RL training environments for LLMs, providing complete infrastructure from development and testing to scaled rollout collection, with built-in RLVR scenarios and tool-calling support.
The Open Source AI Engineering Platform for Agents, LLMs & Models, providing experiment tracking, model registry, LLM observability, evaluation, prompt optimization, and a unified AI gateway.
A self-learning vector database integrating GNN-driven search optimization, local LLM inference, Cypher graph queries, and a PostgreSQL vector extension, deployable from WASM embeddings to Raft-distributed clusters.
A Rust-based cross-platform CLI tool that right-sizes LLM models to your system's RAM, CPU, and GPU by detecting specs and recommending optimal models and quantization strategies. Covers 206 models from 57 providers.
A curated list of free LLM inference APIs, covering rate limits, model lists, and special requirements for major platforms like OpenRouter, Google AI Studio, Groq, and Cerebras. Ideal for developers in the prototyping phase.
A minimal, hackable experimental harness for training LLMs on a single GPU node, covering all stages from pretraining to a ChatGPT-like UI.
An open-source framework by Stream for building vision AI agents that work with any model or video provider, leveraging Stream's edge network for ultra-low latency video experiences.
AirLLM optimizes inference memory usage, enabling 70B large language models to run on a single 4GB GPU card without quantization, distillation, or pruning. It now also supports running 405B Llama 3.1 models on 8GB of VRAM.