BullshitBench
A benchmark measuring whether AI models challenge nonsensical prompts rather than confidently answering them, featuring 100 questions across 5 domains with a 3-tier judgment system and multi-judge panel.
A local-first personal AI agent framework from Stanford that enables offline agent orchestration, skill import, and trace-driven continuous learning through five composable primitives, supporting 10+ inference backends and four interaction modes.
A local-first AI research assistant featuring multi-LLM support, 20+ research strategies, multi-search-engine integration, and automated quality scoring for 212K+ academic sources, producing citation-backed PDF/Markdown reports via CLI, Web UI, REST API, or MCP Server.
The Swiss Army Knife of On-Device AI: an offline-first mobile AI suite covering text generation, image generation, vision Q&A, and speech-to-text, with an on-device RAG knowledge base and tool-calling support.
Nanobrowser is a lightweight browser tool designed to provide a simple and efficient web browsing experience.
ingraind is a security monitoring agent built around RedBPF for complex containerized environments and endpoints, using eBPF probes to provide safe and performant instrumentation for any Linux-based environment.
DongTai-agent-java is the data acquisition agent for DongTai IAST. It uses dynamic hooks to collect method invocation data from Java applications at runtime, enabling security vulnerability detection and analysis.