BullshitBench
A benchmark measuring whether AI models challenge nonsensical prompts rather than confidently answering them, featuring 100 questions across 5 domains with a 3-tier judgment system and multi-judge panel.
A local-first AI research assistant featuring multi-LLM support, 20+ research strategies, multi-search-engine integration, and automated quality scoring for 212K+ academic sources, producing citation-backed PDF/Markdown reports via CLI, Web UI, REST API, or MCP Server.
A curated collection of extracted system prompts from popular chatbots like ChatGPT, Claude, and Gemini, helping researchers understand the behavior boundaries of AI models.
ARGO is an open-source AI agent platform that brings local large language models to your desktop with one-click downloads, offline-first RAG knowledge bases, and multi-agent collaboration, keeping 100% of your data local and secure.
A comprehensive knowledge base containing ChatGPT jailbreaks, prompt leaks, prompt injections, super prompts, and AI security attack/defense resources, ideal for AI researchers and developers.