A self-evolving multi-agent AI scientist framework for end-to-end scientific discovery, from idea generation to paper publication.
## Overview
EvoScientist is a multi-agent collaborative AI research automation framework covering the full scientific research lifecycle: Intake → Plan → Execute → Evaluate → Write → Verify. Its core design lets AI agents autonomously explore, generate insights, and iteratively improve, while accumulating research experience through persistent memory and self-evolution mechanisms.
## Multi-Agent Collaboration
- 6 sub-agents: planner-agent, research-agent, code-agent, debug-agent, data-analysis-agent, writing-agent, coordinated via a shared LangGraph state machine
- Evolution Manager Agent (EMA): extracts reusable knowledge from historical interactions to continuously optimize research strategies
- The paper (arXiv:2603.08127) describes a 3-agent architecture (RA/EA/EMA); the code has since expanded to 6 sub-agents, and the exact mapping between the two is not yet documented
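The coordination pattern above can be sketched as a shared-state pipeline. This is a minimal pure-Python stand-in for the LangGraph state machine, not EvoScientist's actual code; the agent names follow the list above, but the handler bodies are placeholders:

```python
# Each sub-agent reads and writes one shared state dict, so every agent
# sees the output of all agents that ran before it. Handler logic here is
# illustrative only.

def planner_agent(state):
    state["plan"] = f"plan for: {state['idea']}"
    return state

def code_agent(state):
    state["code"] = "experiment.py"   # placeholder for generated code
    return state

def writing_agent(state):
    state["paper"] = f"draft based on {state['plan']}"
    return state

PIPELINE = [planner_agent, code_agent, writing_agent]

def run(idea):
    state = {"idea": idea}
    for agent in PIPELINE:
        state = agent(state)
    return state

result = run("sparse attention for long documents")
```

In the real framework, LangGraph also supports conditional edges (e.g. routing from code-agent to debug-agent on failure) rather than this fixed linear order.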
## Persistent Memory System
- Ideation Memory: summarizes viable research directions, records failed directions to avoid repeated exploration
- Experimentation Memory: captures effective data processing and model training strategies
- Context, preferences, and experimental findings persist across sessions (SQLite session database)
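A minimal sketch of what such a SQLite-backed memory could look like. The schema, table names, and `kind`/`outcome` labels are hypothetical (EvoScientist's actual storage format is not disclosed; see Pending Information below):

```python
import sqlite3

# Illustrative schema: one table holding both ideation and
# experimentation records. The real format is undocumented.
conn = sqlite3.connect(":memory:")  # the framework persists to a session DB file
conn.execute("""CREATE TABLE IF NOT EXISTS memory (
    kind    TEXT,   -- 'ideation' or 'experimentation'
    content TEXT,
    outcome TEXT    -- e.g. 'viable' or 'failed'
)""")

def remember(kind, content, outcome):
    conn.execute("INSERT INTO memory VALUES (?, ?, ?)", (kind, content, outcome))
    conn.commit()

def failed_directions():
    """Return previously failed ideas so they are not re-explored."""
    rows = conn.execute(
        "SELECT content FROM memory WHERE kind='ideation' AND outcome='failed'")
    return [r[0] for r in rows]

remember("ideation", "train on synthetic labels only", "failed")
remember("experimentation", "gradient clipping at 1.0 stabilizes runs", "viable")
```

Before proposing a new direction, the ideation stage would consult `failed_directions()` to skip dead ends from earlier sessions.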
## Scientific Workflow Engine
- 6-stage pipeline with a baseline-first, single-variable-iteration design for scientific rigor
- Sandboxed code execution: 300s timeout, output limits, auto-recovery, "More Effort" iterative refinement mode
- In-depth literature research: web search + 7-dimensional structured reflection, enforced citation standards
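The sandboxed execution step can be approximated with the standard library. This is a sketch only: the 300s timeout comes from the list above, but the output limit value is an assumption (the real cap is undocumented), and the actual sandbox adds isolation and auto-recovery beyond a plain subprocess:

```python
import subprocess
import sys

OUTPUT_LIMIT = 10_000  # bytes; illustrative value, the real limit is not documented

def run_sandboxed(code, timeout=300):
    """Run generated code with a hard timeout and a truncated-output cap."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout,  # matches the documented 300s default
        )
        out = proc.stdout[:OUTPUT_LIMIT].decode(errors="replace")
        return {"ok": proc.returncode == 0, "output": out}
    except subprocess.TimeoutExpired:
        return {"ok": False, "output": "timed out"}

result = run_sandboxed("print(6 * 7)")
```

On timeout or failure, the framework's "More Effort" mode would feed the result back to the debug-agent for another refinement pass.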
## Multi-Provider & Multi-Channel
- 9 LLM providers: Anthropic, OpenAI, Google, NVIDIA, SiliconFlow, OpenRouter, Volcengine, DashScope, Ollama/Custom
- CLI/TUI as core hub, supporting 10 messaging platforms: Telegram, Slack, Feishu, WeChat, DingTalk, iMessage, Discord, Email, QQ, Signal — all sharing the same agent session
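The "all channels share one session" design can be sketched as channel adapters feeding a single transcript. The class and its methods are hypothetical, illustrating the pattern rather than EvoScientist's API:

```python
# One AgentSession holds the shared transcript; every messaging channel
# (Telegram, Slack, etc.) routes into the same instance, so context set
# on one platform is visible on all others.

class AgentSession:
    def __init__(self):
        self.history = []  # (channel, text) pairs, shared across channels

    def handle(self, channel, text):
        self.history.append((channel, text))
        # placeholder reply; a real session would call the LLM here
        return f"[{len(self.history)} msgs] reply to {channel}"

session = AgentSession()
session.handle("telegram", "status of experiment 3?")
reply = session.handle("slack", "and the paper draft?")
```

Because both messages land in the same `history`, the Slack turn can refer back to context established over Telegram.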
## Adaptive Interaction
- Dynamically rewrites system prompts based on conversation state
- Shows only relevant tools per turn to reduce noise
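Both behaviors can be sketched together: pick the tool subset for the current stage, then rewrite the system prompt around it. The stage-to-tool map and tool names are hypothetical:

```python
# Illustrative mapping from pipeline stage to the tools surfaced that turn.
# Stage names follow the 6-stage pipeline; tool names are made up.
TOOLS_BY_STAGE = {
    "Execute": ["run_code", "read_file"],
    "Write":   ["edit_draft", "cite_paper"],
}

def build_turn(stage, base_prompt):
    """Return the per-turn system prompt and the filtered tool list."""
    tools = TOOLS_BY_STAGE.get(stage, [])
    # the prompt is regenerated each turn to mention only the active tools
    prompt = f"{base_prompt} Available tools: {', '.join(tools) or 'none'}."
    return prompt, tools

prompt, tools = build_turn("Write", "You are a research assistant.")
```

Hiding Execute-stage tools during Write (and vice versa) keeps the model from reaching for irrelevant actions mid-draft.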
## Plugins & Extensions
- MCP Protocol: dynamically add MCP servers via the `EvoSci mcp add` command
- EvoSkills: 10 pre-built research lifecycle skills (research-ideation, idea-tournament, experiment-pipeline, experiment-craft, paper-planning, paper-writing, paper-review, paper-rebuttal, academic-slides, evo-memory), compatible with Claude Code, Cursor, and other third-party AI coding agents
## Framework Dependencies
- LangChain — agent framework foundation
- DeepAgents — batteries-included agent harness
- LangGraph — state machine orchestration and multi-agent coordination
- MCP (Model Context Protocol) — external tool integration protocol
- Runtime: Python 3.11+ (< 3.14); Docker image includes Python 3.11, Node.js 24 LTS + npx
## Installation

```shell
# Recommended: install as a uv tool
uv tool install EvoScientist
EvoSci onboard

# Or: pip-style install via uv
uv pip install EvoScientist

# Or: Docker
docker run -it --rm --env-file .env \
  -v "$(pwd)/workspace:/workspace" \
  -v evosci-data:/home/evosci/.evoscientist \
  ghcr.io/evoscientist/evoscientist:latest
```
## Benchmark Performance & Academic Recognition
- 🏆 ICAIS 2025 AI Scientist Track: 6/6 papers accepted, Best Paper & AI Reviewer's Appraisal Award
- 🥇 DeepResearch Bench II: Ranked #1
- 🥇 DeepResearch Bench: Ranked #1
- 🥇 AstaBench Code & Execution: Ranked #1
- 🥇 AstaBench Data Analysis: Ranked #1
Note: specific benchmark scores are not publicly available; only rank positions at the time of submission are stated.
## Pending Information
- Exact EvoSkills installation command not explicitly documented
- Mapping between paper's 3-agent and code's 6-agent architectures not detailed
- Memory module storage format and capacity limits not disclosed
- Web UI not currently available; listed as upcoming in roadmap