Towards Self-Evolving AI Scientists for End-to-End Scientific Discovery
Spawn multiple AI agents in tmux split panes — each visible, each independent. The Queen coordinates tasks while Workers research, code, and debug simultaneously.
From user input to experimental output — every step orchestrated, every result verified.
From hypothesis to publication — each agent handles a dedicated stage of the experiment workflow.
6-phase workflow from intake to verification. Baseline-first design with one-variable iteration for scientific rigor.
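The baseline-first, one-variable idea can be illustrated with a short sketch. This is not the project's own code: the function name `one_variable_iteration`, the config keys, and the toy metric are all hypothetical, and a greedy search is only one plausible reading of "one-variable iteration". The baseline is always measured first, and every later trial differs from the current best configuration in exactly one setting, so any change in score can be attributed to that setting.

```python
from typing import Callable, Dict, List, Optional, Tuple

Config = Dict[str, object]
LogEntry = Tuple[str, Optional[object], float]


def one_variable_iteration(
    baseline: Config,
    candidates: Dict[str, List[object]],
    evaluate: Callable[[Config], float],
) -> Tuple[Config, float, List[LogEntry]]:
    """Measure the baseline, then change one variable per trial (greedy)."""
    best_config = dict(baseline)
    best_score = evaluate(best_config)  # the baseline is always run first
    log: List[LogEntry] = [("baseline", None, best_score)]

    for name, values in candidates.items():
        for value in values:
            trial = dict(best_config)
            trial[name] = value  # exactly one setting differs from the current best
            score = evaluate(trial)
            log.append((name, value, score))
            if score > best_score:
                best_config, best_score = trial, score

    return best_config, best_score, log


def toy_score(cfg: Config) -> float:
    # Hypothetical metric for illustration only; it peaks at lr=0.1, batch=32.
    return -abs(cfg["lr"] - 0.1) - abs(cfg["batch"] - 32) / 100


best, score, log = one_variable_iteration(
    baseline={"lr": 0.5, "batch": 8},
    candidates={"lr": [0.1, 0.01], "batch": [32, 64]},
    evaluate=toy_score,
)
```

The returned log keeps every trial, including the baseline, so each reported improvement can be traced back to the single change that produced it.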
Deep web search with structured reflection. Finds papers, methods, and baselines with enforced citation rigor.
Write, execute, and iteratively debug experiment code in a sandboxed workspace with a 300-second timeout and output limits.
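A minimal sketch of timeout-bounded execution with capped output, using only the Python standard library. The 300-second figure comes from the text above; the 10,000-character output cap and the `run_experiment` helper are assumptions made for illustration. A temporary directory plus a subprocess is not a real security sandbox, so this shows only the timeout and output-limit behavior, not the isolation the project may actually provide.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

TIMEOUT_SECONDS = 300      # stated in the text above
MAX_OUTPUT_CHARS = 10_000  # assumed value; the actual limit is not documented here


def run_experiment(
    code: str,
    timeout: float = TIMEOUT_SECONDS,
    max_chars: int = MAX_OUTPUT_CHARS,
) -> dict:
    """Run `code` in a throwaway directory, enforcing a timeout and output cap."""
    with tempfile.TemporaryDirectory() as workspace:
        script = Path(workspace) / "experiment.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workspace,        # keep relative file writes inside the workspace
                capture_output=True,
                text=True,
                timeout=timeout,      # the child is killed when this expires
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "timed_out": True, "stdout": "", "stderr": ""}
        return {
            "ok": proc.returncode == 0,
            "timed_out": False,
            "stdout": proc.stdout[:max_chars],  # truncate oversized output
            "stderr": proc.stderr[:max_chars],
        }
```

Returning `stderr` alongside the exit status is what makes an iterative debug loop possible: the traceback from a failed run can be handed back to the agent that wrote the code.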
Compute metrics, generate visualizations, and interpret results with statistical rigor and reproducibility.
Draft structured experimental reports and documentation with proper methodology sections and result summaries.
Coordinate up to 3 concurrent sub-agents with automatic task routing. Built on LangGraph for reliable state management.
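The concurrency cap can be sketched with plain `asyncio`. This deliberately does not use LangGraph and is not the project's implementation; `run_all`, `demo_worker`, and the task names are hypothetical. A semaphore allows at most three workers to run at once, and the remaining tasks wait for a free slot.

```python
import asyncio

MAX_CONCURRENT = 3  # the limit stated in the text above


async def run_all(tasks, worker, limit=MAX_CONCURRENT):
    """Run `worker` on every task, never exceeding `limit` at the same time."""
    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0  # highest number of simultaneous workers observed

    async def guarded(task):
        nonlocal running, peak
        async with semaphore:  # waits here while `limit` workers are busy
            running += 1
            peak = max(peak, running)
            try:
                return await worker(task)
            finally:
                running -= 1

    # gather preserves input order in its results
    results = await asyncio.gather(*(guarded(t) for t in tasks))
    return results, peak


async def demo_worker(task):
    # Stand-in for a sub-agent; a real worker would call a model or a tool.
    await asyncio.sleep(0.01)
    return f"done: {task}"


results, peak = asyncio.run(
    run_all(["search", "code", "debug", "analyze", "write"], demo_worker)
)
```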
Plug in any major LLM provider. Auto-detect model names or specify full IDs directly.
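One way such name resolution could work is sketched below. Everything here is an assumption: the `provider:model` form for full IDs, the prefix table, and the placeholder names `alpha-`, `beta-`, `provider_a`, and `provider_b` are hypothetical and do not reflect the project's actual configuration.

```python
from typing import Tuple

# Hypothetical prefix table; real provider and model names will differ.
PREFIX_TO_PROVIDER = {
    "alpha-": "provider_a",
    "beta-": "provider_b",
}


def resolve_model(name: str) -> Tuple[str, str]:
    """Return (provider, model) for a short model name or a full ID."""
    if ":" in name:
        # Full ID given explicitly (assumed "provider:model" form).
        provider, model = name.split(":", 1)
        return provider, model
    for prefix, provider in PREFIX_TO_PROVIDER.items():
        if name.startswith(prefix):
            return provider, name  # provider inferred from the name prefix
    raise ValueError(f"cannot infer a provider for model name {name!r}")
```

Raising an error for unknown names, rather than guessing a default provider, keeps a typo in a model name from silently routing requests to the wrong service.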
Built by the EvoScientist Team
Benchmarks · Web Interface · More agents
EvoScientist.ai@gmail.com