We're cooking something extraordinary

EvoScientist

Towards Self-Evolving AI Scientists for
End-to-End Scientific Discovery

v0.0.1-beta · MIT License
$ pip install EvoScientist

Agent Swarm Protocol

One Queen orchestrates.
Workers execute in parallel.

Spawn multiple AI agents in tmux split panes — each visible, each independent. The Queen coordinates tasks while Workers research, code, and debug simultaneously.

  • Pure tmux — no scripts, no daemons, no extra dependencies
  • Real-time observation of every agent in one terminal
  • Dynamic scaling — spawn or kill workers on demand
  • Compatible with Claude Code, Codex, Gemini, and more
tmux — swarm
Queen 👑
$ tmux new -s swarm
$ QUEEN=$(tmux display-message -p "#{pane_id}")
$ W1=$(tmux split-window -P -F "#{pane_id}")
$ W2=$(tmux split-window -P -F "#{pane_id}")
 
[@Queen👑] Assigning tasks...
[@researcher🐝] ready
[@coder🐝] ready
 
[@Queen👑] All workers active
researcher 🐝
[@Queen👑] Research SOTA for protein folding
Searching literature...
Found 12 relevant papers
Writing summary to ./swarm-output/
[@researcher🐝] Task complete
coder 🐝
[@Queen👑] Implement baseline model
$ python train.py --epochs 50
Epoch 50/50 ━━━━━━━━━━ 94.2%
[@coder🐝] Model saved
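
The session above can be driven from a script as well as by hand. The following is a minimal sketch, not EvoScientist's own code: the helper names are hypothetical, and only the tmux subcommands and flags (`split-window -P -F`, `send-keys -t`, `kill-pane -t`) are real. It builds the argument lists without running them, and it assumes each worker pane runs an agent CLI that reads typed input.

```python
import shlex

# Hypothetical helper names; only the tmux subcommands and flags are real.

def spawn_worker_cmd():
    """argv that splits the current window and prints the new pane's id."""
    return ["tmux", "split-window", "-P", "-F", "#{pane_id}"]


def send_task_cmd(pane_id, text):
    """argv that types `text` into the given pane and presses Enter."""
    return ["tmux", "send-keys", "-t", pane_id, text, "Enter"]


def kill_worker_cmd(pane_id):
    """argv that closes a worker pane."""
    return ["tmux", "kill-pane", "-t", pane_id]


# "%1" stands in for whatever pane id split-window actually prints.
for argv in (
    spawn_worker_cmd(),
    send_task_cmd("%1", "Research SOTA for protein folding"),
    kill_worker_cmd("%1"),
):
    print(shlex.join(argv))
```

To execute a command for real, pass the list to `subprocess.run(argv, capture_output=True, text=True)` inside a tmux session and read the pane id from its standard output.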

System Design

Agent pipeline in action

From user input to experimental output — every step orchestrated, every result verified.

User → CLI / API → Main Agent → planner-agent · research-agent · code-agent · debug-agent · data-analysis-agent · writing-agent → Results

Specialized Agents

Purpose-built for every phase
of the scientific process

From hypothesis to publication — each agent handles a dedicated stage of the experiment workflow.

Experiment Planning

6-phase workflow from intake to verification. Baseline-first design with one-variable iteration for scientific rigor.
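
As an illustration of baseline-first, one-variable iteration, the sketch below generates experiment configurations that each differ from the baseline in exactly one setting. The function name, the config keys, and the values are hypothetical and are not taken from EvoScientist.

```python
def one_variable_variants(baseline, alternatives):
    """Yield (changed_key, config) pairs that differ from baseline in one key."""
    for key, values in alternatives.items():
        for value in values:
            if value == baseline[key]:
                continue  # identical to the baseline, nothing to test
            variant = dict(baseline)  # copy so the baseline stays untouched
            variant[key] = value
            yield key, variant


# Hypothetical settings, for illustration only.
baseline = {"lr": 1e-3, "batch_size": 32, "optimizer": "adam"}
alternatives = {"lr": [1e-4, 1e-2], "optimizer": ["sgd"]}

# The baseline runs first; every later run changes a single variable.
runs = [("baseline", baseline)] + list(one_variable_variants(baseline, alternatives))
for changed, config in runs:
    print(changed, config)
```

Because each variant changes one key, any difference in results against the baseline can be attributed to that key alone.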

Literature Research

Deep web search with structured reflection. Finds papers, methods, and baselines with enforced citation rigor.

Code Generation & Debug

Write, execute, and iteratively debug experiment code in a sandboxed workspace with a 300-second timeout and output limits.
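
A rough sketch of the timeout and output cap, using only Python's standard library. The 300-second value comes from the description above; the function name, the output limit, and the returned fields are assumptions. A temporary directory plus a timeout is not real isolation, so a production sandbox would need more than this.

```python
import subprocess
import sys
import tempfile

TIMEOUT_S = 300            # from the description above
MAX_OUTPUT_CHARS = 10_000  # assumed; the page does not state the limit


def run_limited(argv, timeout_s=TIMEOUT_S, max_chars=MAX_OUTPUT_CHARS):
    """Run argv in a throwaway directory with a time limit and capped output."""
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                argv,
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising.
            return {"status": "timeout", "returncode": None,
                    "output": "", "truncated": False}
    output = proc.stdout + proc.stderr
    return {
        "status": "ok" if proc.returncode == 0 else "error",
        "returncode": proc.returncode,
        "output": output[:max_chars],
        "truncated": len(output) > max_chars,
    }


result = run_limited([sys.executable, "-c", "print('hello from the workspace')"])
print(result["status"], result["output"].strip())
```

Returning a status instead of raising lets a debugging loop treat a timeout as one more failure to feed back into the next attempt.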

Data Analysis

Compute metrics, generate visualizations, and interpret results with statistical rigor and reproducibility.

Report Writing

Draft structured experimental reports and documentation with proper methodology sections and result summaries.

Multi-Agent Orchestration

Coordinate up to 3 concurrent sub-agents with automatic task routing. Built on LangGraph for reliable state management.
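
The concurrency cap can be pictured with plain `asyncio`. This sketch does not use LangGraph and is not EvoScientist's implementation; `run_all`, `fake_agent`, and the task names are hypothetical. A semaphore lets at most three sub-agent coroutines run at once while the remaining tasks wait their turn.

```python
import asyncio

MAX_CONCURRENT = 3  # from the description above


async def run_all(tasks, worker, limit=MAX_CONCURRENT):
    """Run worker(task) for every task, with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def guarded(task):
        async with sem:  # waits here while `limit` workers are already busy
            return await worker(task)

    # gather returns results in the same order as the input tasks
    return await asyncio.gather(*(guarded(t) for t in tasks))


async def demo():
    active = 0
    peak = 0

    async def fake_agent(task):
        # Stand-in for a real sub-agent call; tracks how many run at once.
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1
        return f"{task}: done"

    results = await run_all([f"task-{i}" for i in range(7)], fake_agent)
    return results, peak


results, peak = asyncio.run(demo())
print(len(results), "tasks finished; peak concurrency:", peak)
```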

Multi-Provider

Your models, your choice

Plug in any major LLM provider. Auto-detect model names or specify full IDs directly.

Anthropic
claude-opus-4-6 · claude-sonnet-4-5 · claude-haiku-4-5
OpenAI
gpt-4o · o1 · o1-mini
Google
gemini-3-pro · gemini-2.5-pro · gemini-2.5-flash
NVIDIA
deepseek-v3.1 · nemotron-nano · glm4.7
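
One way to picture model-name auto-detection is a lookup from the short names listed above to their provider. This is a hypothetical sketch: the table and the `detect_provider` function are illustrative, not EvoScientist's actual API. The page does not specify the format of full IDs, so they are not handled here.

```python
# Model names as listed on this page; the mapping itself is illustrative.
PROVIDER_MODELS = {
    "anthropic": ["claude-opus-4-6", "claude-sonnet-4-5", "claude-haiku-4-5"],
    "openai": ["gpt-4o", "o1", "o1-mini"],
    "google": ["gemini-3-pro", "gemini-2.5-pro", "gemini-2.5-flash"],
    "nvidia": ["deepseek-v3.1", "nemotron-nano", "glm4.7"],
}


def detect_provider(model):
    """Return the provider whose list contains `model` (exact match only)."""
    for provider, names in PROVIDER_MODELS.items():
        if model in names:
            return provider
    raise ValueError(f"unknown model name: {model!r}")


print(detect_provider("claude-sonnet-4-5"))
print(detect_provider("o1-mini"))
```

Exact matching avoids prefix ambiguity, for example between `o1` and `o1-mini`, at the cost of updating the table whenever a new model is added.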

Built by the EvoScientist Team

Stay tuned.
Big things are coming.

Benchmarks · Web Interface · More agents

EvoScientist.ai@gmail.com