diff --git a/README.md b/README.md
index b4aefaa..da1ff8d 100644
--- a/README.md
+++ b/README.md
@@ -4,283 +4,114 @@
 # PhD-Zero -### An Operating System for the Autonomous AI Scientist - -
- 简体中文 • - English • - What is PhD-Zero? • - Why it exists • - Skill Stack • - How it works • - Quick Start • - Roadmap • - Contributing • - Acknowledgements • - Cite +
An operating system for research-oriented coding agents.
+ ++ 简体中文 · + Website · + Quick Start · + Core Skills · + Contributing
-**Turn research workflows into reusable agent skills.** -**From literature search to experiment execution, memory, and paper writing.** - -*Built for Codex, Claude Code, and the future autonomous AI researcher.* - ---- - -## 👁️ What is PhD-Zero? - -**PhD-Zero** is an open-source **AI R&D operating layer** for coding agents. - -Instead of treating research as a single prompt, PhD-Zero breaks it into structured, reusable skills: -- plan the work, -- search for evidence, -- execute experiments, -- manage memory, -- request human review when needed, -- and turn results into research artifacts. - -The goal is simple: - -- **Near term:** build an **intern-level AI researcher** that can reliably handle scoped R&D tasks. -- **Long term:** build an **autonomous AI scientist** that can navigate the full algorithm-development lifecycle. - -In short, **PhD-Zero is the systems layer between raw model capability and real research execution.** - ---- - -## 🔥 Why it exists - -Today, strong models can already code, read papers, and debug scripts. - -What they still lack is **research discipline**. - -Real AI R&D is not just “generate an answer.” It requires: -- staged execution, -- evidence-backed reasoning, -- memory across steps, -- controlled experimentation, -- and human checkpoints for expensive or risky decisions. - -PhD-Zero exists to provide that missing structure. - -It helps agents move: -- **from vague ideas to executable plans** -- **from one-shot prompting to reusable workflows** -- **from hallucinated confidence to evidence-backed outputs** -- **from isolated tasks to compounding research memory** - ---- - -## 🛠️ Core skill stack - -PhD-Zero decomposes AI research into modular skills that agents can discover and invoke. 
- -| Skill | Role in the system | Human analogy | -| --- | --- | --- | -| `run-governor` | Controls stages, execution discipline, and run safety | PI / project lead | -| `research-workflow` | Default orchestration loop for non-trivial research tasks | Research manager | -| `research-plan` | Turns open-ended goals into concrete plans, ablations, and study designs | Senior researcher | -| `deep-research` | Collects external evidence, compares literature, and synthesizes findings | Literature reviewer | -| `experiment-execution` | Runs code, debugs failures, and executes experiments | Research engineer | -| `memory-manager` | Maintains working state and promotes reusable memory | Working + long-term memory | -| `project-context` | Persists project-specific runtime context and conventions | Lab notebook | -| `human-checkpoint` | Escalates risky, expensive, or high-impact decisions | Advisor / reviewer | -| `paper-writing` | Drafts and revises research artifacts and paper text | Scientific writer | - -> **Research is not one capability. It is a coordinated system of capabilities.** - ---- - -## ⚙️ How it works - -PhD-Zero is designed as a **shared skill layer** for different coding agents. - -### For Codex / GitHub Copilot-style agents -Repository-level behavior is defined through `AGENTS.md`. - -### For Claude Code -Skills are exposed through `.claude/skills/`. - -### Source of truth -The canonical skill definitions live in: - -```text -.agents/skills/ -``` - -This lets one repository drive multiple agent runtimes with the same research workflow logic. - ---- - -## 📂 Repository structure - -```text -. -├── AGENTS.md # Global operating rules for the workspace -├── REPO_CONVENTIONS.md # Artifact, logging, and repo hygiene rules -├── .agents/ -│ └── skills/ # Canonical skill definitions -├── .claude/ -│ └── skills/ # Claude Code discovery layer -├── .github/ -│ └── workflows/ # Repo automation -└── README.md -``` - ---- - -## ✨ What makes this different? 
- -Most “AI researcher” repos are one of these: - -* a benchmark, -* a paper list, -* a single-agent demo, -* or a prompt pack. - -PhD-Zero is different because it focuses on **operationalizing research**. - -It is not just trying to make an agent sound smart. -It is trying to make an agent **work like a research system**. - -That means: +PhD-Zero is a repository of reusable skills for AI research work. The point is not to make an agent sound smart for one turn. The point is to give it a workflow it can actually follow: plan the task, gather evidence, run experiments, keep context, ask for human review when needed, and write results down in a form another person can inspect. -* explicit execution stages, -* controlled memory usage, -* evidence-first decisions, -* reusable skills instead of ad hoc prompts, -* and compatibility with multiple coding-agent environments. +The same skill library is exposed to different runtimes. Codex-style agents read workspace rules from `AGENTS.md`. Claude Code sees a mirrored discovery layer under `.claude/skills/`. The actual source of truth lives in `.agents/skills/`. ---- +## Quick start -## 🚀 Quick start - -### 1. Clone the repository +If you just want to see whether the repo is wired correctly, do this: ```bash git clone https://github.com/TenureAI/PhD-Zero.git cd PhD-Zero -``` -### 2. Inspect the skill library - -```bash find .agents/skills -mindepth 1 -maxdepth 1 -type d find .claude/skills -mindepth 1 -maxdepth 1 -type l ``` -### 3. Connect your agent runtime - -* **Codex / Copilot-style agents** read workspace rules from `AGENTS.md` -* **Claude Code** discovers the same capabilities through `.claude/skills/` - -If both commands in step 2 list the same skill names, the shared skill layer is wired correctly. - -### 4. Start with a real research task - -Examples: +Those two commands should list the same skill names. If they do, the shared skill layer is in place. 
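The comparison of the two listings can also be scripted instead of read by eye. The following is a minimal sketch, assuming the mirror entries under `.claude/skills/` are symlinks (as the `-type l` filter above suggests) and that skill names contain no newlines. The function names `skill_names` and `check_skill_sync` are hypothetical and are not part of the repository or its CI.

```bash
# Hypothetical helpers for illustration; they are not part of PhD-Zero.

# Print the sorted entry names directly under directory $1.
# $2 is the `find -type` letter: d for directories, l for symlinks.
skill_names() {
  find "$1" -mindepth 1 -maxdepth 1 -type "$2" | sed 's|.*/||' | sort
}

# Compare the canonical skill directories ($1) with the mirror symlinks ($2).
# Prints "in sync" and returns 0 when the names match; otherwise returns 1.
check_skill_sync() {
  canonical="$(skill_names "$1" d)"
  mirror="$(skill_names "$2" l)"
  if [ "$canonical" = "$mirror" ]; then
    echo "in sync"
  else
    echo "out of sync" >&2
    return 1
  fi
}
```

From the repository root, `check_skill_sync .agents/skills .claude/skills` would run the comparison. Only names are compared; the sketch does not check that each symlink resolves to the matching canonical directory.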
-* reproduce a paper result -* investigate why a training run failed -* design an ablation plan -* compare methods for a new project direction -* draft a research report from experiment evidence +From there: ---- +1. Read `AGENTS.md` to understand the workspace rules used by Codex-style agents. +2. Inspect `.agents/skills/` if you want the canonical skill implementations. +3. Inspect `.claude/skills/` if you want to verify the Claude Code mirror. -## 🧭 Roadmap +If you prefer a landing page over the raw repository view, there is also a static site under [docs/index.html](./docs/index.html). -PhD-Zero is the first layer of a larger vision. +## What is in this repository? -### Phase 1 — AI Research Intern +The repository is intentionally small. It does not try to be a benchmark suite, a framework, and a demo app all at once. It is mostly a skill library plus the rules that tell agents how to use it. -* scoped literature investigation -* experiment planning -* code execution and debugging -* report drafting - -### Phase 2 — AI Research Collaborator - -* reusable memory across projects -* stronger project context persistence -* better experiment iteration loops -* improved human-in-the-loop checkpoints - -### Phase 3 — Autonomous AI Scientist - -* independent hypothesis generation -* self-directed experimentation -* long-horizon project execution -* end-to-end algorithm development - ---- - -## 🤝 Contributing - -We are building this in the open. +```text +. +├── AGENTS.md +├── REPO_CONVENTIONS.md +├── .agents/skills/ # canonical skill definitions +├── .claude/skills/ # Claude Code mirror layer +├── .github/workflows/ # repository validation +├── assets/ # shared visual assets +└── docs/ # static landing page +``` -You can contribute by: +The CI in this repo checks that the skill directories under `.agents/skills` and `.claude/skills` stay in sync, and that every tracked skill has a readable `SKILL.md`. -1. adding new agent skills, -2. improving existing workflows, -3. 
contributing better evaluation tasks for AI R&D, -4. or using PhD-Zero in real research loops and sharing what breaks. +## Core skills -If you care about the future of **AI doing AI research**, this repo is for you. +The current skill set covers the basic loop of a research-oriented agent: ---- +| Skill | What it is for | +| --- | --- | +| `run-governor` | Stage control, run discipline, and execution policy | +| `research-workflow` | The default loop for non-trivial research tasks | +| `research-plan` | Turning an open-ended goal into a concrete plan | +| `deep-research` | External search, literature comparison, and synthesis | +| `experiment-execution` | Running code, debugging, and experiment execution | +| `memory-manager` | Working state and reusable memory | +| `project-context` | Project-specific runtime context and conventions | +| `human-checkpoint` | Human review for risky or expensive decisions | +| `paper-writing` | Drafting and revising research artifacts | -## 🙏 Acknowledgements +That list will probably grow, but the idea is stable: break research into pieces that can be reused instead of trying to solve everything with one giant prompt. -PhD-Zero builds on the broader agent, tooling, and writing-assistance ecosystem. +## Who this is for -We especially want to acknowledge: +PhD-Zero is for people who are already using coding agents in research or engineering-adjacent work and want more discipline around the process. If you care about literature review, experiment planning, reproducibility, or keeping an agent from improvising its way through a long task, this repo is meant to be useful. If you just want a flashy demo, it is probably not the right project. 
-- the contributors behind reusable coding-agent environments and workflow conventions -- the open-source communities building skill systems for Codex, Claude Code, and adjacent agent runtimes -- [blader/humanizer](https://github.com/blader/humanizer/tree/main), for practical text humanization patterns -- [op7418/Humanizer-zh](https://github.com/op7418/Humanizer-zh), for Chinese-focused humanization references +## Contributing -These projects are not dependencies of PhD-Zero, but they helped shape how we think about reusable agent behaviors and writing support. +Contributions are welcome, especially in three areas: ---- +1. new skills that fit the repository's scope +2. tighter workflows for the existing skills +3. validation and examples from real usage -## 💡 Philosophy +Before opening a PR, check `REPO_CONVENTIONS.md`. This repo keeps reusable skill content in version control and keeps task-specific logs or run artifacts out. -PhD-Zero is based on one belief: +## Acknowledgements -> **AI research should become programmable.** +PhD-Zero is shaped by the broader ecosystem around coding agents, research tooling, and writing support. In particular, the repository draws useful ideas from projects that treat workflows as first-class artifacts rather than one-off prompts. -Not just smarter outputs. -Not just better prompts. -But reusable, inspectable, evolving research workflows. +We also want to acknowledge: -That is the first step toward the autonomous AI scientist. +- [blader/humanizer](https://github.com/blader/humanizer/tree/main) +- [op7418/Humanizer-zh](https://github.com/op7418/Humanizer-zh) ---- +These are not runtime dependencies here, but they were useful references when thinking about writing quality and reusable editing guidance. 
-## 📚 Cite +## Cite -If PhD-Zero is useful in your workflow or research, please cite: +If PhD-Zero is useful in your workflow or research, you can cite it as: ```bibtex @misc{phd_zero_github, author = {TenureAI Contributors}, - title = {PhD-Zero: An Operating System for the Autonomous AI Scientist}, + title = {PhD-Zero: An Operating System for Research-Oriented Coding Agents}, year = {2026}, howpublished = {\url{https://github.com/TenureAI/PhD-Zero}}, note = {GitHub repository} } ``` - ---- - -- English • - 简体中文 • - What is PhD-Zero? • - Why build it • - Skill Stack • - How it works • - Quick Start • - Roadmap • - Contributing • - Acknowledgements • - Cite +
A workflow operating layer for research-oriented coding agents.
+ ++ English · + Website · + Quick Start · + Core Skills · + Contributing
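-**Turn research workflows into reusable agent skills.** -**From literature search and experiment execution to memory management and paper writing.** - -*Built for Codex, Claude Code, and the future Autonomous AI Researcher.* - ---- - -## 👁️ What is PhD-Zero? - -**PhD-Zero** is an open-source **AI R&D operating layer** for coding agents. - -Instead of treating research as a one-off prompt, it breaks the work into a set of structured, reusable skills: - -- plan the task -- search for evidence -- run experiments -- manage memory -- request human review when needed -- turn results into research artifacts - -The goal is clear: - -- **Near term:** build an **AI Research Intern** that can reliably handle scoped R&D tasks -- **Long term:** build an **Autonomous AI Scientist** that can span the full algorithm-development lifecycle - -In one sentence, **PhD-Zero is the systems layer that connects raw model capability to real research execution.** - ---- - -## 🔥 Why build it? - -Today's strong models can already write code, read papers, and debug scripts. - -What they generally lack is **research discipline**. - -Real AI R&D is not just “generating an answer”; it also requires: - -- staged execution -- evidence-based reasoning -- memory across steps -- controlled experiments -- human checkpoints for expensive or high-risk decisions - -PhD-Zero exists to supply that layer of structure. - -It helps agents make these shifts: - -- **from vague ideas to executable plans** -- **from one-shot prompting to reusable workflows** -- **from hallucinated confidence to evidence-driven outputs** -- **from isolated tasks to cumulative research memory** - ---- - -## 🛠️ Core Skill Stack - -PhD-Zero breaks AI research into a set of modular skills that agents can discover and invoke. - -| Skill | Role in the system | Human analogy | -| --- | --- | --- | -| `run-governor` | Controls stages, execution discipline, and run safety policy | PI / project lead | -| `research-workflow` | Default orchestration loop for non-trivial research tasks | Research manager | -| `research-plan` | Turns open-ended goals into concrete plans, ablations, and study designs | Senior researcher | -| `deep-research` | Collects external evidence, compares literature, and synthesizes conclusions | Literature reviewer | -| `experiment-execution` | Runs code, locates failures, and executes experiments | Research engineer | -| `memory-manager` | Maintains working state and reusable memory | Working memory + long-term memory | -| `project-context` | Persists project runtime context and conventions | Lab notebook | -| `human-checkpoint` | Escalates high-risk, high-cost, or high-impact decisions | Advisor / reviewer | -| `paper-writing` | Drafts and revises papers and research documents | Scientific writer | - -> **Research is not a single capability; it is a coordinated system of capabilities.** - ---- - -## ⚙️ How does it work? 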
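 - -PhD-Zero is designed as a **shared skill layer** for different coding agents. - -### For Codex / GitHub Copilot-style agents - -Repository-level behavior is defined by `AGENTS.md`. - -### For Claude Code - -Skills are exposed through `.claude/skills/`. - -### Single source of truth - -The actual skill definitions all live in: - -```text -.agents/skills/ -``` - -This lets one repository provide the same research workflow logic to multiple agent runtimes. - ---- +PhD-Zero is a repository of reusable research skills. Its goal is not to make an agent “look smart” in a single turn, but to give it a research workflow it can actually execute: how to plan the task, how to find evidence, how to run experiments, how to keep context, when to ask a person for confirmation, and finally how to write the results up as research artifacts that others can inspect. -## 📂 Repository structure +The same set of skills is exposed to different runtimes. Codex-style agents mainly read workspace rules from `AGENTS.md`, Claude Code discovers the mirror layer through `.claude/skills/`, and the actual source of truth lives in `.agents/skills/`. -```text -. -├── AGENTS.md # Global behavior rules for the workspace -├── REPO_CONVENTIONS.md # Artifact, logging, and repo hygiene rules -├── .agents/ -│ └── skills/ # Actual skill definitions -├── .claude/ -│ └── skills/ # Claude Code discovery layer -├── .github/ -│ └── workflows/ # Repo automation -├── assets/ # Resources for the README and other docs -└── README.md -``` - ---- - -## ✨ How is it different from other repositories? - -Many “AI researcher” repositories end up as one of these: - -- a benchmark -- a paper list -- a single-agent demo -- a prompt pack - -What sets PhD-Zero apart is its focus on **truly operationalizing research**. - -It is not just trying to make an agent “look smart”; it is trying to make an agent **work like a research system**. - -That means: +## Quick start -- explicit execution stages -- controlled memory usage -- evidence-first decisions -- reusable skills instead of scattered prompts -- compatibility with multiple coding-agent environments - ---- - -## 🚀 Quick start - -### 1. Clone the repository +If you just want to confirm first that this repository's skill layer is wired up, run the following commands: ```bash git clone https://github.com/TenureAI/PhD-Zero.git cd PhD-Zero -``` - -### 2. Inspect the skill library -```bash find .agents/skills -mindepth 1 -maxdepth 1 -type d find .claude/skills -mindepth 1 -maxdepth 1 -type l ``` -If these two commands list the same skill names, the shared skill layer is connected. - -### 3. Connect your agent runtime - -- **Codex / Copilot-style agents** read workspace rules from `AGENTS.md` -- **Claude Code** discovers the same capabilities through `.claude/skills/` - -### 4. Start with a real research task +If both commands list the same skill names, the shared skill layer is wired up. -For example: +A typical next step looks like this: -- reproduce a paper result -- analyze why a training job failed -- design an ablation plan -- compare multiple methods for a new project direction -- draft a research report from experiment results +1. Read `AGENTS.md` first to understand the workspace rules for Codex-style agents. +2. Then look at `.agents/skills/`, which holds the actual skill implementations. +3. 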
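 To see how Claude Code discovers skills, look at `.claude/skills/` next. ---- +If you prefer a landing-page style entry point over browsing the repository directly, you can also open [docs/index.html](./docs/index.html). -## 🧭 Roadmap +## What is in this repository -PhD-Zero is the first layer of a larger vision. +This repository is deliberately kept restrained. It is not a mix of benchmark, framework, and demo app; the bulk of it is the skill library itself, plus a few rules and checks. -### Phase 1: AI Research Intern - -- scoped literature investigation -- experiment planning -- code execution and debugging -- report drafting - -### Phase 2: AI Research Collaborator - -- reusable memory across projects -- stronger project context persistence -- better experiment iteration loops -- more mature human-in-the-loop checkpoints - -### Phase 3: Autonomous AI Scientist - -- independent hypothesis generation -- self-directed experimentation -- long-horizon project execution -- end-to-end algorithm development - ---- - -## 🤝 Contributing - -We are building this in the open. +```text +. +├── AGENTS.md +├── REPO_CONVENTIONS.md +├── .agents/skills/ # actual skill definitions +├── .claude/skills/ # Claude Code mirror discovery layer +├── .github/workflows/ # repository validation +├── assets/ # shared visual assets +└── docs/ # static landing page +``` -You can take part by: +The CI in this repository mainly checks two things: whether the skill directories under `.agents/skills` and `.claude/skills` are in sync, and whether every tracked skill has a parseable `SKILL.md`. -1. adding new agent skills -2. improving existing workflows -3. contributing better AI R&D evaluation tasks -4. using PhD-Zero in real research loops and reporting where it breaks +## Core Skills -If you care about the future of **AI doing AI research**, this repository is for you. +The current set of skills covers the basic loop of a research-oriented agent: ---- +| Skill | Purpose | +| --- | --- | +| `run-governor` | Manages stages, execution discipline, and run policy | +| `research-workflow` | Default execution loop for non-trivial research tasks | +| `research-plan` | Narrows an open-ended goal into a concrete plan | +| `deep-research` | External search, literature comparison, and synthesis | +| `experiment-execution` | Runs code, debugs, and executes experiments | +| `memory-manager` | Manages working state and reusable memory | +| `project-context` | Keeps project-level runtime context and conventions | +| `human-checkpoint` | Asks a person to confirm at high-risk or high-cost points | +| `paper-writing` | Drafts and revises research artifacts | -## 🙏 Acknowledgements +This list will keep growing, but the basic idea will not change: break research into reusable modules instead of counting on one very long prompt to handle everything. -PhD-Zero was inspired by the broader ecosystem of agents, toolchains, and writing assistance. +## Who this repository is for -We especially thank: +If you already use coding agents in research or engineering work and have started to care about process discipline, experiment reproducibility, keeping context over long tasks, or reducing the risk of an agent improvising its way through, this repository will most likely be useful to you. If you just want to see a flashy demo, it may not be the best fit. -- the open-source communities that contributed to coding-agent environments and workflow conventions -- the contributors building skill systems for Codex, Claude Code, and related agent runtimes -- [blader/humanizer](https://github.com/blader/humanizer/tree/main), for practical ideas on humanizing text -- [op7418/Humanizer-zh](https://github.com/op7418/Humanizer-zh), for Chinese-focused humanized-writing references +## Contributing -These projects are not direct dependencies of PhD-Zero, but they influenced how we design “reusable 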
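 agent behaviors” and “writing support capabilities”. +Contributions are welcome, especially in these three areas: ---- +1. new skills, provided they genuinely fit the repository's scope +2. tightening and improving the existing workflows +3. validation and feedback from real usage -## 💡 Philosophy +Before opening a PR, read through `REPO_CONVENTIONS.md`. This repository tracks only reusable skill content and keeps one-off run logs and task artifacts out. -PhD-Zero rests on one core belief: +## Acknowledgements -> **AI research should become programmable.** +PhD-Zero was shaped by the ecosystem around coding agents, research tooling, and writing assistance. Projects that treat workflows as first-class objects, rather than using prompts as one-off scripts, had a strong influence on this repository's direction. -Not just smarter outputs. -Not just better prompts. -But reusable, inspectable, evolving research workflows. +We also want to thank: -That is the first step toward the autonomous AI scientist. +- [blader/humanizer](https://github.com/blader/humanizer/tree/main) +- [op7418/Humanizer-zh](https://github.com/op7418/Humanizer-zh) ---- +They are not runtime dependencies here, but they were valuable references when thinking about writing quality and reusable editing guidelines. -## 📚 Cite +## Cite If PhD-Zero is useful in your workflow or research, you can cite it like this: ```bibtex @misc{phd_zero_github, author = {TenureAI Contributors}, - title = {PhD-Zero: An Operating System for the Autonomous AI Scientist}, + title = {PhD-Zero: An Operating System for Research-Oriented Coding Agents}, year = {2026}, howpublished = {\url{https://github.com/TenureAI/PhD-Zero}}, note = {GitHub repository} } ``` - ---- - -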