Files
analysis_claude_code/s05_todo_write/README.md
gui-yue 1baf1aca5a Follow up PR #265: refine chapters, diagrams, and add S20 (#283)
* feat: s01-s14 docs quality overhaul — tool pipeline, single-agent, knowledge & resilience

Rewrite code.py and README (zh/en/ja) for s01-s14, each chapter building
incrementally on the previous. Key fixes across chapters:

- s01-s04: agent loop, tool dispatch, permission pipeline, hooks
- s05-s08: todo write, subagent, skill loading, context compact
- s09-s11: memory system, system prompt assembly, error recovery
- s12-s14: task graph, background tasks, cron scheduler

All chapters CC source-verified. Code inherits fixes forward (PROMPT_SECTIONS,
json.dumps cache, real-state context, can_start dep protection, etc.).

* feat: s15-s19 docs quality overhaul — multi-agent platform: teams, protocols, autonomy, worktree, MCP tools

Rewrite code.py and README (zh/en/ja) for s15-s19, the multi-agent platform
chapters. Each chapter inherits all previous fixes and adds one mechanism:

- s15: agent teams (TeamCreate, teammate threads, shared task list)
- s16: team protocols (plan approval, shutdown handshake, consume_inbox)
- s17: autonomous agents (idle polling, auto-claim, consume_lead_inbox)
- s18: worktree isolation (git worktree, bind_task, cwd switching, safety)
- s19: MCP tools (MCPClient, normalize_mcp_name, assemble_tool_pool, no cache)

All appendix source code references verified against CC source. Config priority
corrected: claude.ai < plugin < user < project < local.

* fix: 5 regressions across s05-s19 — glob safety, todo validation, memory extraction, protocol types, dep crash

- s05-s09: glob results now filter with is_relative_to(WORKDIR) (inherited from s02)
- s06-s08: todo_write validates content/status required fields (inherited from s05)
- s09: extract_memories uses pre-compression snapshot instead of compacted messages
- s16: submit_plan docstring clarifies protocol-only (not code-level gate)
- s17-s19: match_response restores type mismatch validation (from s16)
- s17-s19: claim_task deps list handles missing dep files without crashing

* fix: s12 Todo V2 logic reversal, s14/s15 cron range validation, s18/s19 worktree name validation

- s12 README (zh/en/ja): fix Todo V2 direction — interactive defaults to Task,
  non-interactive/SDK defaults to TodoWrite. Fix env var name to
  CLAUDE_CODE_ENABLE_TASKS (not TODO_V2).
- s14/s15: add _validate_cron_field with per-field range checks (minute 0-59,
  hour 0-23, dom 1-31, month 1-12, dow 0-6), step > 0, range lo <= hi.
  Replace old try/except validation that only caught exceptions.
- s18/s19: add validate_worktree_name() to remove_worktree and keep_worktree,
  not just create_worktree.

* fix: align s16-s19 teaching tool consistency

* fix pr265 chapter diagrams

* Add comprehensive s20 harness chapter

* Fix chapter smoke test regressions

* Clarify README tutorial track transition

---------

Co-authored-by: Haoran <bill-billion@outlook.com>
2026-05-20 21:45:38 +08:00

157 lines
6.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# s05: TodoWrite — 没有计划的 Agent做着做着就偏了
[中文](README.md) · [English](README.en.md) · [日本語](README.ja.md)
s01 → s02 → s03 → s04 → `s05` → [s06](../s06_subagent/) → s07 → ... → s20
> *"没有计划的 agent 走哪算哪"* — 先列步骤再动手,长任务更不容易漏项。
>
> **Harness 层**: 规划 — 让 Agent 在动手之前先想清楚。
---
## 问题
给 Agent 一个复杂任务:"把所有 Python 文件改成 snake_case 命名,然后跑测试,修好失败。"
Agent 开始干活,改了 3 个文件,跑了个测试,发现 2 个失败,开始修。修着修着,它忘了最初是"改成 snake_case",测试失败把注意力全吸走了。
对话越长越严重:工具结果不断填满上下文,系统提示的影响力被稀释。一个 10 步重构,做完 1-3 步就开始即兴发挥,因为 4-10 步已经被挤出注意力了。
---
## 解决方案
![Todo Overview](images/todo-overview.svg)
保留上一章的最小 hook 结构,重点看新增的 `todo_write` 工具和 reminder 机制。`todo_write` 本身不做任何实际工作,不能读文件、不能跑命令,只是让 Agent 在动手之前先理清思路。
dispatch 机制不变,新工具仍然走 `TOOL_HANDLERS[block.name]` 分发。但为了演示 todo reminder循环里加了一个计数器连续 3 轮没调 `todo_write` 就注入一条提醒。
---
## 工作原理
**todo_write 工具**,接收一个带状态的列表,持久化到 `.tasks/current_todos.json`(教学版写盘以便观察),同时在终端显示进度:
```python
def run_todo_write(todos: list) -> str:
tasks_file = TASKS_DIR / "current_todos.json"
tasks_file.write_text(json.dumps(todos, indent=2, ensure_ascii=False))
lines = ["\n## Current Tasks"]
for t in todos:
icon = {"pending": " ", "in_progress": "", "completed": ""}[t["status"]]
lines.append(f" [{icon}] {t['content']}")
print("\n".join(lines))
return f"Updated {len(todos)} tasks"
```
工具定义和其他 5 个工具一起加入 dispatch map
```python
TOOLS = [
{"name": "bash", ...},
{"name": "read_file", ...},
{"name": "write_file", ...},
{"name": "edit_file", ...},
{"name": "glob", ...},
# s05: 新增一条
{"name": "todo_write", "description": "Create and manage a task list ...",
"input_schema": {
"type": "object",
"properties": {
"todos": {
"type": "array",
"items": {
"type": "object",
"properties": {
"content": {"type": "string"},
"status": {"type": "string", "enum": ["pending", "in_progress", "completed"]},
},
},
},
},
},
},
]
TOOL_HANDLERS["todo_write"] = run_todo_write
```
**Nag reminder**,模型连续 3 轮没调 `todo_write`自动注入一条提醒教学版机制CC 源码中没有这个固定轮数逻辑):
```python
if rounds_since_todo >= 3 and messages:
messages.append({
"role": "user",
"content": "<reminder>Update your todos.</reminder>",
})
rounds_since_todo = 0
```
Agent 收到任务后的典型流程:先调 `todo_write` 列出所有步骤(全 `pending`)→ 做一个步骤,改成 `in_progress` → 做完改成 `completed` → 看下一个 `pending` → 继续。连续 3 轮没有调用 `todo_write` 时,循环会在下一次 LLM 调用前追加一条 reminder。
**关键洞察**todo_write 不给 Agent 增加任何**执行能力**。它增加的是**规划能力**。
---
## 相对 s04 的变更
| 组件 | 之前 (s04) | 之后 (s05) |
|------|-----------|-----------|
| 工具数量 | 5 (bash, read, write, edit, glob) | 6 (+todo_write) |
| 规划能力 | 无 | 带状态的 TODO 列表 + nag reminder |
| SYSTEM 提示 | 通用提示 | 加入 "先计划再执行" 引导 |
| 循环 | 不变 | dispatch 不变,新增 rounds_since_todo 计数器和 reminder 注入 |
---
## 试一下
```sh
cd learn-claude-code
python s05_todo_write/code.py
```
试试这些 prompt
1. `Refactor s05_todo_write/example/hello.py: add type hints, docstrings, and a main guard`(先列 3 步再执行)
2. `Create a Python package under s05_todo_write/example/demo_pkg with __init__.py, utils.py, and tests/test_utils.py`
3. `Review Python files under s05_todo_write/example and fix any style issues`
观察重点:第一次工具调用是不是 `todo_write`TODO 列了几步?执行过程中状态有没有从 `pending` 变成 `in_progress` / `completed`
---
## 接下来
Agent 能计划了。但如果一个任务太大,比如"重构整个认证模块",光靠 TODO 列表不够。这个任务本身就是几十个小任务的集合,放在同一个对话里会被上下文淹没。
s06 Subagent → 把大任务拆成子任务,每个子任务派一个独立的 Agent。它们有自己的干净上下文不会互相污染。
<details>
<summary>深入 CC 源码</summary>
CC 中有两套任务系统并存(`tasks.ts:133-139`
- **TodoWriteV1**:一个简单的列表工具,数据在内存 AppState 中维护(`TodoWriteTool.ts:65-103`)。教学版写盘到 `.tasks/current_todos.json` 是为了可观察性,真实 V1 不写盘
- **Task SystemV2 = s12**文件持久化、依赖图、并发锁、ownership
切换由 `isTodoV2Enabled()` 控制。当前源码的实现逻辑:交互式会话中 V2 默认启用非交互式会话SDK中 V1 默认启用;设置 `CLAUDE_CODE_ENABLE_TASKS` 环境变量可强制启用 V2。注意源码注释 "Force-enable tasks in non-interactive mode" 描述的是 env var 路径的用途,和默认分支的返回值语义不同,阅读时需区分。
教学版省略了真实源码中的 `activeForm` 字段(`utils/todo/types.ts:8-15`。CC 用它给 UI spinner 展示"正在做什么",教学版只有终端输出,不需要这个字段。
教学版的 nag reminder3 轮未更新就注入提醒是教学机制。CC 源码中没有固定的"3 轮"逻辑,更接近的是 `TodoWriteTool.ts:72-107` 中当 3 个以上 todo 全部完成但没有 verification 项时,追加 verification nudge。
Task System 相比 TodoWrite 的核心增量:
- 文件持久化Claude 配置目录下 `tasks/{taskListId}/{taskId}.json`)而非内存列表
- `blockedBy` 依赖图而非平铺列表
- `proper-lockfile` 并发安全而非无锁
- 四个独立工具Create/Get/Update/List而非一个
- TaskCreated / TaskCompleted hooks`TaskCreateTool.ts:80-129``TaskUpdateTool.ts:231-260`)供外部系统集成
</details>
<!-- translation-sync: zh@v1, en@v0, ja@v0 -->