analysis_claude_code/s05_todo_write/README.md
gui-yue 1baf1aca5a Follow up PR #265: refine chapters, diagrams, and add S20 (#283)
Co-authored-by: Haoran <bill-billion@outlook.com>
2026-05-20 21:45:38 +08:00


s05: TodoWrite (an agent without a plan drifts as it goes)

中文 · English · 日本語

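s01 → s02 → s03 → s04 → s05 → s06 → s07 → ... → s20

"An agent without a plan goes wherever the work takes it." List the steps before acting, and long tasks are less likely to drop items.

Harness layer: planning. Let the agent think things through before it acts.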


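Problem

Give the agent a complex task: "Rename all Python files to snake_case, then run the tests and fix the failures."

The agent starts working: it changes 3 files, runs the tests, finds 2 failures, and starts fixing them. Somewhere along the way it forgets that the original goal was the snake_case rename, because the failing tests have absorbed all of its attention.

The longer the conversation, the worse this gets: tool results keep filling the context, and the influence of the system prompt is diluted. In a 10-step refactor, the agent may finish steps 1-3 and then start improvising, because steps 4-10 have been crowded out of its attention.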


Solution

Todo Overview

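This chapter keeps the minimal hook structure from the previous chapter and focuses on two additions: the todo_write tool and the reminder mechanism. todo_write does no real work itself: it cannot read files or run commands. It only gets the agent to sort out its plan before acting.

The dispatch mechanism is unchanged: the new tool is still routed through TOOL_HANDLERS[block.name]. To demonstrate the todo reminder, the loop gains a counter, and after 3 consecutive rounds without a todo_write call it injects a reminder.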


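How It Works

The todo_write tool takes a list of items with statuses, persists it to .tasks/current_todos.json (the teaching version writes to disk so you can inspect it), and prints progress to the terminal: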

def run_todo_write(todos: list) -> str:
    tasks_file = TASKS_DIR / "current_todos.json"
    tasks_file.write_text(json.dumps(todos, indent=2, ensure_ascii=False))

    lines = ["\n## Current Tasks"]
    for t in todos:
        icon = {"pending": " ", "in_progress": "▸", "completed": "✓"}[t["status"]]
        lines.append(f"  [{icon}] {t['content']}")
    print("\n".join(lines))
    return f"Updated {len(todos)} tasks"
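The handler above indexes the icon table with t["status"] directly, so a malformed item would raise a KeyError. A minimal sketch of a pre-check that could run first is shown below. The name validate_todos and the wording of the messages are hypothetical and not taken from code.py; only the two field names and the three status values come from the tool schema.

```python
# Hypothetical helper: validate_todos is an illustrative name, not from code.py.
VALID_STATUSES = ("pending", "in_progress", "completed")


def validate_todos(todos):
    """Return a list of problems found in the payload; an empty list means it is usable."""
    if not isinstance(todos, list):
        return ["todos must be a list"]
    problems = []
    for i, item in enumerate(todos):
        if not isinstance(item, dict):
            problems.append(f"todos[{i}] must be an object")
            continue
        content = item.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"todos[{i}].content must be a non-empty string")
        if item.get("status") not in VALID_STATUSES:
            problems.append(f"todos[{i}].status must be one of {list(VALID_STATUSES)}")
    return problems


print(validate_todos([{"content": "rename files", "status": "pending"}]))
print(validate_todos([{"content": "", "status": "done"}]))
```

Returning the problems as text, rather than raising, would let the loop hand them back to the model as an ordinary tool result so it can resend a corrected list.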

The tool definition joins the other 5 tools, and the handler is added to the dispatch map:

TOOLS = [
    {"name": "bash",       ...},
    {"name": "read_file",  ...},
    {"name": "write_file", ...},
    {"name": "edit_file",  ...},
    {"name": "glob",       ...},
    # s05: one new entry
    {"name": "todo_write", "description": "Create and manage a task list ...",
     "input_schema": {
         "type": "object",
         "properties": {
             "todos": {
                 "type": "array",
                 "items": {
                     "type": "object",
                     "properties": {
                         "content": {"type": "string"},
                         "status": {"type": "string", "enum": ["pending", "in_progress", "completed"]},
                     },
                 },
             },
         },
     },
    },
]

TOOL_HANDLERS["todo_write"] = run_todo_write
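To make the dispatch step concrete, here is a minimal sketch of how a tool-use block might be routed through TOOL_HANDLERS. The dispatch function, the unknown-tool message, and the SimpleNamespace stand-in for a tool-use block are assumptions for illustration; the stub handler only mirrors the return value of the handler shown earlier.

```python
from types import SimpleNamespace


def run_todo_write(todos):
    # Stub that mirrors only the return value of the real handler.
    return f"Updated {len(todos)} tasks"


TOOL_HANDLERS = {"todo_write": run_todo_write}


def dispatch(block):
    """Look up the handler by tool name and call it with the tool input as keyword arguments."""
    handler = TOOL_HANDLERS.get(block.name)
    if handler is None:
        # Assumed behavior: report unknown tools as text instead of raising.
        return f"Unknown tool: {block.name}"
    return handler(**block.input)


# Hypothetical tool-use block; real blocks come from the model response.
block = SimpleNamespace(
    name="todo_write",
    input={"todos": [{"content": "rename files", "status": "pending"}]},
)
print(dispatch(block))  # prints "Updated 1 tasks"
```

Because lookup goes through the map, adding todo_write needs no change to the loop itself.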

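Nag reminder: if the model goes 3 consecutive rounds without calling todo_write, a reminder is injected automatically (a teaching-version mechanism; the CC source has no such fixed-round logic):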

if rounds_since_todo >= 3 and messages:
    messages.append({
        "role": "user",
        "content": "<reminder>Update your todos.</reminder>",
    })
    rounds_since_todo = 0
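The snippet shows only the injection check. A small simulation can show when the reminder fires over a run of rounds. How the counter is updated (reset when a round calls todo_write, incremented otherwise) is an assumption based on the description, and simulate is a hypothetical name.

```python
REMINDER = {"role": "user", "content": "<reminder>Update your todos.</reminder>"}


def simulate(tool_calls_per_round):
    """Replay rounds (each a list of tool names) and return the rounds before which a reminder was injected."""
    messages = [{"role": "user", "content": "task"}]
    rounds_since_todo = 0
    reminded_before = []
    for round_no, tool_names in enumerate(tool_calls_per_round, start=1):
        # Same check as the snippet, run before the (simulated) LLM call.
        if rounds_since_todo >= 3 and messages:
            messages.append(dict(REMINDER))
            rounds_since_todo = 0
            reminded_before.append(round_no)
        # Assumed update rule: reset on todo_write, otherwise count the round.
        if "todo_write" in tool_names:
            rounds_since_todo = 0
        else:
            rounds_since_todo += 1
    return reminded_before


rounds = [["todo_write"], ["bash"], ["read_file"], ["edit_file"],
          ["bash"], ["bash"], ["bash"], ["todo_write"]]
print(simulate(rounds))  # prints [5, 8]
```

Resetting the counter after each injection keeps the reminder from repeating on every subsequent round.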

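A typical flow after the agent receives a task: call todo_write to list every step (all pending) → start a step and mark it in_progress → finish it and mark it completed → pick the next pending step → continue. If 3 consecutive rounds pass without a todo_write call, the loop appends a reminder before the next LLM call.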
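The status lifecycle can be sketched as two small transitions over the list. In the teaching version the model performs these changes itself by resending the whole list through todo_write; start_next and finish_current are hypothetical helpers that only make the order of states visible.

```python
def start_next(todos):
    """Mark the first pending item in_progress and return its content (None if nothing is pending)."""
    for item in todos:
        if item["status"] == "pending":
            item["status"] = "in_progress"
            return item["content"]
    return None


def finish_current(todos):
    """Mark the in_progress item completed and return its content (None if nothing is in progress)."""
    for item in todos:
        if item["status"] == "in_progress":
            item["status"] = "completed"
            return item["content"]
    return None


todos = [
    {"content": "rename files", "status": "pending"},
    {"content": "run tests", "status": "pending"},
]
# Walk the list one step at a time, printing the statuses after each start.
while start_next(todos) is not None:
    print([t["status"] for t in todos])
    finish_current(todos)
print([t["status"] for t in todos])
```

At most one item is in_progress at any moment in this sketch, which keeps the list readable as a progress indicator.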

Key insight: todo_write adds no execution capability to the agent. What it adds is planning capability.


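Changes from s04

| Component | Before (s04) | After (s05) |
| --- | --- | --- |
| Tool count | 5 (bash, read, write, edit, glob) | 6 (+todo_write) |
| Planning | (none) | Stateful TODO list + nag reminder |
| SYSTEM prompt | Generic prompt | Adds "plan first, then execute" guidance |
| Loop | Unchanged | dispatch unchanged; adds the rounds_since_todo counter and reminder injection |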

Try It

cd learn-claude-code
python s05_todo_write/code.py

Try these prompts:

  1. Refactor s05_todo_write/example/hello.py: add type hints, docstrings, and a main guard (list the 3 steps first, then execute)
  2. Create a Python package under s05_todo_write/example/demo_pkg with __init__.py, utils.py, and tests/test_utils.py
  3. Review Python files under s05_todo_write/example and fix any style issues

What to watch for: Is the first tool call todo_write? How many steps does the TODO list contain? Do the statuses move from pending to in_progress / completed during execution?


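What's Next

The agent can now plan. But when a task is too large, such as "refactor the entire authentication module", a TODO list alone is not enough. Such a task is itself a collection of dozens of smaller tasks, and in a single conversation they get drowned by the context.

s06 Subagent → split the large task into subtasks and hand each one to an independent agent. Each agent has its own clean context, so they do not contaminate one another.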

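Deep Dive into the CC Source

CC has two task systems side by side (tasks.ts:133-139):

  • TodoWrite (V1): a simple list tool whose data lives in the in-memory AppState (TodoWriteTool.ts:65-103). The teaching version writes to .tasks/current_todos.json for observability; the real V1 does not write to disk.
  • Task System (V2, covered in s12): file persistence, a dependency graph, concurrency locks, and ownership.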

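The switch is controlled by isTodoV2Enabled(). In the current source, V2 is enabled by default in interactive sessions and V1 is the default in non-interactive (SDK) sessions; setting the CLAUDE_CODE_ENABLE_TASKS environment variable force-enables V2. Note that the source comment "Force-enable tasks in non-interactive mode" describes the purpose of the env var path, which differs in meaning from the return value of the default branch; keep the two apart when reading.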
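The decision described above can be paraphrased in Python as a rough sketch. This is not the CC implementation: the function signature, the interactive parameter, and the set of values treated as "set" for the environment variable are all assumptions; only the variable name CLAUDE_CODE_ENABLE_TASKS and the two defaults come from the text.

```python
import os


def is_todo_v2_enabled(interactive, env=None):
    """Paraphrase of the described switch: env var forces V2, otherwise follow the session type."""
    env = os.environ if env is None else env
    raw = env.get("CLAUDE_CODE_ENABLE_TASKS", "")
    # Assumption: these values count as "set"; the real parsing may differ.
    if raw.strip().lower() in ("1", "true", "yes", "on"):
        return True
    # Default branch: V2 (Task) when interactive, V1 (TodoWrite) otherwise.
    return bool(interactive)


print(is_todo_v2_enabled(True, {}))
print(is_todo_v2_enabled(False, {}))
print(is_todo_v2_enabled(False, {"CLAUDE_CODE_ENABLE_TASKS": "1"}))
```

Passing env explicitly keeps the sketch testable without touching the real process environment.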

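The teaching version omits the activeForm field found in the real source (utils/todo/types.ts:8-15). CC uses it to show "what is being done" in the UI spinner; the teaching version only prints to the terminal and does not need it.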

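The nag reminder in the teaching version (inject a reminder after 3 rounds without an update) is a teaching device. The CC source has no fixed "3 rounds" logic; the closest equivalent is in TodoWriteTool.ts:72-107, which appends a verification nudge when 3 or more todos are all completed but none of them is a verification item.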
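The verification nudge condition can be sketched in Python as follows. This is an illustration of the described rule, not a port of TodoWriteTool.ts: the function name is hypothetical, "3 or more" is read as an inclusive threshold, and the substring match used to detect a verification item is an assumption.

```python
def needs_verification_nudge(todos):
    """True when there are at least 3 todos, all completed, and none looks like a verification step."""
    if len(todos) < 3:
        return False
    if any(t["status"] != "completed" for t in todos):
        return False
    # Assumption: a verification item is detected by a simple substring match.
    return not any("verif" in t["content"].lower() for t in todos)


done = [
    {"content": "rename files", "status": "completed"},
    {"content": "update imports", "status": "completed"},
    {"content": "fix tests", "status": "completed"},
]
print(needs_verification_nudge(done))
print(needs_verification_nudge(done + [{"content": "Verify the build", "status": "completed"}]))
```

Unlike the fixed-round reminder, this check depends only on the content of the list, not on how many rounds have passed.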

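Core additions of the Task System over TodoWrite:

  • File persistence (tasks/{taskListId}/{taskId}.json under the Claude config directory) instead of an in-memory list
  • A blockedBy dependency graph instead of a flat list
  • Concurrency safety via proper-lockfile instead of no locking
  • Four separate tools (Create/Get/Update/List) instead of one
  • TaskCreated / TaskCompleted hooks (TaskCreateTool.ts:80-129, TaskUpdateTool.ts:231-260) for integration with external systems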
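To show what a blockedBy graph adds over a flat list, here is a minimal in-memory sketch of a readiness check. Only the blockedBy field name comes from the list above; the can_start name, the status values, the dict layout, and the choice to treat a missing dependency as still blocking are assumptions for illustration.

```python
def can_start(task_id, tasks):
    """A pending task may start only when every task it is blocked by is completed."""
    task = tasks[task_id]
    if task["status"] != "pending":
        return False
    for dep_id in task.get("blockedBy", []):
        dep = tasks.get(dep_id)
        # Assumption: an unknown dependency blocks instead of raising an error.
        if dep is None or dep["status"] != "completed":
            return False
    return True


tasks = {
    "1": {"status": "completed", "blockedBy": []},
    "2": {"status": "pending", "blockedBy": ["1"]},
    "3": {"status": "pending", "blockedBy": ["2"]},
    "4": {"status": "pending", "blockedBy": ["99"]},
}
print([tid for tid in tasks if can_start(tid, tasks)])  # prints ['2']
```

A flat TODO list cannot express that task 3 must wait for task 2; the dependency field makes that ordering explicit.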