mirror of
https://github.com/shareAI-lab/analysis_claude_code.git
synced 2026-06-21 04:33:36 +08:00
# s05: TodoWrite — An Agent Without a Plan Drifts Off Course

[中文](README.md) · [English](README.en.md) · [日本語](README.ja.md)

s01 → s02 → s03 → s04 → `s05` → [s06](../s06_subagent/) → s07 → ... → s20

> *"An agent without a plan goes wherever the wind blows."* List the steps first, then execute: a complex task is then less likely to lose steps along the way.
>
> **Harness Layer**: Planning. Let the Agent think before it acts.

---

## The Problem

Give the Agent a complex task: "Rename all Python files to snake_case, run tests, and fix failures."

The Agent starts working: it renames 3 files, runs the tests, finds 2 failures, and starts fixing them. While fixing, it forgets that the original goal was "rename to snake_case"; the test failures have consumed all its attention.

The longer the conversation, the worse it gets: tool results keep filling the context, diluting the system prompt's influence. Take a 10-step refactoring: after steps 1-3, the Agent starts improvising because steps 4-10 have been pushed out of its attention.

---

## The Solution

The minimal hook structure from the previous chapter is preserved; this chapter focuses on the new `todo_write` tool and the reminder mechanism. `todo_write` does no actual work: it can't read files or run commands. It simply lets the Agent organize its thoughts before diving in.

The dispatch mechanism is unchanged; the new tool is still routed through `TOOL_HANDLERS[block.name]`. To demonstrate the todo reminder, however, a counter is added to the loop: after 3 consecutive rounds without a `todo_write` call, a reminder is injected.

---

## How It Works

**The `todo_write` tool** accepts a list of todos with statuses, persists it to `.tasks/current_todos.json` (the teaching version writes to disk for observability), and displays progress in the terminal:

```python
import json

# TASKS_DIR is assumed to be a pathlib.Path pointing at the .tasks/ directory.
def run_todo_write(todos: list) -> str:
    # Persist the full list so progress can be inspected from outside the session.
    tasks_file = TASKS_DIR / "current_todos.json"
    tasks_file.write_text(json.dumps(todos, indent=2, ensure_ascii=False))

    # Render one line per todo, prefixed with an icon for its status.
    lines = ["\n## Current Tasks"]
    for t in todos:
        icon = {"pending": " ", "in_progress": "▸", "completed": "✓"}[t["status"]]
        lines.append(f" [{icon}] {t['content']}")
    print("\n".join(lines))
    return f"Updated {len(todos)} tasks"
```
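
As written, `run_todo_write` raises `KeyError` if a todo lacks `content` or carries a status outside the three known values. One way to guard against that is sketched below. This is an illustration, not the chapter's `code.py`: `validate_todos`, `run_todo_write_checked`, and the `tasks_dir` parameter are hypothetical names, and returning the problem as a string (so the model can correct its next call) is an assumed convention.

```python
import json
import tempfile
from pathlib import Path

VALID_STATUSES = ("pending", "in_progress", "completed")
ICONS = {"pending": " ", "in_progress": "▸", "completed": "✓"}


def validate_todos(todos: list) -> list[str]:
    """Return a list of problems; an empty list means the input is usable."""
    problems = []
    for i, t in enumerate(todos):
        if not isinstance(t, dict):
            problems.append(f"todo {i}: expected an object")
            continue
        if not isinstance(t.get("content"), str) or not t["content"].strip():
            problems.append(f"todo {i}: 'content' must be a non-empty string")
        if t.get("status") not in VALID_STATUSES:
            problems.append(f"todo {i}: 'status' must be one of {VALID_STATUSES}")
    return problems


def run_todo_write_checked(todos: list, tasks_dir: Path) -> str:
    """Validate first, then persist and render as run_todo_write does."""
    problems = validate_todos(todos)
    if problems:
        # Hypothetical convention: hand the error back as the tool result.
        return "Error: " + "; ".join(problems)
    tasks_dir.mkdir(parents=True, exist_ok=True)
    (tasks_dir / "current_todos.json").write_text(
        json.dumps(todos, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    for t in todos:
        print(f" [{ICONS[t['status']]}] {t['content']}")
    return f"Updated {len(todos)} tasks"


if __name__ == "__main__":
    # A temporary directory stands in for the real .tasks/ directory.
    with tempfile.TemporaryDirectory() as tmp:
        print(run_todo_write_checked([{"content": "Rename files", "status": "pending"}], Path(tmp)))
        print(run_todo_write_checked([{"content": "Run tests", "status": "done"}], Path(tmp)))
```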

The tool definition joins the other five in `TOOLS`, and its handler is registered in the `TOOL_HANDLERS` dispatch map:

```python
TOOLS = [
    {"name": "bash", ...},
    {"name": "read_file", ...},
    {"name": "write_file", ...},
    {"name": "edit_file", ...},
    {"name": "glob", ...},
    # s05: new entry
    {"name": "todo_write", "description": "Create and manage a task list ...",
     "input_schema": {
         "type": "object",
         "properties": {
             "todos": {
                 "type": "array",
                 "items": {
                     "type": "object",
                     "properties": {
                         "content": {"type": "string"},
                         "status": {"type": "string", "enum": ["pending", "in_progress", "completed"]},
                     },
                 },
             },
         },
     },
    },
]

TOOL_HANDLERS["todo_write"] = run_todo_write
```
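
To make the routing concrete, here is a minimal sketch of how a single tool call could reach the handler registered above. Several things are assumptions for illustration: the `dispatch` helper, the `SimpleNamespace` stand-in for a tool-use block, and the convention that the block's input dict is passed as keyword arguments. It also uses `.get()` so that an unknown tool name yields an error string, whereas the chapter indexes `TOOL_HANDLERS[block.name]` directly.

```python
from types import SimpleNamespace


def run_todo_write(todos: list) -> str:
    # Stand-in for the real handler; only the return value matters here.
    return f"Updated {len(todos)} tasks"


TOOL_HANDLERS = {"todo_write": run_todo_write}


def dispatch(block) -> str:
    """Route one tool-use block to its handler by name."""
    handler = TOOL_HANDLERS.get(block.name)
    if handler is None:
        return f"Error: unknown tool '{block.name}'"
    # Assumption: the input dict maps directly onto the handler's keyword arguments.
    return handler(**block.input)


# A hypothetical block shaped like the input_schema above.
block = SimpleNamespace(
    name="todo_write",
    input={"todos": [
        {"content": "Rename files to snake_case", "status": "in_progress"},
        {"content": "Run tests", "status": "pending"},
    ]},
)
result = dispatch(block)
print(result)  # → Updated 2 tasks
```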

**Nag reminder**: when the model hasn't called `todo_write` for 3 consecutive rounds, a reminder is injected automatically (a teaching mechanism; the CC source has no fixed round-count logic):

```python
if rounds_since_todo >= 3 and messages:
    # Nudge the model with a user message, then restart the count.
    messages.append({
        "role": "user",
        "content": "<reminder>Update your todos.</reminder>",
    })
    rounds_since_todo = 0
```
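
The snippet above shows only the injection itself. The sketch below shows one plausible way the counter could be maintained across rounds, with the LLM call replaced by a scripted list of tool names so it runs without an API key. The `simulate` function, the `NAG_AFTER_ROUNDS` constant, and the exact points where the counter is reset or incremented are assumptions, not a copy of `code.py`.

```python
REMINDER = {"role": "user", "content": "<reminder>Update your todos.</reminder>"}
NAG_AFTER_ROUNDS = 3  # threshold used in this chapter


def simulate(tool_calls_per_round: list[list[str]]) -> list[int]:
    """Return the 0-based round indexes before which a reminder was injected.

    Each inner list holds the tool names the (scripted) model calls in that round.
    """
    messages = [{"role": "user", "content": "Rename all files, then run tests."}]
    rounds_since_todo = 0
    reminded_before = []
    for round_no, tool_names in enumerate(tool_calls_per_round):
        # Inject the reminder before this round's (simulated) LLM call.
        if rounds_since_todo >= NAG_AFTER_ROUNDS and messages:
            messages.append(dict(REMINDER))
            rounds_since_todo = 0
            reminded_before.append(round_no)
        # Simulated model turn: record which tools were called.
        messages.append({"role": "assistant", "content": ", ".join(tool_names)})
        # Assumed bookkeeping: a todo_write call resets the count, anything else adds one.
        if "todo_write" in tool_names:
            rounds_since_todo = 0
        else:
            rounds_since_todo += 1
    return reminded_before


script = [["todo_write"], ["bash"], ["edit_file"], ["bash"], ["bash"], ["todo_write"], ["bash"]]
print(simulate(script))  # → [4]
```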

Typical flow when the Agent receives a task: first call `todo_write` to list all steps (all `pending`) → pick one step and set it to `in_progress` → complete it and set it to `completed` → move to the next `pending` step → repeat. After 3 rounds without a `todo_write` call, the loop appends a reminder before the next LLM call.

**Key insight**: `todo_write` doesn't give the Agent any additional **execution capability**. What it adds is **planning capability**.

---

## Changes from s04

| Component | Before (s04) | After (s05) |
|-----------|-------------|-------------|
| Tool count | 5 (bash, read, write, edit, glob) | 6 (+todo_write) |
| Planning | None | Stateful TODO list + nag reminder |
| SYSTEM prompt | Generic prompt | Added "plan before executing" guidance |
| Loop | Unchanged | Dispatch unchanged; added `rounds_since_todo` counter and reminder injection |

---

## Try It

```sh
cd learn-claude-code
python s05_todo_write/code.py
```
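
Because the teaching version persists the list to `.tasks/current_todos.json`, progress can also be checked from a second terminal while the agent runs. The helper below is a hypothetical convenience script, not part of the repository. The function names are illustrative, and the relative path assumes the script is started from the directory in which the agent creates `.tasks/`.

```python
import json
from collections import Counter
from pathlib import Path


def summarize(todos: list[dict]) -> str:
    """Count todos per status and name the step currently in progress."""
    counts = Counter(t.get("status", "unknown") for t in todos)
    active = [t.get("content", "") for t in todos if t.get("status") == "in_progress"]
    summary = ", ".join(
        f"{counts.get(s, 0)} {s}" for s in ("pending", "in_progress", "completed")
    )
    return f"{summary} | active: {active[0] if active else 'none'}"


def summarize_file(path: Path) -> str:
    """Read the persisted list; report a missing file instead of raising."""
    if not path.exists():
        return f"{path} not found (has the agent called todo_write yet?)"
    return summarize(json.loads(path.read_text(encoding="utf-8")))


if __name__ == "__main__":
    # Assumed location, relative to the agent's working directory.
    print(summarize_file(Path(".tasks") / "current_todos.json"))
```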

Try these prompts:

1. `Refactor s05_todo_write/example/hello.py: add type hints, docstrings, and a main guard` (should list 3 steps first, then execute)
2. `Create a Python package under s05_todo_write/example/demo_pkg with __init__.py, utils.py, and tests/test_utils.py`
3. `Review Python files under s05_todo_write/example and fix any style issues`

What to watch for: Was the first tool call `todo_write`? How many TODO steps were listed? Did statuses move from `pending` to `in_progress` / `completed` during execution?

---

## What's Next

The Agent can plan now. But if a task is too large, say "refactor the entire auth module", a TODO list alone isn't enough. Such a task is itself a collection of dozens of subtasks that would drown in a single conversation's context.

→ s06 Subagent: Break large tasks into subtasks, each handled by an independent Agent with its own clean context and no cross-contamination.

<details>
<summary>Dive into CC Source Code</summary>

CC has two coexisting task systems (`tasks.ts:133-139`):

- **TodoWrite (V1)**: A simple list tool whose data is kept in the in-memory AppState (`TodoWriteTool.ts:65-103`). The teaching version writes to `.tasks/current_todos.json` for observability; the real V1 does not write to disk.
- **Task System (V2 = s12)**: File-persisted, with a dependency graph, concurrency locks, and ownership.

The switch is controlled by `isTodoV2Enabled()`. In the current source, V2 is the default in interactive sessions and V1 in non-interactive (SDK) sessions; setting `CLAUDE_CODE_ENABLE_TASKS` forces V2 regardless. Note that the source comment "Force-enable tasks in non-interactive mode" describes the purpose of the env var path, not what the default branch returns.

The teaching version omits the `activeForm` field from the real source (`utils/todo/types.ts:8-15`). CC uses it for the UI spinner to show "what's being done"; the teaching version only has terminal output and doesn't need this field.

The teaching version's nag reminder (injection after 3 rounds without an update) is an educational mechanism. The CC source has no fixed "3 rounds" logic; the closest is `TodoWriteTool.ts:72-107`, which appends a verification nudge when 3+ todos are all completed without a verification item.

Core increments of the Task System over TodoWrite:

- File persistence (Claude config directory `tasks/{taskListId}/{taskId}.json`) instead of an in-memory list
- `blockedBy` dependency graph instead of a flat list
- `proper-lockfile` concurrency safety instead of no locking
- Four separate tools (Create/Get/Update/List) instead of one
- TaskCreated / TaskCompleted hooks (`TaskCreateTool.ts:80-129`, `TaskUpdateTool.ts:231-260`) for external system integration
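
To illustrate what a `blockedBy` graph adds over a flat list, here is a small sketch in the teaching language (Python). It is not CC's implementation. Only the `blockedBy` field name comes from the notes above; the `subject` and `status` fields, the `startable` helper, and the rule that a task may start once all of its blockers are `completed` are assumptions made for the example.

```python
def startable(tasks: dict[str, dict]) -> list[str]:
    """Return ids of pending tasks whose blockers are all completed."""
    ready = []
    for task_id, task in tasks.items():
        if task["status"] != "pending":
            continue
        blockers = task.get("blockedBy", [])
        # A blocker that is missing from the map is treated as unresolved.
        if all(tasks.get(b, {}).get("status") == "completed" for b in blockers):
            ready.append(task_id)
    return ready


# Hypothetical task records, keyed by id.
tasks = {
    "1": {"subject": "Rename files", "status": "completed", "blockedBy": []},
    "2": {"subject": "Run tests", "status": "pending", "blockedBy": ["1"]},
    "3": {"subject": "Fix failures", "status": "pending", "blockedBy": ["2"]},
}
print(startable(tasks))  # → ['2']
```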

</details>

<!-- translation-sync: zh@v1, en@v1, ja@v1 -->