mirror of
https://github.com/shareAI-lab/analysis_claude_code.git
synced 2026-06-20 20:23:36 +08:00
* feat: s01-s14 docs quality overhaul — tool pipeline, single-agent, knowledge & resilience Rewrite code.py and README (zh/en/ja) for s01-s14, each chapter building incrementally on the previous. Key fixes across chapters: - s01-s04: agent loop, tool dispatch, permission pipeline, hooks - s05-s08: todo write, subagent, skill loading, context compact - s09-s11: memory system, system prompt assembly, error recovery - s12-s14: task graph, background tasks, cron scheduler All chapters CC source-verified. Code inherits fixes forward (PROMPT_SECTIONS, json.dumps cache, real-state context, can_start dep protection, etc.). * feat: s15-s19 docs quality overhaul — multi-agent platform: teams, protocols, autonomy, worktree, MCP tools Rewrite code.py and README (zh/en/ja) for s15-s19, the multi-agent platform chapters. Each chapter inherits all previous fixes and adds one mechanism: - s15: agent teams (TeamCreate, teammate threads, shared task list) - s16: team protocols (plan approval, shutdown handshake, consume_inbox) - s17: autonomous agents (idle polling, auto-claim, consume_lead_inbox) - s18: worktree isolation (git worktree, bind_task, cwd switching, safety) - s19: MCP tools (MCPClient, normalize_mcp_name, assemble_tool_pool, no cache) All appendix source code references verified against CC source. Config priority corrected: claude.ai < plugin < user < project < local. * fix: 5 regressions across s05-s19 — glob safety, todo validation, memory extraction, protocol types, dep crash - s05-s09: glob results now filter with is_relative_to(WORKDIR) (inherited from s02) - s06-s08: todo_write validates content/status required fields (inherited from s05) - s09: extract_memories uses pre-compression snapshot instead of compacted messages - s16: submit_plan docstring clarifies protocol-only (not code-level gate) - s17-s19: match_response restores type mismatch validation (from s16) - s17-s19: claim_task deps list handles missing dep files without crashing * fix: s12 Todo V2 logic reversal, s14/s15 cron range validation, s18/s19 worktree name validation - s12 README (zh/en/ja): fix Todo V2 direction — interactive defaults to Task, non-interactive/SDK defaults to TodoWrite. Fix env var name to CLAUDE_CODE_ENABLE_TASKS (not TODO_V2). - s14/s15: add _validate_cron_field with per-field range checks (minute 0-59, hour 0-23, dom 1-31, month 1-12, dow 0-6), step > 0, range lo <= hi. Replace old try/except validation that only caught exceptions. - s18/s19: add validate_worktree_name() to remove_worktree and keep_worktree, not just create_worktree. * fix: align s16-s19 teaching tool consistency * fix pr265 chapter diagrams * Add comprehensive s20 harness chapter * Fix chapter smoke test regressions * Clarify README tutorial track transition --------- Co-authored-by: Haoran <bill-billion@outlook.com>
208 lines
7.5 KiB
Markdown
208 lines
7.5 KiB
Markdown
# s01: Agent Loop — 一个循环就够了
|
||
|
||
[中文](README.md) · [English](README.en.md) · [日本語](README.ja.md)
|
||
|
||
`s01` → [s02](../s02_tool_use/) → s03 → s04 → ... → s20
|
||
> *"One loop & Bash is all you need"* — 一个工具 + 一个循环 = 一个 Agent。
|
||
>
|
||
> **Harness 层**: 循环 — 模型与真实世界的第一道连接。
|
||
|
||
---
|
||
|
||
## 问题
|
||
|
||
你提出了一个问题给大模型:“帮我读取下我的目录下有哪些文件,并且执行XXX.py”。
|
||
|
||
模型能输出一条 bash 命令,但输出完了就停了,它不会自己跑,也不会看到结果后继续推理。
|
||
|
||
你可以手动跑一遍,把输出粘贴回对话框,让它接着干。下一个命令出来,你再跑一遍、再贴回去。
|
||
|
||
每一个来回,你都在做中间层。而把它自动化,就是这一章要做的事。
|
||
|
||
---
|
||
|
||
## 解决方案
|
||
|
||

|
||
|
||
一个 `while True` 循环,模型调用工具就继续,不调用就停。整个过程只有两个信号:
|
||
|
||
| 信号 | 含义 | 循环动作 |
|
||
|------|------|---------|
|
||
| `stop_reason == "tool_use"` | 模型举手说"我要用工具" | 执行 → 结果喂回去 → 继续 |
|
||
| `stop_reason != "tool_use"` | 模型说"我做完了" | 退出循环 |
|
||
|
||
---
|
||
|
||
## 工作原理
|
||
|
||
将这个过程翻译成代码。分步来看:
|
||
|
||
**第 1 步**:把用户的问题作为第一条消息。
|
||
|
||
```python
|
||
messages = [{"role": "user", "content": query}]
|
||
```
|
||
|
||
**第 2 步**:将消息和工具定义一起发给 LLM。
|
||
|
||
```python
|
||
response = client.messages.create(
|
||
model=MODEL, system=SYSTEM, messages=messages,
|
||
tools=TOOLS, max_tokens=8000,
|
||
)
|
||
```
|
||
|
||
**第 3 步**:追加模型回答,检查它是否调了工具。没调 → 结束。
|
||
|
||
```python
|
||
messages.append({"role": "assistant", "content": response.content})
|
||
if response.stop_reason != "tool_use":
|
||
return
|
||
```
|
||
|
||
**第 4 步**:执行模型要求的工具,收集结果。
|
||
|
||
```python
|
||
results = []
|
||
for block in response.content:
|
||
if block.type == "tool_use":
|
||
output = run_bash(block.input["command"])
|
||
results.append({
|
||
"type": "tool_result",
|
||
"tool_use_id": block.id,
|
||
"content": output,
|
||
})
|
||
```
|
||
|
||
**第 5 步**:把工具结果作为新消息追加,回到第 2 步。
|
||
|
||
```python
|
||
messages.append({"role": "user", "content": results})
|
||
```
|
||
|
||
组装为一个完整函数:
|
||
|
||
```python
|
||
def agent_loop(messages):
|
||
while True:
|
||
response = client.messages.create(
|
||
model=MODEL, system=SYSTEM, messages=messages,
|
||
tools=TOOLS, max_tokens=8000,
|
||
)
|
||
messages.append({"role": "assistant", "content": response.content})
|
||
|
||
if response.stop_reason != "tool_use":
|
||
return
|
||
|
||
results = []
|
||
for block in response.content:
|
||
if block.type == "tool_use":
|
||
output = run_bash(block.input["command"])
|
||
results.append({
|
||
"type": "tool_result",
|
||
"tool_use_id": block.id,
|
||
"content": output,
|
||
})
|
||
messages.append({"role": "user", "content": results})
|
||
```
|
||
|
||
不到 30 行,这就是最小可运行的 agent harness 内核。它不是智能本身,而是让模型能持续行动的最小运行框架,模型负责决策(要不要调工具、调哪个),harness 负责执行(调了就跑、结果喂回去)。后面 18 个章节都在这个循环上叠加机制,循环本身始终不变。
|
||
|
||
---
|
||
|
||
## 试一下
|
||
|
||
> **教学 demo 提示**:代码会执行模型生成的 shell 命令。建议在一个临时测试目录中运行,避免影响你的项目文件。s03 会讲真正的权限系统。
|
||
|
||
**准备**(首次运行):
|
||
|
||
```sh
|
||
pip install -r requirements.txt
|
||
cp .env.example .env
|
||
# 编辑 .env,填入 ANTHROPIC_API_KEY 和 MODEL_ID
|
||
```
|
||
|
||
**运行**:
|
||
|
||
```sh
|
||
python s01_agent_loop/code.py
|
||
```
|
||
|
||
试试这些 prompt:
|
||
|
||
1. `Create a file called hello.py that prints "Hello, World!"`
|
||
2. `List all Python files in this directory`
|
||
3. `What is the current git branch?`
|
||
|
||
观察重点:模型什么时候调用工具(循环继续),什么时候不调用(循环结束)?
|
||
|
||
---
|
||
|
||
## 接下来
|
||
|
||
现在模型手里只有 bash 一个工具,读文件要 `cat`,写文件要 `echo ... >`,找个文件要 `find`,又丑又容易出错。
|
||
|
||
s02 Tool Use → 给它 5 个真正的工具,会发生什么?模型会不会一次调用多个工具?几个工具同时跑会不会互相踩?
|
||
|
||
<details>
|
||
<summary>深入 CC 源码</summary>
|
||
|
||
> 以下内容基于 CC 源码 `src/query.ts`(1729 行)的核查。核心差异就两个:CC 不看 `stop_reason` 字段而是检查内容里有没有 tool_use 块(因为流式响应中 stop_reason 不可靠);CC 有更多的退出路径和恢复策略做生产级保护。
|
||
|
||
**教学版的 30 行 `while True` 就是 CC 1729 行的核心。** 下面每一项都是在这个核心上叠加的保护机制。
|
||
|
||
<details>
|
||
<summary>一、循环结构差异</summary>
|
||
|
||
教学版检查 `response.stop_reason`。CC 不把它作为循环继续的唯一依据——流式响应中 `stop_reason` 可能还没更新但内容里已经有 `tool_use` 块了。CC 用 `needsFollowUp` 标志:接收到流式消息时(`query.ts:830-834`),只要检测到 `tool_use` 块就设为 `true`;`QueryEngine.ts` 会从 `message_delta` 捕获真实 `stop_reason` 用于其他逻辑,但 query loop 本身靠 `needsFollowUp` 决定是否继续。
|
||
|
||
```typescript
|
||
// query.ts:554-558
|
||
// stop_reason === 'tool_use' is unreliable.
|
||
// Set during streaming whenever a tool_use block arrives.
|
||
let needsFollowUp = false
|
||
```
|
||
|
||
</details>
|
||
|
||
<details>
|
||
<summary>二、State 对象 10 字段(教学版只用 messages)</summary>
|
||
|
||
| # | 字段 | 用途 | 对应章节 |
|
||
|---|------|------|---------|
|
||
| 1 | `messages` | 当前迭代的消息数组 | s01 |
|
||
| 2 | `toolUseContext` | 工具、信号、权限上下文 | s02 |
|
||
| 3 | `autoCompactTracking` | 压缩状态追踪 | s08 |
|
||
| 4 | `maxOutputTokensRecoveryCount` | token 恢复尝试次数(上限 3) | s11 |
|
||
| 5 | `hasAttemptedReactiveCompact` | 本轮是否已尝试响应式压缩 | s08 |
|
||
| 6 | `maxOutputTokensOverride` | 8K→64K 的升级覆盖 | s11 |
|
||
| 7 | `pendingToolUseSummary` | 后台 Haiku 生成的 tool use 摘要 | s08 |
|
||
| 8 | `stopHookActive` | 停止钩子是否产生阻塞错误 | s04 |
|
||
| 9 | `turnCount` | 轮次计数(maxTurns 检查) | s01 |
|
||
| 10 | `transition` | 上一次继续原因 | s11 |
|
||
|
||
> 注:`taskBudgetRemaining`(`query.ts:291`)是 loop-local 局部变量,不在 State 上。源码注释明确写了 "Loop-local (not on State)"。
|
||
|
||
</details>
|
||
|
||
<details>
|
||
<summary>三、多条退出和继续路径</summary>
|
||
|
||
教学版只有 1 条退出路径(模型不调工具就结束)。生产版有多条退出和继续路径,覆盖 blocking limit、prompt too long、model error、abort、hook stop、max turns、token budget continuation、reactive compact retry 等场景。每种场景都有对应的恢复或退出策略。
|
||
|
||
</details>
|
||
|
||
<details>
|
||
<summary>四、流式工具执行和 QueryEngine</summary>
|
||
|
||
CC 的 `StreamingToolExecutor`(`query.ts:561`)让工具在模型还在生成时就开始并行执行(根据工具是否 concurrency-safe 决定并发或独占)。`QueryEngine.ts` 额外加了费用超限、结构化输出验证失败等保护。教学版不实现这些——目标是概念清晰,不是性能极致。
|
||
|
||
</details>
|
||
|
||
**一句话**:1729 行的 query.ts 核心就是 30 行 `while True`。所有复杂字段和退出路径都是保护机制。先理解核心循环,后面的一切自然展开。
|
||
|
||
</details>
|
||
|
||
<!-- translation-sync: zh@v1, en@v0, ja@v0 -->
|