Mirror of https://github.com/shareAI-lab/analysis_claude_code.git, synced 2026-05-06 16:26:16 +08:00

feat: build an AI agent from 0 to 1 -- 11 progressive sessions

- 11 sessions from basic agent loop to autonomous teams
- Python MVP implementations for each session
- Mental-model-first docs in en/zh/ja
- Interactive web platform with step-through visualizations
- Incremental architecture: each session adds one mechanism
docs/zh/s01-the-agent-loop.md (new file, 132 lines)
# s01: The Agent Loop

> The entire secret of an AI coding agent is a while loop: feed tool results back to the model until the model decides to stop.

## Problem

Why can't a language model answer a programming question directly? Because programming requires **interacting with the real world**. The model needs to read files, run tests, inspect errors, and iterate. A single prompt-response exchange can do none of that.

Without an agent loop, you would have to copy-paste outputs back into the model by hand; the user becomes the loop. The agent loop automates the process: call the model, execute the tools it asks for, feed the results back, and repeat -- until the model says "I'm done."

Consider a simple task: "Create a Python file that prints hello." The model needs to (1) decide to write a file, (2) write it, and (3) verify it works. That is at least three tool calls. Without a loop, every one of them requires human intervention.

## Solution

```
+----------+      +-------+      +---------+
|  User    | ---> |  LLM  | ---> |  Tool   |
|  prompt  |      |       |      | execute |
+----------+      +---+---+      +----+----+
                      ^               |
                      |  tool_result  |
                      +---------------+
                        (loop continues)

The loop terminates when stop_reason != "tool_use".
That single condition is the entire control flow.
```

## How it works

1. The user provides a prompt, which becomes the first message.

```python
messages.append({"role": "user", "content": query})
```

2. The message array is sent to the LLM together with the tool definitions.

```python
response = client.messages.create(
    model=MODEL, system=SYSTEM, messages=messages,
    tools=TOOLS, max_tokens=8000,
)
```

3. The assistant's response is appended to the message list.

```python
messages.append({"role": "assistant", "content": response.content})
```

4. Check stop_reason. If the model did not call a tool, the loop ends. This is the only exit condition.

```python
if response.stop_reason != "tool_use":
    return
```

5. For each tool_use block in the response, execute the tool (bash, in this session) and collect the results.

```python
results = []
for block in response.content:
    if block.type == "tool_use":
        output = run_bash(block.input["command"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": output,
        })
```

6. The results are appended as a user message, and the loop continues.

```python
messages.append({"role": "user", "content": results})
```

## Core code

The minimal viable agent -- the entire pattern in under 30 lines (from `agents/s01_agent_loop.py`, lines 66-86):

```python
def agent_loop(messages: list):
    while True:
        response = client.messages.create(
            model=MODEL, system=SYSTEM, messages=messages,
            tools=TOOLS, max_tokens=8000,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = run_bash(block.input["command"])
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": output,
                })
        messages.append({"role": "user", "content": results})
```
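The loop delegates execution to `run_bash`, which the excerpt does not show. A minimal sketch of what such a helper might look like (the timeout and truncation limit are assumptions, not taken from the script):

```python
import subprocess

def run_bash(command: str, timeout: int = 60) -> str:
    """Run a shell command, returning combined stdout/stderr, truncated."""
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True,
            text=True, timeout=timeout,  # timeout value is an assumption
        )
        output = (proc.stdout + proc.stderr).strip()
        return output[:50000] or "(no output)"
    except subprocess.TimeoutExpired:
        return f"Error: command timed out after {timeout}s"
```

Returning errors as strings rather than raising matters here: a failure lands in the tool_result, where the model can read it and try a different command.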

## What changed

This is session 1 -- the starting point. There are no prerequisites.

| Component    | Before | After                       |
|--------------|--------|-----------------------------|
| Agent loop   | (none) | `while True` + stop_reason  |
| Tools        | (none) | `bash` (single tool)        |
| Messages     | (none) | accumulating message list   |
| Control flow | (none) | `stop_reason != "tool_use"` |

## Design rationale

This loop is the universal foundation of every LLM-based agent. Production implementations add error handling, token counting, streaming, and retry logic, but the basic structure is unchanged. The simplicity is the point: a single exit condition (`stop_reason != "tool_use"`) governs the entire flow. Everything else in this course -- tools, planning, compaction, teams -- layers on top of this loop without modifying it. Understand this loop and you understand every agent.

## Try it

```sh
cd learn-claude-code
python agents/s01_agent_loop.py
```

Prompts to try:

1. `Create a file called hello.py that prints "Hello, World!"`
2. `List all Python files in this directory`
3. `What is the current git branch?`
4. `Create a directory called test_output and write 3 files in it`
docs/zh/s02-tool-use.md (new file, 141 lines)
# s02: Tools

> A dispatch map routes tool calls to handler functions -- the loop itself needs no changes at all.

## Problem

With only `bash`, everything the agent does goes through the shell: reading files, writing files, editing files. It works, but it is brittle. `cat` output gets truncated unpredictably. `sed` substitutions break on special characters. The model wastes tokens constructing shell pipelines where a direct function call would be far simpler.

More importantly, bash is a security surface. Every bash call can do anything the shell can do. With dedicated tools like `read_file` and `write_file`, you can enforce path sandboxing and block dangerous patterns at the tool level, instead of hoping the model polices itself.

The key insight: adding tools requires no change to the loop. The s01 loop stays identical. You add entries to the tools array, write handler functions, and wire them together through a dispatch map.

## Solution

```
+----------+      +-------+      +------------------+
|  User    | ---> |  LLM  | ---> |  Tool Dispatch   |
|  prompt  |      |       |      |  {               |
+----------+      +---+---+      |   bash:  run_bash|
                      ^          |   read:  run_read|
                      |          |   write: run_wr  |
                      +----------+   edit: run_edit |
                      tool_result|  }               |
                                 +------------------+

The dispatch map is a dict: {tool_name: handler_function}
One lookup replaces any if/elif chain.
```

## How it works

1. Define a handler function for each tool. Each takes keyword arguments matching the tool's input_schema and returns a string result.

```python
def run_read(path: str, limit: int | None = None) -> str:
    text = safe_path(path).read_text()
    lines = text.splitlines()
    if limit and limit < len(lines):
        lines = lines[:limit]
    return "\n".join(lines)[:50000]
```
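Each handler is advertised to the model through a matching entry in the TOOLS list. A sketch of what the `read_file` entry might look like (the description wording is an assumption; the shape follows the Anthropic tool-definition format):

```python
# Hypothetical schema entry for read_file; descriptions are illustrative.
TOOLS = [
    {
        "name": "read_file",
        "description": "Read a text file from the workspace, "
                       "optionally limited to the first N lines.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string",
                         "description": "Path relative to the workspace"},
                "limit": {"type": "integer",
                          "description": "Max lines to return"},
            },
            "required": ["path"],
        },
    },
]
```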

2. Build the dispatch map from tool names to handler functions.

```python
TOOL_HANDLERS = {
    "bash": lambda **kw: run_bash(kw["command"]),
    "read_file": lambda **kw: run_read(kw["path"], kw.get("limit")),
    "write_file": lambda **kw: run_write(kw["path"], kw["content"]),
    "edit_file": lambda **kw: run_edit(kw["path"], kw["old_text"],
                                       kw["new_text"]),
}
```

3. In the agent loop, look up the handler by name instead of hardcoding it.

```python
for block in response.content:
    if block.type == "tool_use":
        handler = TOOL_HANDLERS.get(block.name)
        output = handler(**block.input)
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": output,
        })
```

4. Path sandboxing keeps the model from escaping the workspace.

```python
def safe_path(p: str) -> Path:
    path = (WORKDIR / p).resolve()
    if not path.is_relative_to(WORKDIR):
        raise ValueError(f"Path escapes workspace: {p}")
    return path
```
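The sandbox can be exercised directly. A self-contained sketch (repeating `safe_path` so it runs standalone, and assuming WORKDIR is the current directory):

```python
from pathlib import Path

WORKDIR = Path.cwd().resolve()  # assumption: workspace root = current dir

def safe_path(p: str) -> Path:
    path = (WORKDIR / p).resolve()
    if not path.is_relative_to(WORKDIR):
        raise ValueError(f"Path escapes workspace: {p}")
    return path

safe_path("src/app.py")            # fine: stays inside the workspace
try:
    safe_path("../../etc/passwd")  # rejected: resolves outside WORKDIR
except ValueError as e:
    print(e)
```

Note that `..` segments are neutralized by `resolve()` before the containment check, so traversal tricks in the path string do not help the model escape.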

## Core code

The dispatch pattern (from `agents/s02_tool_use.py`, lines 93-129):

```python
TOOL_HANDLERS = {
    "bash": lambda **kw: run_bash(kw["command"]),
    "read_file": lambda **kw: run_read(kw["path"], kw.get("limit")),
    "write_file": lambda **kw: run_write(kw["path"], kw["content"]),
    "edit_file": lambda **kw: run_edit(kw["path"], kw["old_text"],
                                       kw["new_text"]),
}

def agent_loop(messages: list):
    while True:
        response = client.messages.create(
            model=MODEL, system=SYSTEM, messages=messages,
            tools=TOOLS, max_tokens=8000,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return
        results = []
        for block in response.content:
            if block.type == "tool_use":
                handler = TOOL_HANDLERS.get(block.name)
                output = handler(**block.input) if handler \
                    else f"Unknown tool: {block.name}"
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": output,
                })
        messages.append({"role": "user", "content": results})
```

## Changes since s01

| Component   | Before (s01)        | After (s02)                 |
|-------------|---------------------|-----------------------------|
| Tools       | 1 (bash only)       | 4 (bash, read, write, edit) |
| Dispatch    | hardcoded bash call | `TOOL_HANDLERS` dict        |
| Path safety | none                | `safe_path()` sandbox       |
| Agent loop  | unchanged           | unchanged                   |

## Design rationale

The dispatch-map pattern scales linearly -- adding a tool means adding one handler function and one schema entry. The loop never changes. This separation of concerns (loop vs. handlers) is how agent frameworks support dozens of tools without growing the control-flow complexity. The pattern also makes each handler independently testable, since handlers are plain functions with no coupling to the loop. Any agent that outgrows a dispatch map has a design problem, not a scaling problem.

## Try it

```sh
cd learn-claude-code
python agents/s02_tool_use.py
```

Prompts to try:

1. `Read the file requirements.txt`
2. `Create a file called greet.py with a greet(name) function`
3. `Edit greet.py to add a docstring to the function`
4. `Read greet.py to verify the edit worked`
5. `Run the greet function with bash: python -c "from greet import greet; greet('World')"`
docs/zh/s03-todo-write.md (new file, 157 lines)
# s03: TodoWrite

> A TodoManager lets the agent track its own progress, and a nag-reminder injection forces an update whenever it forgets.

## Problem

When an agent works through a multi-step task, it frequently loses track of what is done and what remains. Without an explicit plan, the model may repeat work, skip steps, or drift off course. The user also has no window into the agent's internal plan.

The problem is worse than it sounds. Long conversations make the model "drift": as the context window fills with tool results, the system prompt's influence fades. A 10-step refactor may get through steps 1-3 before the model starts improvising, because it has forgotten that steps 4-10 exist.

The fix is structured state: a TodoManager that the model writes to explicitly. The model creates a plan, marks items in_progress as it works on them, and completed when done. A nag-reminder mechanism injects a prompt whenever the model goes 3 or more turns without updating its todos.

Pedagogical simplification: the nag threshold is set to 3 turns here for classroom visibility. Production agents typically use a threshold closer to 10 turns to avoid over-reminding.

## Solution

```
+----------+      +-------+      +---------+
|  User    | ---> |  LLM  | ---> |  Tools  |
|  prompt  |      |       |      |  + todo |
+----------+      +---+---+      +----+----+
                      ^               |
                      |  tool_result  |
                      +---------------+
                              |
                  +-----------+-----------+
                  |  TodoManager state    |
                  |  [ ] task A           |
                  |  [>] task B  <- doing |
                  |  [x] task C           |
                  +-----------------------+
                              |
           if rounds_since_todo >= 3:
               inject <reminder> into tool_result
```

## How it works

1. The TodoManager validates and stores a list of items with statuses. Only one item may be `in_progress` at a time.

```python
class TodoManager:
    def __init__(self):
        self.items = []

    def update(self, items: list) -> str:
        validated = []
        in_progress_count = 0
        for item in items:
            status = item.get("status", "pending")
            if status == "in_progress":
                in_progress_count += 1
            validated.append({
                "id": item["id"],
                "text": item["text"],
                "status": status,
            })
        if in_progress_count > 1:
            raise ValueError("Only one task can be in_progress")
        self.items = validated
        return self.render()
```

2. The `todo` tool joins the dispatch map like any other tool.

```python
TOOL_HANDLERS = {
    "bash": lambda **kw: run_bash(kw["command"]),
    # ...other tools...
    "todo": lambda **kw: TODO.update(kw["items"]),
}
```

3. The nag reminder injects a `<reminder>` tag into the tool_result message when the model has gone 3 or more turns without calling `todo`.

```python
def agent_loop(messages: list):
    rounds_since_todo = 0
    while True:
        if rounds_since_todo >= 3 and messages:
            last = messages[-1]
            if (last["role"] == "user"
                    and isinstance(last.get("content"), list)):
                last["content"].insert(0, {
                    "type": "text",
                    "text": "<reminder>Update your todos.</reminder>",
                })
        # ... rest of loop ...
        rounds_since_todo = 0 if used_todo else rounds_since_todo + 1
```

4. The system prompt instructs the model to plan with todos.

```python
SYSTEM = f"""You are a coding agent at {WORKDIR}.
Use the todo tool to plan multi-step tasks.
Mark in_progress before starting, completed when done.
Prefer tools over prose."""
```

## Core code

TodoManager and nag injection (from `agents/s03_todo_write.py`, lines 51-85 and 158-187):

```python
class TodoManager:
    def update(self, items: list) -> str:
        validated = []
        in_progress_count = 0
        for item in items:
            status = item.get("status", "pending")
            if status == "in_progress":
                in_progress_count += 1
            validated.append({
                "id": item["id"],
                "text": item["text"],
                "status": status,
            })
        if in_progress_count > 1:
            raise ValueError("Only one in_progress")
        self.items = validated
        return self.render()

# In agent_loop:
if rounds_since_todo >= 3:
    last["content"].insert(0, {
        "type": "text",
        "text": "<reminder>Update your todos.</reminder>",
    })
```
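`update` returns `self.render()`, which the excerpt does not show. A minimal sketch of a renderer matching the `[ ]`/`[>]`/`[x]` markers in the diagram above (the exact output format is an assumption):

```python
STATUS_MARKS = {"pending": " ", "in_progress": ">", "completed": "x"}

class TodoManager:
    def __init__(self):
        self.items = []

    def render(self) -> str:
        # One line per item: [mark] id: text
        if not self.items:
            return "(no todos)"
        return "\n".join(
            f"[{STATUS_MARKS.get(item['status'], '?')}] "
            f"{item['id']}: {item['text']}"
            for item in self.items
        )
```

Because the rendered board is the tool's return value, the model sees its own plan re-printed in every todo tool_result, which is part of what keeps the plan in its attention.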

## Changes since s02

| Component     | Before (s02)   | After (s03)                 |
|---------------|----------------|-----------------------------|
| Tools         | 4              | 5 (+todo)                   |
| Planning      | none           | TodoManager with statuses   |
| Nag injection | none           | `<reminder>` after 3 turns  |
| Agent loop    | plain dispatch | + rounds_since_todo counter |

## Design rationale

A visible plan improves task completion because the model can self-monitor its progress. The nag mechanism creates accountability -- without it, the model may abandon the plan mid-execution as context grows and early instructions fade. The only-one-in_progress constraint forces sequential focus, preventing the quality loss that comes with context switching. The pattern works because it externalizes the model's working memory into structured state that survives attention drift.

## Try it

```sh
cd learn-claude-code
python agents/s03_todo_write.py
```

Prompts to try:

1. `Refactor the file hello.py: add type hints, docstrings, and a main guard`
2. `Create a Python package with __init__.py, utils.py, and tests/test_utils.py`
3. `Review all Python files and fix any style issues`
docs/zh/s04-subagent.md (new file, 144 lines)
# s04: Subagent

> A subagent runs with a fresh message list, shares the filesystem with its parent, and returns only a summary -- keeping the parent's context clean.

## Problem

As an agent works, its message array keeps growing. Every tool call, every file read, every piece of bash output accumulates. After 20-30 tool calls, the context window is full of irrelevant history. A 500-line file read to answer one simple question occupies 500 lines of context forever.

This is especially bad for exploratory tasks. "What test framework does this project use?" might require reading 5 files, but the parent's history does not need the full contents of those 5 files -- it only needs the answer: "pytest, configured via conftest.py."

The solution is process-style isolation: start a subagent with `messages=[]`. The subagent explores, reads files, and runs commands. When it finishes, only its final text response is returned to the parent. The subagent's entire message history is discarded.

## Solution

```
Parent agent                     Subagent
+------------------+             +------------------+
| messages=[...]   |             | messages=[]      | <-- fresh
|                  |  dispatch   |                  |
| tool: task       | ----------> | while tool_use:  |
|   prompt="..."   |             |   call tools     |
|                  |  summary    |   append results |
| result = "..."   | <---------- | return last text |
+------------------+             +------------------+

Parent context stays clean.
Subagent context is discarded.
```

## How it works

1. The parent gets a `task` tool for spawning subagents. The subagent gets every base tool except `task` (no recursive spawning).

```python
PARENT_TOOLS = CHILD_TOOLS + [
    {"name": "task",
     "description": "Spawn a subagent with fresh context.",
     "input_schema": {
         "type": "object",
         "properties": {
             "prompt": {"type": "string"},
             "description": {"type": "string"},
         },
         "required": ["prompt"],
     }},
]
```

2. The subagent starts with a fresh message list containing only the delegated prompt. It shares the same filesystem.

```python
def run_subagent(prompt: str) -> str:
    sub_messages = [{"role": "user", "content": prompt}]
    for _ in range(30):  # safety limit
        response = client.messages.create(
            model=MODEL, system=SUBAGENT_SYSTEM,
            messages=sub_messages,
            tools=CHILD_TOOLS, max_tokens=8000,
        )
        sub_messages.append({
            "role": "assistant", "content": response.content
        })
        if response.stop_reason != "tool_use":
            break
        # execute tools, append results...
```

3. Only the final text returns to the parent. The history behind the subagent's tool calls -- potentially dozens of them -- is discarded.

```python
return "".join(
    b.text for b in response.content if hasattr(b, "text")
) or "(no summary)"
```

4. The parent receives this summary as an ordinary tool_result.

```python
if block.name == "task":
    output = run_subagent(block.input["prompt"])
    results.append({
        "type": "tool_result",
        "tool_use_id": block.id,
        "content": str(output),
    })
```

## Core code

The subagent function (from `agents/s04_subagent.py`, lines 110-128):

```python
def run_subagent(prompt: str) -> str:
    sub_messages = [{"role": "user", "content": prompt}]
    for _ in range(30):
        response = client.messages.create(
            model=MODEL, system=SUBAGENT_SYSTEM,
            messages=sub_messages,
            tools=CHILD_TOOLS, max_tokens=8000,
        )
        sub_messages.append({"role": "assistant",
                             "content": response.content})
        if response.stop_reason != "tool_use":
            break
        results = []
        for block in response.content:
            if block.type == "tool_use":
                handler = TOOL_HANDLERS.get(block.name)
                output = handler(**block.input)
                results.append({"type": "tool_result",
                                "tool_use_id": block.id,
                                "content": str(output)[:50000]})
        sub_messages.append({"role": "user", "content": results})
    return "".join(
        b.text for b in response.content if hasattr(b, "text")
    ) or "(no summary)"
```
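The summary extraction at the end relies only on content blocks having a `.text` attribute. That contract is easy to check with stand-in objects (`SimpleNamespace` here is a hypothetical substitute for the SDK's block types):

```python
from types import SimpleNamespace

def extract_summary(content) -> str:
    # Concatenate all text blocks; tool_use blocks (no .text) are skipped.
    return "".join(b.text for b in content if hasattr(b, "text")) \
        or "(no summary)"

blocks = [
    SimpleNamespace(text="pytest, "),
    SimpleNamespace(type="tool_use", name="bash"),  # skipped: no .text
    SimpleNamespace(text="configured via conftest.py"),
]
```

The `"(no summary)"` fallback matters: if the subagent hits its iteration limit mid-tool-call, the parent still receives a well-formed tool_result instead of an empty string.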

## Changes since s03

| Component    | Before (s03)  | After (s04)                        |
|--------------|---------------|------------------------------------|
| Tools        | 5             | 5 (base) + task (parent only)      |
| Context      | single shared | parent + isolated child            |
| Subagent     | none          | `run_subagent()` function          |
| Return value | n/a           | summary text only                  |
| Todo system  | TodoManager   | removed (not this session's focus) |

## Design rationale

Process isolation gives you context isolation for free. A fresh `messages=[]` means the subagent is never distracted by the parent's conversation history. The cost is communication overhead -- results must be compressed back to the parent, losing detail. It is the same tradeoff as OS process isolation: serialization cost in exchange for safety and cleanliness. Capping subagent depth (no recursive spawning) prevents unbounded resource consumption, and the maximum iteration count guarantees a runaway child terminates.

## Try it

```sh
cd learn-claude-code
python agents/s04_subagent.py
```

Prompts to try:

1. `Use a subtask to find what testing framework this project uses`
2. `Delegate: read all .py files and summarize what each one does`
3. `Use a task to create a new module, then verify it from here`
docs/zh/s05-skill-loading.md (new file, 153 lines)
# s05: Skills

> Two-layer skill injection avoids system-prompt bloat: skill names go in the system prompt (cheap), full skill content goes into a tool_result on demand.

## Problem

You want the agent to follow domain-specific workflows: git conventions, testing patterns, code-review checklists. The blunt approach is to cram everything into the system prompt. But the system prompt's effective attention is limited -- past a certain length, the model starts ignoring parts of it.

With 10 skills at 2,000 tokens each, that is a 20,000-token system prompt. The model attends to the beginning and the end but skims the middle. Worse, most of those skills are irrelevant to the task at hand. A file-editing task does not need git workflow instructions.

The two-layer scheme solves this. Layer 1 puts a short description of each skill (~100 tokens apiece) in the system prompt. Layer 2 loads the full skill content into a tool_result only when the model calls `load_skill`. The model knows which skills exist (cheap) and loads them on demand (only when relevant).

## Solution

```
System prompt (Layer 1 -- always present):
+--------------------------------------+
| You are a coding agent.              |
| Skills available:                    |
|  - git: Git workflow helpers         |   ~100 tokens/skill
|  - test: Testing best practices      |
+--------------------------------------+

When model calls load_skill("git"):
+--------------------------------------+
| tool_result (Layer 2 -- on demand):  |
| <skill name="git">                   |
| Full git workflow instructions...    |   ~2000 tokens
| Step 1: ...                          |
| Step 2: ...                          |
| </skill>                             |
+--------------------------------------+
```

## How it works

1. Skills live as Markdown files with YAML frontmatter in a `.skills/` directory.

```
.skills/
  git.md     # ---\n description: Git workflow\n ---\n ...
  test.md    # ---\n description: Testing patterns\n ---\n ...
```

2. The SkillLoader parses the frontmatter, separating metadata from body.

```python
class SkillLoader:
    def _parse_frontmatter(self, text: str) -> tuple:
        match = re.match(
            r"^---\n(.*?)\n---\n(.*)", text, re.DOTALL
        )
        if not match:
            return {}, text
        meta = {}
        for line in match.group(1).strip().splitlines():
            if ":" in line:
                key, val = line.split(":", 1)
                meta[key.strip()] = val.strip()
        return meta, match.group(2).strip()
```
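The frontmatter parser can be exercised standalone. A sketch using a skill file shaped like the `.skills/` examples above (the sample text is illustrative, and the method is restated here as a free function):

```python
import re

def parse_frontmatter(text: str) -> tuple:
    # Same logic as SkillLoader._parse_frontmatter, standalone.
    match = re.match(r"^---\n(.*?)\n---\n(.*)", text, re.DOTALL)
    if not match:
        return {}, text
    meta = {}
    for line in match.group(1).strip().splitlines():
        if ":" in line:
            key, val = line.split(":", 1)
            meta[key.strip()] = val.strip()
    return meta, match.group(2).strip()

sample = "---\ndescription: Git workflow\n---\nStep 1: branch from main."
meta, body = parse_frontmatter(sample)
```

A file without frontmatter falls through to `({}, text)`, so a plain Markdown skill still loads; it just has no description for Layer 1.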

3. Layer 1: `get_descriptions()` returns the short descriptions for the system prompt.

```python
def get_descriptions(self) -> str:
    lines = []
    for name, skill in self.skills.items():
        desc = skill["meta"].get("description", "No description")
        lines.append(f"  - {name}: {desc}")
    return "\n".join(lines)

SYSTEM = f"""You are a coding agent at {WORKDIR}.
Skills available:
{SKILL_LOADER.get_descriptions()}"""
```

4. Layer 2: `get_content()` returns the full body wrapped in `<skill>` tags.

```python
def get_content(self, name: str) -> str:
    skill = self.skills.get(name)
    if not skill:
        return f"Error: Unknown skill '{name}'."
    return f"<skill name=\"{name}\">\n{skill['body']}\n</skill>"
```

5. The `load_skill` tool is just one more entry in the dispatch map.

```python
TOOL_HANDLERS = {
    # ...base tools...
    "load_skill": lambda **kw: SKILL_LOADER.get_content(kw["name"]),
}
```

## Core code

The SkillLoader class (from `agents/s05_skill_loading.py`, lines 51-97):

```python
class SkillLoader:
    def __init__(self, skills_dir: Path):
        self.skills = {}
        for f in sorted(skills_dir.glob("*.md")):
            text = f.read_text()
            meta, body = self._parse_frontmatter(text)
            self.skills[f.stem] = {
                "meta": meta, "body": body
            }

    def get_descriptions(self) -> str:
        lines = []
        for name, skill in self.skills.items():
            desc = skill["meta"].get("description", "")
            lines.append(f"  - {name}: {desc}")
        return "\n".join(lines)

    def get_content(self, name: str) -> str:
        skill = self.skills.get(name)
        if not skill:
            return f"Error: Unknown skill '{name}'."
        return (f"<skill name=\"{name}\">\n"
                f"{skill['body']}\n</skill>")
```

## Changes since s04

| Component     | Before (s04)     | After (s05)                         |
|---------------|------------------|-------------------------------------|
| Tools         | 5 (base + task)  | 5 (base + load_skill)               |
| System prompt | static string    | + skill description list            |
| Knowledge     | none             | .skills/*.md files                  |
| Injection     | none             | two layers (system prompt + result) |
| Subagent      | `run_subagent()` | removed (not this session's focus)  |

## Design rationale

Two-layer injection solves the attention-budget problem. Putting all skill content in the system prompt wastes tokens on skills that never get used. Layer 1 (compact summaries) costs roughly 120 tokens in total. Layer 2 (full content) loads on demand via tool_result. This scales to dozens of skills without degrading the model's attention. The key insight: the model only needs to know which skills exist (cheap) in order to decide when to load one (expensive). It is the same principle as lazy loading in software module systems.

## Try it

```sh
cd learn-claude-code
python agents/s05_skill_loading.py
```

Prompts to try:

1. `What skills are available?`
2. `Load the agent-builder skill and follow its instructions`
3. `I need to do a code review -- load the relevant skill first`
4. `Build an MCP server using the mcp-builder skill`
docs/zh/s06-context-compact.md (new file, 170 lines)
# s06: Compact

> A three-layer compaction pipeline lets the agent work indefinitely: strategically forget old tool results, auto-summarize when the token count crosses a threshold, and support manually triggered compaction.

## Problem

Context windows are finite. With enough tool calls, the message array exceeds the model's context limit and the API call simply fails. Performance degrades even before the hard limit: the model gets slower and less accurate, and starts ignoring earlier messages.

A 200,000-token context window sounds large, but a single `read_file` of a 1,000-line source file costs roughly 4,000 tokens. After reading 30 files and running 20 bash commands, you have already burned 100,000+ tokens. Without some form of compaction, an agent cannot work on a large codebase.

The three-layer pipeline responds with escalating aggressiveness:
Layer 1 (micro-compact) silently replaces old tool results every turn.
Layer 2 (auto-compact) triggers a full summarization when the token count crosses a threshold.
Layer 3 (manual compact) lets the model trigger compaction itself.

Pedagogical simplification: token estimation here uses a crude chars/4 heuristic. Production systems use a real tokenizer library for exact counts.

## Solution

```
Every turn:
+------------------+
| Tool call result |
+------------------+
        |
        v
[Layer 1: micro_compact]  (silent, every turn)
  Replace tool_result > 3 turns old
  with "[Previous: used {tool_name}]"
        |
        v
[Check: tokens > 50000?]
   |            |
   no           yes
   |            |
   v            v
continue   [Layer 2: auto_compact]
             Save transcript to .transcripts/
             LLM summarizes conversation.
             Replace all messages with [summary].
        |
        v
[Layer 3: compact tool]
  Model calls compact explicitly.
  Same summarization as auto_compact.
```

## How it works

1. **Layer 1 -- micro_compact**: before every LLM call, find all tool_result entries older than the 3 most recent and replace their content.

```python
def micro_compact(messages: list) -> list:
    tool_results = []
    for i, msg in enumerate(messages):
        if msg["role"] == "user" and isinstance(msg.get("content"), list):
            for j, part in enumerate(msg["content"]):
                if isinstance(part, dict) and part.get("type") == "tool_result":
                    tool_results.append((i, j, part))
    if len(tool_results) <= KEEP_RECENT:
        return messages
    to_clear = tool_results[:-KEEP_RECENT]
    for _, _, part in to_clear:
        if len(part.get("content", "")) > 100:
            tool_id = part.get("tool_use_id", "")
            # tool_name_map: {tool_use_id: tool_name}, maintained elsewhere
            tool_name = tool_name_map.get(tool_id, "unknown")
            part["content"] = f"[Previous: used {tool_name}]"
    return messages
```

2. **Layer 2 -- auto_compact**: when the estimated token count exceeds 50,000, save the full transcript and ask the LLM for a summary.

```python
def auto_compact(messages: list) -> list:
    TRANSCRIPT_DIR.mkdir(exist_ok=True)
    transcript_path = TRANSCRIPT_DIR / f"transcript_{int(time.time())}.jsonl"
    with open(transcript_path, "w") as f:
        for msg in messages:
            f.write(json.dumps(msg, default=str) + "\n")
    response = client.messages.create(
        model=MODEL,
        messages=[{"role": "user", "content":
                   "Summarize this conversation for continuity..."
                   + json.dumps(messages, default=str)[:80000]}],
        max_tokens=2000,
    )
    summary = response.content[0].text
    return [
        {"role": "user", "content": f"[Compressed]\n\n{summary}"},
        {"role": "assistant", "content": "Understood. Continuing."},
    ]
```

3. **Layer 3 -- manual compact**: the `compact` tool triggers the same summarization on demand.

```python
if manual_compact:
    messages[:] = auto_compact(messages)
```

4. The agent loop wires all three layers together.

```python
def agent_loop(messages: list):
    while True:
        micro_compact(messages)
        if estimate_tokens(messages) > THRESHOLD:
            messages[:] = auto_compact(messages)
        response = client.messages.create(...)
        # ... tool execution ...
        if manual_compact:
            messages[:] = auto_compact(messages)
```
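`estimate_tokens` is the chars/4 heuristic mentioned earlier. A sketch of what it might look like (the real script may count slightly differently):

```python
import json

def estimate_tokens(messages: list) -> int:
    # Crude heuristic: ~4 characters per token for English text and JSON.
    return len(json.dumps(messages, default=str)) // 4
```

The point of the heuristic is cheapness: it runs before every LLM call, so an exact tokenizer pass would be wasted work when all that is needed is a rough threshold check.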

## Core code

The three-layer pipeline (from `agents/s06_context_compact.py`, lines 67-93 and 189-223):

```python
THRESHOLD = 50000
KEEP_RECENT = 3

def micro_compact(messages):
    # Replace old tool results with placeholders
    ...

def auto_compact(messages):
    # Save transcript, LLM summarize, replace messages
    ...

def agent_loop(messages):
    while True:
        micro_compact(messages)                      # Layer 1
        if estimate_tokens(messages) > THRESHOLD:
            messages[:] = auto_compact(messages)     # Layer 2
        response = client.messages.create(...)
        # ...
        if manual_compact:
            messages[:] = auto_compact(messages)     # Layer 3
```

## Changes since s05

| Component      | Before (s05) | After (s06)                        |
|----------------|--------------|------------------------------------|
| Tools          | 5            | 5 (base + compact)                 |
| Context mgmt   | none         | three-layer compaction             |
| Micro-compact  | none         | old results -> placeholders        |
| Auto-compact   | none         | token-threshold trigger            |
| Manual compact | none         | `compact` tool                     |
| Transcripts    | none         | saved to .transcripts/             |
| Skills         | load_skill   | removed (not this session's focus) |

## Design rationale

Context windows are finite, but agent sessions can be unbounded. The three layers attack the problem at different granularities: micro-compact (replace old tool outputs), auto-compact (LLM summary near the limit), and manual compact (user triggered). The key insight is that forgetting is a feature, not a bug -- it is what makes unbounded sessions possible. Transcript files keep the full history on disk, so nothing is truly lost; it is merely moved out of the active context. The layered approach lets each layer operate independently at its own granularity, from silent per-turn cleanup to a full conversation reset.

## Try it

```sh
cd learn-claude-code
python agents/s06_context_compact.py
```

Prompts to try:

1. `Read every Python file in the agents/ directory one by one`
   (watch micro-compact replace the older results)
2. `Keep reading files until compression triggers automatically`
3. `Use the compact tool to manually compress the conversation`
docs/zh/s07-task-system.md (new file, 159 lines)
# s07: Tasks (任务系统)
|
||||
|
||||
> 任务以 JSON 文件形式持久化在文件系统上, 带有依赖图, 因此它们能在上下文压缩后存活, 也可以跨智能体共享。
|
||||
|
||||
## 问题
|
||||
|
||||
内存中的状态 (如 s03 的 TodoManager) 在上下文压缩 (s06) 时会丢失。auto_compact 用摘要替换消息后, 待办列表就没了。智能体只能从摘要文本中重建它, 这是有损且容易出错的。
|
||||
|
||||
这就是 s06 到 s07 的关键桥梁: TodoManager 的条目随压缩消亡; 基于文件的任务不会。将状态移到文件系统上使其不受压缩影响。
|
||||
|
||||
更根本地说, 内存中的状态对其他智能体不可见。当我们最终构建团队 (s09+) 时, 队友需要一个共享的任务看板。内存中的数据结构是进程局部的。
|
||||
|
||||
解决方案是将任务作为 JSON 文件持久化在 `.tasks/` 目录中。每个任务是一个单独的文件, 包含 ID、主题、状态和依赖图。完成任务 1 会自动解除任务 2 的阻塞 (如果任务 2 有 `blockedBy: [1]`)。文件系统成为唯一的真实来源。
|
||||
|
||||
## 解决方案
|
||||
|
||||
```
|
||||
.tasks/
|
||||
task_1.json {"id":1, "status":"completed", ...}
|
||||
task_2.json {"id":2, "blockedBy":[1], "status":"pending"}
|
||||
task_3.json {"id":3, "blockedBy":[2], "status":"pending"}
|
||||
|
||||
Dependency resolution:
|
||||
+----------+ +----------+ +----------+
|
||||
| task 1 | --> | task 2 | --> | task 3 |
|
||||
| complete | | blocked | | blocked |
|
||||
+----------+ +----------+ +----------+
|
||||
| ^
|
||||
+--- completing task 1 removes it from
|
||||
task 2's blockedBy list
|
||||
```
|
||||
|
||||
## 工作原理

1. TaskManager 提供 CRUD 操作。每个任务是一个 JSON 文件。

```python
class TaskManager:
    def create(self, subject: str, description: str = "") -> str:
        task = {
            "id": self._next_id,
            "subject": subject,
            "description": description,
            "status": "pending",
            "blockedBy": [],
            "blocks": [],
            "owner": "",
        }
        self._save(task)
        self._next_id += 1
        return json.dumps(task, indent=2)
```

2. 当任务标记为 completed 时, `_clear_dependency` 将其 ID 从所有其他任务的 `blockedBy` 列表中移除。

```python
def _clear_dependency(self, completed_id: int):
    for f in self.dir.glob("task_*.json"):
        task = json.loads(f.read_text())
        if completed_id in task.get("blockedBy", []):
            task["blockedBy"].remove(completed_id)
            self._save(task)
```

3. `update` 方法处理状态变更和双向依赖关联。

```python
def update(self, task_id, status=None,
           add_blocked_by=None, add_blocks=None):
    task = self._load(task_id)
    if status:
        task["status"] = status
        if status == "completed":
            self._clear_dependency(task_id)
    if add_blocks:
        task["blocks"] = list(set(task["blocks"] + add_blocks))
        for blocked_id in add_blocks:
            blocked = self._load(blocked_id)
            if task_id not in blocked["blockedBy"]:
                blocked["blockedBy"].append(task_id)
                self._save(blocked)
    self._save(task)
```

4. 四个任务工具添加到 dispatch map。

```python
TOOL_HANDLERS = {
    # ...base tools...
    "task_create": lambda **kw: TASKS.create(kw["subject"]),
    "task_update": lambda **kw: TASKS.update(kw["task_id"],
                                             kw.get("status")),
    "task_list": lambda **kw: TASKS.list_all(),
    "task_get": lambda **kw: TASKS.get(kw["task_id"]),
}
```

## 核心代码

带依赖图的 TaskManager (来自 `agents/s07_task_system.py`, 第 46-123 行):

```python
class TaskManager:
    def __init__(self, tasks_dir: Path):
        self.dir = tasks_dir
        self.dir.mkdir(exist_ok=True)
        self._next_id = self._max_id() + 1

    def _load(self, task_id: int) -> dict:
        path = self.dir / f"task_{task_id}.json"
        return json.loads(path.read_text())

    def _save(self, task: dict):
        path = self.dir / f"task_{task['id']}.json"
        path.write_text(json.dumps(task, indent=2))

    def create(self, subject, description=""):
        task = {"id": self._next_id, "subject": subject,
                "status": "pending", "blockedBy": [],
                "blocks": [], "owner": ""}
        self._save(task)
        self._next_id += 1
        return json.dumps(task, indent=2)

    def _clear_dependency(self, completed_id):
        for f in self.dir.glob("task_*.json"):
            task = json.loads(f.read_text())
            if completed_id in task.get("blockedBy", []):
                task["blockedBy"].remove(completed_id)
                self._save(task)
```
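上面的 create / complete / `_clear_dependency` 流程可以用一个自包含的最小看板独立验证 (教学草图; `MiniTaskBoard` 与临时目录均为示意命名, 并非源码 API):

```python
import json
import tempfile
from pathlib import Path


class MiniTaskBoard:
    """教学用最小任务看板: 每个任务一个 JSON 文件, 完成时清理依赖。"""

    def __init__(self, tasks_dir: Path):
        self.dir = tasks_dir
        self.dir.mkdir(exist_ok=True)
        self._next_id = 1

    def create(self, subject: str, blocked_by=None) -> int:
        task = {"id": self._next_id, "subject": subject,
                "status": "pending", "blockedBy": list(blocked_by or [])}
        self._save(task)
        self._next_id += 1
        return task["id"]

    def complete(self, task_id: int):
        task = self._load(task_id)
        task["status"] = "completed"
        self._save(task)
        # 将已完成任务的 ID 从所有其他任务的 blockedBy 列表中移除
        for f in self.dir.glob("task_*.json"):
            other = json.loads(f.read_text())
            if task_id in other.get("blockedBy", []):
                other["blockedBy"].remove(task_id)
                self._save(other)

    def _load(self, task_id: int) -> dict:
        return json.loads((self.dir / f"task_{task_id}.json").read_text())

    def _save(self, task: dict):
        (self.dir / f"task_{task['id']}.json").write_text(json.dumps(task))


# 用法: task 2 依赖 task 1; 完成 task 1 后 task 2 解除阻塞
board = MiniTaskBoard(Path(tempfile.mkdtemp()))
t1 = board.create("Setup project")
t2 = board.create("Write code", blocked_by=[t1])
board.complete(t1)
print(board._load(t2)["blockedBy"])  # → []
```

注意依赖清理完全发生在磁盘上: 即使进程重启或上下文被压缩, 重新读取 JSON 文件也能得到同样的解锁状态。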

## 相对 s06 的变更

| 组件 | 之前 (s06) | 之后 (s07) |
|----------------|------------------|----------------------------------|
| Tools | 5 | 8 (+task_create/update/list/get) |
| 状态存储 | 仅内存 | .tasks/ 中的 JSON 文件 |
| 依赖关系 | 无 | blockedBy + blocks 图 |
| 压缩机制 | 三层 | 已移除 (非本节重点) |
| 持久化 | 压缩后丢失 | 压缩后存活 |

## 设计原理

基于文件的状态能在上下文压缩中存活。当智能体的对话被压缩时, 内存中的状态会丢失, 但写入磁盘的任务会持久保存。依赖图确保即使在上下文丢失后也能按正确顺序执行。这是临时对话与持久工作之间的桥梁 -- 智能体可以忘记对话细节, 但始终有任务看板来提醒它还需要做什么。文件系统作为唯一真实来源也为未来的多智能体共享提供了基础, 因为任何进程都可以读取相同的 JSON 文件。

## 试一试

```sh
cd learn-claude-code
python agents/s07_task_system.py
```

可以尝试的提示:

1. `Create 3 tasks: "Setup project", "Write code", "Write tests". Make them depend on each other in order.`
2. `List all tasks and show the dependency graph`
3. `Complete task 1 and then list tasks to see task 2 unblocked`
4. `Create a task board for refactoring: parse -> transform -> emit -> test`
177
docs/zh/s08-background-tasks.md
Normal file
@@ -0,0 +1,177 @@
# s08: Background Tasks (后台任务)

> BackgroundManager 在独立线程中运行命令, 在每次 LLM 调用前排空通知队列, 使智能体永远不会因长时间运行的操作而阻塞。

## 问题

有些命令需要几分钟: `npm install`、`pytest`、`docker build`。在阻塞式的 agent loop 中, 模型只能干等子进程结束, 什么也做不了。如果用户要求 "安装依赖, 同时创建配置文件", 智能体会先安装, 然后才创建配置 -- 串行执行, 而非并行。

智能体需要并发能力。不是将 agent loop 本身完全多线程化, 而是能够发起一个长时间命令然后继续工作。当命令完成时, 结果自然地出现在对话中。

解决方案是一个 BackgroundManager, 它在守护线程中运行命令, 将结果收集到通知队列中。每次 LLM 调用前, 队列被排空, 结果注入到消息中。

## 解决方案

```
Main thread                       Background thread
+-------------------+             +-----------------+
| agent loop        |             | task executes   |
|   ...             |             |   ...           |
| [LLM call] <------+------------ | enqueue(result) |
|  ^drain queue     |             +-----------------+
+-------------------+

Timeline:
Agent --[spawn A]--[spawn B]--[other work]----
            |          |
            v          v
        [A runs]   [B runs]   (parallel)
            |          |
            +-- notification queue --+
                                     |
                         [results injected before
                          next LLM call]
```

## 工作原理

1. BackgroundManager 追踪任务并维护一个线程安全的通知队列。

```python
class BackgroundManager:
    def __init__(self):
        self.tasks = {}
        self._notification_queue = []
        self._lock = threading.Lock()
```

2. `run()` 启动一个守护线程并立即返回 task_id。

```python
def run(self, command: str) -> str:
    task_id = str(uuid.uuid4())[:8]
    self.tasks[task_id] = {
        "status": "running",
        "result": None,
        "command": command,
    }
    thread = threading.Thread(
        target=self._execute,
        args=(task_id, command),
        daemon=True,
    )
    thread.start()
    return f"Background task {task_id} started"
```

3. 线程目标函数 `_execute` 运行子进程并将结果推入通知队列。

```python
def _execute(self, task_id: str, command: str):
    try:
        r = subprocess.run(command, shell=True, cwd=WORKDIR,
                           capture_output=True, text=True, timeout=300)
        output = (r.stdout + r.stderr).strip()[:50000]
        status = "completed"
    except subprocess.TimeoutExpired:
        output = "Error: Timeout (300s)"
        status = "timeout"
    self.tasks[task_id]["status"] = status
    self.tasks[task_id]["result"] = output
    with self._lock:
        self._notification_queue.append({
            "task_id": task_id,
            "status": status,
            "result": output[:500],
        })
```

4. `drain_notifications()` 返回并清空待处理的结果。

```python
def drain_notifications(self) -> list:
    with self._lock:
        notifs = list(self._notification_queue)
        self._notification_queue.clear()
    return notifs
```

5. Agent loop 在每次 LLM 调用前排空通知。

```python
def agent_loop(messages: list):
    while True:
        notifs = BG.drain_notifications()
        if notifs and messages:
            notif_text = "\n".join(
                f"[bg:{n['task_id']}] {n['status']}: "
                f"{n['result']}" for n in notifs
            )
            messages.append({"role": "user",
                             "content": f"<background-results>"
                                        f"\n{notif_text}\n"
                                        f"</background-results>"})
            messages.append({"role": "assistant",
                             "content": "Noted background results."})
        response = client.messages.create(...)
```

## 核心代码

BackgroundManager (来自 `agents/s08_background_tasks.py`, 第 49-107 行):

```python
class BackgroundManager:
    def __init__(self):
        self.tasks = {}
        self._notification_queue = []
        self._lock = threading.Lock()

    def run(self, command: str) -> str:
        task_id = str(uuid.uuid4())[:8]
        self.tasks[task_id] = {"status": "running",
                               "result": None,
                               "command": command}
        thread = threading.Thread(
            target=self._execute,
            args=(task_id, command), daemon=True)
        thread.start()
        return f"Background task {task_id} started"

    def _execute(self, task_id, command):
        # run subprocess, push to queue
        ...

    def drain_notifications(self) -> list:
        with self._lock:
            notifs = list(self._notification_queue)
            self._notification_queue.clear()
        return notifs
```
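把上面几段拼起来, 就是一个可独立运行的最小版本 (教学草图; `MiniBackground` 为示意命名, 演示中用 `join` 等待线程结束, 真实的 agent loop 依靠每轮 LLM 调用前的排空):

```python
import subprocess
import threading
import uuid


class MiniBackground:
    """教学用最小后台执行器: 守护线程 + 线程安全通知队列。"""

    def __init__(self):
        self.tasks = {}
        self._queue = []
        self._lock = threading.Lock()
        self._threads = []

    def run(self, command: str) -> str:
        task_id = str(uuid.uuid4())[:8]
        self.tasks[task_id] = {"status": "running", "command": command}
        t = threading.Thread(target=self._execute,
                             args=(task_id, command), daemon=True)
        self._threads.append(t)
        t.start()
        return task_id  # 立即返回, 不阻塞

    def _execute(self, task_id: str, command: str):
        r = subprocess.run(command, shell=True,
                           capture_output=True, text=True)
        self.tasks[task_id]["status"] = "completed"
        with self._lock:  # 结果进入队列, 等待下一次排空
            self._queue.append({"task_id": task_id,
                                "result": r.stdout.strip()})

    def drain(self) -> list:
        with self._lock:
            notifs = list(self._queue)
            self._queue.clear()
        return notifs


bg = MiniBackground()
bg.run("echo hello")
# 演示中显式等待; 真实循环中是 "下一次 LLM 调用前排空"
for t in bg._threads:
    t.join()
notifs = bg.drain()
print(notifs[0]["result"])  # → hello
print(bg.drain())           # → []  (排空后队列为空)
```

排空是破坏性读取: 同一条通知只会被注入对话一次, 这正是 "在自然间断点交付结果" 的语义。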

## 相对 s07 的变更

| 组件 | 之前 (s07) | 之后 (s08) |
|----------------|------------------|------------------------------------|
| Tools | 8 | 6 (基础 + background_run + check) |
| 执行方式 | 仅阻塞 | 阻塞 + 后台线程 |
| 通知机制 | 无 | 每轮排空的队列 |
| 并发 | 无 | 守护线程 |
| 任务系统 | 基于文件的 CRUD | 已移除 (非本节重点) |

## 设计原理

智能体循环本质上是单线程的 (一次一个 LLM 调用)。后台线程为 I/O 密集型工作 (测试、构建、安装) 打破了这个限制。通知队列模式 ("在下一次 LLM 调用前排空") 确保结果在对话的自然间断点到达, 而不是打断模型的推理过程。这是一个最小化的并发模型: 智能体循环保持单线程和确定性, 只有 I/O 密集型的子进程执行被并行化。

## 试一试

```sh
cd learn-claude-code
python agents/s08_background_tasks.py
```

可以尝试的提示:

1. `Run "sleep 5 && echo done" in the background, then create a file while it runs`
2. `Start 3 background tasks: "sleep 2", "sleep 4", "sleep 6". Check their status.`
3. `Run pytest in the background and keep working on other things`
212
docs/zh/s09-agent-teams.md
Normal file
@@ -0,0 +1,212 @@
# s09: Agent Teams (智能体团队)

> 持久化的队友通过 JSONL 收件箱将孤立的智能体转变为可通信的团队 -- spawn、message、broadcast 和 drain。

## 问题

子智能体 (s04) 是一次性的: 生成、工作、返回摘要、消亡。它们没有身份, 没有跨调用的记忆, 也无法接收后续指令。后台任务 (s08) 运行 shell 命令, 但不能做 LLM 引导的决策或交流发现。

真正的团队协作需要三样东西: (1) 存活时间超过单次 prompt 的持久化智能体, (2) 身份和生命周期管理, (3) 智能体之间的通信通道。没有消息机制, 即使持久化的队友也是又聋又哑的 -- 它们可以并行工作但永远无法协调。

解决方案将 TeammateManager (用于生成持久化的命名智能体) 与使用 JSONL 收件箱文件的 MessageBus 结合。每个队友在独立线程中运行自己的 agent loop, 每次 LLM 调用前检查收件箱, 可以向任何其他队友或领导发送消息。

关于 s06 到 s07 的桥梁: s03 的 TodoManager 条目随压缩 (s06) 消亡。基于文件的任务 (s07) 因为存储在磁盘上而能存活压缩。团队建立在同样的原则上 -- config.json 和收件箱文件持久化在上下文窗口之外。

## 解决方案

```
Teammate lifecycle:
  spawn -> WORKING -> IDLE -> WORKING -> ... -> SHUTDOWN

Communication:
  .team/
    config.json     <- team roster + statuses
    inbox/
      alice.jsonl   <- append-only, drain-on-read
      bob.jsonl
      lead.jsonl

+--------+  send("alice","bob","...")     +--------+
| alice  | ----------------------------->  | bob    |
| loop   |  bob.jsonl << {json_line}      | loop   |
+--------+                                +--------+
    ^                                         |
    |       BUS.read_inbox("alice")           |
    +---- alice.jsonl -> read + drain --------+

5 message types:
+-------------------------+------------------------------+
| message                 | Normal text between agents   |
| broadcast               | Sent to all teammates        |
| shutdown_request        | Request graceful shutdown    |
| shutdown_response       | Approve/reject shutdown      |
| plan_approval_response  | Approve/reject plan          |
+-------------------------+------------------------------+
```

## 工作原理

1. TeammateManager 通过 config.json 维护团队名册。每个成员有名称、角色和状态。

```python
class TeammateManager:
    def __init__(self, team_dir: Path):
        self.dir = team_dir
        self.dir.mkdir(exist_ok=True)
        self.config_path = self.dir / "config.json"
        self.config = self._load_config()
        self.threads = {}
```

2. `spawn()` 创建队友并在线程中启动其 agent loop。重新 spawn 一个 idle 状态的队友会将其重新激活。

```python
def spawn(self, name: str, role: str, prompt: str) -> str:
    member = self._find_member(name)
    if member:
        if member["status"] not in ("idle", "shutdown"):
            return f"Error: '{name}' is currently {member['status']}"
        member["status"] = "working"
    else:
        member = {"name": name, "role": role, "status": "working"}
        self.config["members"].append(member)
    self._save_config()
    thread = threading.Thread(
        target=self._teammate_loop,
        args=(name, role, prompt), daemon=True)
    self.threads[name] = thread
    thread.start()
    return f"Spawned teammate '{name}' (role: {role})"
```

3. MessageBus 处理 JSONL 收件箱文件。`send()` 追加一行 JSON; `read_inbox()` 读取所有行并清空文件。

```python
class MessageBus:
    def send(self, sender, to, content,
             msg_type="message", extra=None):
        msg = {"type": msg_type, "from": sender,
               "content": content,
               "timestamp": time.time()}
        if extra:
            msg.update(extra)
        with open(self.dir / f"{to}.jsonl", "a") as f:
            f.write(json.dumps(msg) + "\n")
        return f"Sent {msg_type} to {to}"

    def read_inbox(self, name):
        path = self.dir / f"{name}.jsonl"
        if not path.exists():
            return "[]"
        msgs = [json.loads(l)
                for l in path.read_text().strip().splitlines()
                if l]
        path.write_text("")  # drain
        return json.dumps(msgs, indent=2)
```

4. 每个队友在每次 LLM 调用前检查收件箱, 将收到的消息注入对话上下文。

```python
def _teammate_loop(self, name, role, prompt):
    sys_prompt = f"You are '{name}', role: {role}, at {WORKDIR}."
    messages = [{"role": "user", "content": prompt}]
    for _ in range(50):
        inbox = BUS.read_inbox(name)
        if inbox != "[]":
            messages.append({"role": "user",
                             "content": f"<inbox>{inbox}</inbox>"})
            messages.append({"role": "assistant",
                             "content": "Noted inbox messages."})
        response = client.messages.create(
            model=MODEL, system=sys_prompt,
            messages=messages, tools=TOOLS)
        messages.append({"role": "assistant",
                         "content": response.content})
        if response.stop_reason != "tool_use":
            break
        # execute tools, append results...
    self._find_member(name)["status"] = "idle"
    self._save_config()
```

5. `broadcast()` 向除发送者外的所有队友发送相同消息。

```python
def broadcast(self, sender, content, teammates):
    count = 0
    for name in teammates:
        if name != sender:
            self.send(sender, name, content, "broadcast")
            count += 1
    return f"Broadcast to {count} teammates"
```

## 核心代码

TeammateManager + MessageBus 核心 (来自 `agents/s09_agent_teams.py`):

```python
class TeammateManager:
    def spawn(self, name, role, prompt):
        member = self._find_member(name) or {
            "name": name, "role": role, "status": "working"
        }
        member["status"] = "working"
        self._save_config()
        thread = threading.Thread(
            target=self._teammate_loop,
            args=(name, role, prompt), daemon=True)
        thread.start()
        return f"Spawned '{name}'"


class MessageBus:
    def send(self, sender, to, content,
             msg_type="message", extra=None):
        msg = {"type": msg_type, "from": sender,
               "content": content, "timestamp": time.time()}
        if extra:
            msg.update(extra)
        with open(self.dir / f"{to}.jsonl", "a") as f:
            f.write(json.dumps(msg) + "\n")

    def read_inbox(self, name):
        path = self.dir / f"{name}.jsonl"
        if not path.exists():
            return "[]"
        msgs = [json.loads(l)
                for l in path.read_text().strip().splitlines()
                if l]
        path.write_text("")
        return json.dumps(msgs, indent=2)
```
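send / read_inbox 的 "追加-排空" 语义不依赖线程, 可以单独验证 (教学草图; `MiniBus` 为示意命名, 为便于断言直接返回 Python 列表而非 JSON 字符串):

```python
import json
import tempfile
import time
from pathlib import Path


class MiniBus:
    """教学用最小消息总线: 追加式 JSONL 收件箱, 读取时排空。"""

    def __init__(self, inbox_dir: Path):
        self.dir = inbox_dir
        self.dir.mkdir(exist_ok=True)

    def send(self, sender: str, to: str, content: str,
             msg_type: str = "message"):
        msg = {"type": msg_type, "from": sender,
               "content": content, "timestamp": time.time()}
        # 每条消息追加一行 JSON; 追加写避免覆盖并发到达的消息
        with open(self.dir / f"{to}.jsonl", "a") as f:
            f.write(json.dumps(msg) + "\n")

    def read_inbox(self, name: str) -> list:
        path = self.dir / f"{name}.jsonl"
        if not path.exists():
            return []
        msgs = [json.loads(l)
                for l in path.read_text().splitlines() if l]
        path.write_text("")  # drain: 读取即清空
        return msgs


bus = MiniBus(Path(tempfile.mkdtemp()))
bus.send("alice", "bob", "tests are green")
bus.send("lead", "bob", "ship it")
msgs = bus.read_inbox("bob")
print([m["from"] for m in msgs])  # → ['alice', 'lead']
print(bus.read_inbox("bob"))      # → []  (已排空)
```

同一个收件箱文件既是传输通道也是缓冲区: 发送方不需要知道接收方是否在线, 消息会等到接收方下一次轮询时被批量取走。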

## 相对 s08 的变更

| 组件 | 之前 (s08) | 之后 (s09) |
|----------------|------------------|------------------------------------|
| Tools | 6 | 9 (+spawn/send/read_inbox) |
| 智能体数量 | 单一 | 领导 + N 个队友 |
| 持久化 | 无 | config.json + JSONL 收件箱 |
| 线程 | 后台命令 | 每线程完整 agent loop |
| 生命周期 | 一次性 | idle -> working -> idle |
| 通信 | 无 | 5 种消息类型 + broadcast |

教学简化说明: 此实现未使用文件锁来保护收件箱访问。在生产中, 多个写入者并发追加需要文件锁或原子重命名。这里使用的单写入者-per-收件箱模式在教学场景下是安全的。

## 设计原理

基于文件的邮箱 (追加式 JSONL) 提供了并发安全的智能体间通信。追加操作在大多数文件系统上是原子的, 避免了锁竞争。"读取时排空" 模式 (读取全部, 截断) 提供批量传递。这比共享内存或基于 socket 的 IPC 更简单、更健壮。代价是延迟 -- 消息只在下一次轮询时才被看到 -- 但对于每轮需要数秒推理时间的 LLM 驱动智能体来说, 轮询延迟相比推理时间可以忽略不计。

## 试一试

```sh
cd learn-claude-code
python agents/s09_agent_teams.py
```

可以尝试的提示:

1. `Spawn alice (coder) and bob (tester). Have alice send bob a message.`
2. `Broadcast "status update: phase 1 complete" to all teammates`
3. `Check the lead inbox for any messages`
4. 输入 `/team` 查看带状态的团队名册
5. 输入 `/inbox` 手动检查领导的收件箱
190
docs/zh/s10-team-protocols.md
Normal file
@@ -0,0 +1,190 @@
# s10: Team Protocols (团队协议)

> 同一个 request_id 握手模式驱动了关机和计划审批两种协议 -- 一个 FSM, 两种应用。

## 问题

在 s09 中, 队友可以工作和通信, 但没有结构化的协调。出现了两个问题:

**关机**: 如何干净地停止一个队友? 直接杀线程会留下写了一半的文件和错误状态的 config.json。优雅关机需要握手: 领导发起请求, 队友决定是批准 (完成并退出) 还是拒绝 (继续工作)。

**计划审批**: 如何控制执行门槛? 当领导说 "重构认证模块", 队友会立即开始。对于高风险变更, 领导应该在执行开始前审查计划。初级提出方案, 高级批准。

两个问题共享相同的结构: 一方发送带唯一 ID 的请求, 另一方引用该 ID 作出响应。一个有限状态机 (FSM) 跟踪每个请求经历 pending -> approved | rejected 的状态变迁。

## 解决方案

```
Shutdown Protocol                  Plan Approval Protocol
==================                 ======================

Lead              Teammate         Teammate           Lead
 |                   |                |                 |
 |--shutdown_req---->|                |--plan_req------>|
 |  {req_id:"abc"}   |                |  {req_id:"xyz"} |
 |                   |                |                 |
 |<--shutdown_resp---|                |<--plan_resp-----|
 |  {req_id:"abc",   |                |  {req_id:"xyz", |
 |   approve:true}   |                |   approve:true} |
 |                   |                |                 |
 v                   v                v                 v
tracker["abc"]     exits           proceeds       tracker["xyz"]
 = approved                                        = approved

Shared FSM (identical for both protocols):
  [pending] --approve--> [approved]
  [pending] --reject---> [rejected]

Trackers:
  shutdown_requests = {req_id: {target, status}}
  plan_requests     = {req_id: {from, plan, status}}
```

## 工作原理

1. 领导通过生成 request_id 并通过收件箱发送 shutdown_request 来发起关机。

```python
shutdown_requests = {}

def handle_shutdown_request(teammate: str) -> str:
    req_id = str(uuid.uuid4())[:8]
    shutdown_requests[req_id] = {
        "target": teammate, "status": "pending",
    }
    BUS.send("lead", teammate, "Please shut down gracefully.",
             "shutdown_request", {"request_id": req_id})
    return f"Shutdown request {req_id} sent (status: pending)"
```

2. 队友在收件箱中收到请求, 调用 `shutdown_response` 工具来批准或拒绝。

```python
if tool_name == "shutdown_response":
    req_id = args["request_id"]
    approve = args["approve"]
    if req_id in shutdown_requests:
        shutdown_requests[req_id]["status"] = \
            "approved" if approve else "rejected"
    BUS.send(sender, "lead", args.get("reason", ""),
             "shutdown_response",
             {"request_id": req_id, "approve": approve})
    return f"Shutdown {'approved' if approve else 'rejected'}"
```

3. 队友的循环检查是否批准了关机并退出。

```python
if (block.name == "shutdown_response"
        and block.input.get("approve")):
    should_exit = True
# ...
member["status"] = "shutdown" if should_exit else "idle"
```

4. 计划审批遵循完全相同的模式。队友提交计划时生成一个 request_id。

```python
plan_requests = {}

if tool_name == "plan_approval":
    plan_text = args.get("plan", "")
    req_id = str(uuid.uuid4())[:8]
    plan_requests[req_id] = {
        "from": sender, "plan": plan_text,
        "status": "pending",
    }
    BUS.send(sender, "lead", plan_text,
             "plan_approval_request",
             {"request_id": req_id, "plan": plan_text})
    return f"Plan submitted (request_id={req_id})"
```

5. 领导审查后使用同一个 request_id 作出响应。

```python
def handle_plan_review(request_id, approve, feedback=""):
    req = plan_requests.get(request_id)
    if not req:
        return f"Error: Unknown request_id '{request_id}'"
    req["status"] = "approved" if approve else "rejected"
    BUS.send("lead", req["from"], feedback,
             "plan_approval_response",
             {"request_id": request_id,
              "approve": approve,
              "feedback": feedback})
    return f"Plan {req['status']} for '{req['from']}'"
```

6. 两个协议使用同一个 `plan_approval` 工具名, 有两种模式: 队友提交 (无 request_id), 领导审查 (带 request_id)。

```python
# Lead tool dispatch:
"plan_approval": lambda **kw: handle_plan_review(
    kw["request_id"], kw["approve"],
    kw.get("feedback", "")),
# Teammate: submit mode (generate request_id)
```

## 核心代码

双协议处理器 (来自 `agents/s10_team_protocols.py`):

```python
shutdown_requests = {}
plan_requests = {}

# -- Shutdown --
def handle_shutdown_request(teammate):
    req_id = str(uuid.uuid4())[:8]
    shutdown_requests[req_id] = {
        "target": teammate, "status": "pending"
    }
    BUS.send("lead", teammate,
             "Please shut down gracefully.",
             "shutdown_request",
             {"request_id": req_id})

# -- Plan Approval --
def handle_plan_review(request_id, approve, feedback=""):
    req = plan_requests[request_id]
    req["status"] = "approved" if approve else "rejected"
    BUS.send("lead", req["from"], feedback,
             "plan_approval_response",
             {"request_id": request_id,
              "approve": approve})

# Both use the same FSM:
#   pending -> approved | rejected
# Both correlate by request_id across async inboxes
```
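两种协议共享的 FSM 可以抽成十几行通用代码来体会 "一个模式, 两种应用" (教学草图; `submit` / `respond` 为示意命名, 源码中这部分逻辑分散在两组处理器里):

```python
import uuid

# 教学用最小握手 FSM: pending -> approved | rejected
requests = {}


def submit(kind: str, payload: str) -> str:
    """一方发起请求: 生成唯一 request_id, 初始状态 pending。"""
    req_id = str(uuid.uuid4())[:8]
    requests[req_id] = {"kind": kind, "payload": payload,
                        "status": "pending"}
    return req_id


def respond(req_id: str, approve: bool) -> str:
    """另一方引用同一个 request_id 作出响应, 驱动状态变迁。"""
    req = requests.get(req_id)
    if req is None:
        return f"Error: Unknown request_id '{req_id}'"
    if req["status"] != "pending":
        return f"Error: already {req['status']}"  # 拒绝重复响应
    req["status"] = "approved" if approve else "rejected"
    return req["status"]


# 同一个 FSM 驱动两种协议: 它不关心自己在审批什么
shutdown_id = submit("shutdown", "alice")
plan_id = submit("plan", "refactor auth module")
print(respond(shutdown_id, True))   # → approved
print(respond(plan_id, False))      # → rejected
print(respond(plan_id, True))       # 重复响应返回错误
```

`kind` 字段只是标签: 新增一种协议 (比如权限申请) 只需要换一个 `kind`, FSM 本身完全不变。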

## 相对 s09 的变更

| 组件 | 之前 (s09) | 之后 (s10) |
|----------------|------------------|--------------------------------------|
| Tools | 9 | 12 (+shutdown_req/resp +plan) |
| 关机 | 仅自然退出 | 请求-响应握手 |
| 计划门控 | 无 | 提交/审查与审批 |
| 请求追踪 | 无 | 两个 tracker 字典 |
| 关联 | 无 | 每个请求一个 request_id |
| FSM | 无 | pending -> approved/rejected |

## 设计原理

request_id 关联模式将任何异步交互转化为可追踪的有限状态机。同一个三状态机 (pending -> approved/rejected) 适用于关机、计划审批或任何未来的协议。这就是为什么一个模式能处理多种协议 -- FSM 不关心它在审批什么。request_id 在异步收件箱中提供关联, 消息可能乱序到达, 使该模式对智能体间的时序差异具有鲁棒性。

## 试一试

```sh
cd learn-claude-code
python agents/s10_team_protocols.py
```

可以尝试的提示:

1. `Spawn alice as a coder. Then request her shutdown.`
2. `List teammates to see alice's status after shutdown approval`
3. `Spawn bob with a risky refactoring task. Review and reject his plan.`
4. `Spawn charlie, have him submit a plan, then approve it.`
5. 输入 `/team` 监控状态
215
docs/zh/s11-autonomous-agents.md
Normal file
@@ -0,0 +1,215 @@
# s11: Autonomous Agents (自治智能体)

> 带任务看板轮询的空闲循环让队友能自己发现和认领工作, 上下文压缩后通过身份重注入保持角色认知。

## 问题

在 s09-s10 中, 队友只在被明确指示时才工作。领导必须用特定的 prompt 生成每个队友。如果任务看板上有 10 个未认领的任务, 领导必须手动分配每一个。这无法扩展。

真正的自治意味着队友自己寻找工作。当一个队友完成当前任务后, 它应该扫描任务看板寻找未认领的工作, 认领一个任务, 然后开始工作 -- 不需要领导的任何指令。

但自治智能体面临一个微妙问题: 上下文压缩后, 智能体可能忘记自己是谁。如果消息被摘要化, 原始系统提示中的身份 ("你是 alice, 角色: coder") 就会淡化。身份重注入通过在压缩后的上下文开头插入身份块来解决这个问题。

教学简化说明: 这里的 token 估算比较粗糙 (字符数 / 4)。生产系统使用专业的 tokenizer 库。s03 中的 nag 阈值 3 轮是为教学可见性设的低值; 生产环境的智能体通常使用约 10 轮的阈值。
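这里提到的粗糙估算 (字符数 / 4) 写成代码只有几行 (教学草图, 函数名为示意, 并非精确 tokenizer):

```python
def estimate_tokens(messages: list) -> int:
    """粗略估算消息列表的 token 数: 总字符数 // 4。"""
    total_chars = sum(len(str(m.get("content", ""))) for m in messages)
    return total_chars // 4


print(estimate_tokens([{"role": "user", "content": "a" * 400}]))  # → 100
```

这个近似对英文代码类文本尚可, 对中文等语言会明显偏差, 所以生产系统才需要真正的 tokenizer。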

## 解决方案

```
Teammate lifecycle with idle cycle:

      +-------+
      | spawn |
      +---+---+
          |
          v
      +-------+    tool_use     +-------+
      | WORK  | <-------------- |  LLM  |
      +---+---+                 +-------+
          |
          |  stop_reason != tool_use
          |  (or idle tool called)
          v
      +--------+
      |  IDLE  |  poll every 5s for up to 60s
      +---+----+
          |
          +---> check inbox --> message? ----------> WORK
          |
          +---> scan .tasks/ --> unclaimed? -------> claim -> WORK
          |
          +---> 60s timeout ----------------------> SHUTDOWN

Identity re-injection after compression:
  if len(messages) <= 3:
      messages.insert(0, identity_block)
      "You are 'alice', role: coder, team: my-team"
```

## 工作原理

1. 队友循环有两个阶段: WORK 和 IDLE。WORK 阶段运行标准的 agent loop。当 LLM 停止调用工具 (或调用了 `idle` 工具) 时, 队友进入 IDLE 阶段。

```python
def _loop(self, name, role, prompt):
    while True:
        # -- WORK PHASE --
        messages = [{"role": "user", "content": prompt}]
        for _ in range(50):
            inbox = BUS.read_inbox(name)
            for msg in inbox:
                if msg.get("type") == "shutdown_request":
                    self._set_status(name, "shutdown")
                    return
            messages.append(...)
            response = client.messages.create(...)
            if response.stop_reason != "tool_use":
                break
            # execute tools...
            if idle_requested:
                break

        # -- IDLE PHASE --
        self._set_status(name, "idle")
        resume = self._idle_poll(name, messages)
        if not resume:
            self._set_status(name, "shutdown")
            return
        self._set_status(name, "working")
```

2. 空闲阶段循环轮询收件箱和任务看板。

```python
def _idle_poll(self, name, messages):
    polls = IDLE_TIMEOUT // POLL_INTERVAL  # 60s / 5s = 12
    for _ in range(polls):
        time.sleep(POLL_INTERVAL)
        # Check inbox for new messages
        inbox = BUS.read_inbox(name)
        if inbox:
            messages.append({"role": "user",
                             "content": f"<inbox>{inbox}</inbox>"})
            return True
        # Scan task board for unclaimed tasks
        unclaimed = scan_unclaimed_tasks()
        if unclaimed:
            task = unclaimed[0]
            claim_task(task["id"], name)
            messages.append({"role": "user",
                             "content": f"<auto-claimed>Task #{task['id']}: "
                                        f"{task['subject']}</auto-claimed>"})
            return True
    return False  # timeout -> shutdown
```

3. 任务看板扫描查找 pending 状态、无 owner、未被阻塞的任务。

```python
def scan_unclaimed_tasks() -> list:
    TASKS_DIR.mkdir(exist_ok=True)
    unclaimed = []
    for f in sorted(TASKS_DIR.glob("task_*.json")):
        task = json.loads(f.read_text())
        if (task.get("status") == "pending"
                and not task.get("owner")
                and not task.get("blockedBy")):
            unclaimed.append(task)
    return unclaimed

def claim_task(task_id: int, owner: str):
    path = TASKS_DIR / f"task_{task_id}.json"
    task = json.loads(path.read_text())
    task["status"] = "in_progress"
    task["owner"] = owner
    path.write_text(json.dumps(task, indent=2))
```
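scan_unclaimed_tasks 和 claim_task 不依赖线程, 可以直接在临时目录里验证认领逻辑 (演示脚手架为示意; 这里把 TASKS_DIR 指向临时目录, 并手写两个任务文件):

```python
import json
import tempfile
from pathlib import Path

TASKS_DIR = Path(tempfile.mkdtemp())  # 演示用临时任务目录


def scan_unclaimed_tasks() -> list:
    """pending、无 owner、未被阻塞的任务才算可认领。"""
    unclaimed = []
    for f in sorted(TASKS_DIR.glob("task_*.json")):
        task = json.loads(f.read_text())
        if (task.get("status") == "pending"
                and not task.get("owner")
                and not task.get("blockedBy")):
            unclaimed.append(task)
    return unclaimed


def claim_task(task_id: int, owner: str):
    path = TASKS_DIR / f"task_{task_id}.json"
    task = json.loads(path.read_text())
    task["status"] = "in_progress"
    task["owner"] = owner
    path.write_text(json.dumps(task))


# 两个任务: #1 可认领, #2 被 #1 阻塞
(TASKS_DIR / "task_1.json").write_text(json.dumps(
    {"id": 1, "status": "pending", "owner": "", "blockedBy": []}))
(TASKS_DIR / "task_2.json").write_text(json.dumps(
    {"id": 2, "status": "pending", "owner": "", "blockedBy": [1]}))

ready = scan_unclaimed_tasks()
print([t["id"] for t in ready])  # → [1]
claim_task(ready[0]["id"], "alice")
print(scan_unclaimed_tasks())    # → []  (#1 已认领, #2 仍被阻塞)
```

注意 #2 要等 #1 completed 且依赖被清理 (s07 的 `_clear_dependency`) 之后才会出现在扫描结果中; 仅仅认领 #1 并不会解锁它。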

4. 身份重注入: 当上下文过短时插入身份块, 表明发生了压缩。

```python
def make_identity_block(name, role, team_name):
    return {"role": "user",
            "content": f"<identity>You are '{name}', "
                       f"role: {role}, team: {team_name}. "
                       f"Continue your work.</identity>"}

# Before resuming work after idle:
if len(messages) <= 3:
    messages.insert(0, make_identity_block(
        name, role, team_name))
    messages.insert(1, {"role": "assistant",
                        "content": f"I am {name}. Continuing."})
```

5. `idle` 工具让队友显式地表示没有更多工作, 提前进入空闲轮询阶段。

```python
{"name": "idle",
 "description": "Signal that you have no more work. "
                "Enters idle polling phase.",
 "input_schema": {"type": "object", "properties": {}}},
```

## 核心代码

自治循环 (来自 `agents/s11_autonomous_agents.py`):

```python
def _loop(self, name, role, prompt):
    while True:
        # WORK PHASE
        for _ in range(50):
            response = client.messages.create(...)
            if response.stop_reason != "tool_use":
                break
            for block in response.content:
                if block.name == "idle":
                    idle_requested = True
            if idle_requested:
                break

        # IDLE PHASE
        self._set_status(name, "idle")
        for _ in range(IDLE_TIMEOUT // POLL_INTERVAL):
            time.sleep(POLL_INTERVAL)
            inbox = BUS.read_inbox(name)
            if inbox:
                resume = True
                break
            unclaimed = scan_unclaimed_tasks()
            if unclaimed:
                claim_task(unclaimed[0]["id"], name)
                resume = True
                break
        if not resume:
            self._set_status(name, "shutdown")
            return
        self._set_status(name, "working")
```

## 相对 s10 的变更

| 组件 | 之前 (s10) | 之后 (s11) |
|----------------|------------------|----------------------------------|
| Tools | 12 | 14 (+idle, +claim_task) |
| 自治性 | 领导指派 | 自组织 |
| 空闲阶段 | 无 | 轮询收件箱 + 任务看板 |
| 任务认领 | 仅手动 | 自动认领未认领任务 |
| 身份 | 系统提示 | + 压缩后重注入 |
| 超时 | 无 | 60 秒空闲 -> 自动关机 |

## 设计原理

轮询 + 超时使智能体无需中央协调器即可自组织。每个智能体独立轮询任务看板, 认领未认领的工作, 完成后回到空闲状态。超时触发轮询循环, 如果在窗口期内没有工作出现, 智能体自行关机。这与工作窃取线程池的模式相同 -- 分布式, 无单点故障。压缩后的身份重注入确保智能体即使在对话历史被摘要后仍能保持其角色。

## 试一试

```sh
cd learn-claude-code
python agents/s11_autonomous_agents.py
```

可以尝试的提示:

1. `Create 3 tasks on the board, then spawn alice and bob. Watch them auto-claim.`
2. `Spawn a coder teammate and let it find work from the task board itself`
3. `Create tasks with dependencies. Watch teammates respect the blocked order.`
4. 输入 `/tasks` 查看带 owner 的任务看板
5. 输入 `/team` 监控谁在工作、谁在空闲