# s10: System Prompt — Assembled at Runtime, Never Hardcoded

[中文](README.md) · [English](README.en.md) · [日本語](README.ja.md)

s01 → ... → s08 → s09 → `s10` → [s11](../s11_error_recovery/) → s12 → ... → s20

> *"prompt is assembled, not hardcoded"* (sections + on-demand assembly + caching)
>
> **Harness Layer**: Prompt, assembled at runtime, never hardcoded.

---

## The Problem

From s01 to s09, the system prompt was a single hardcoded line:

```python
SYSTEM = f"You are a coding agent at {WORKDIR}. Use tools to solve tasks."
```

That worked for s01, which only had bash, read, and write. By s09 the agent also has memory, compression, and skill loading, so the prompt has to describe more and more capabilities:

```python
SYSTEM = (
    f"You are a coding agent at {WORKDIR}. "
    "Use tools to solve tasks. Act, don't explain. "
    "Before starting any multi-step task, use todo_write. "
    "Skills are available via list_skills and load_skill. "
    "Relevant memories are injected below when available. "
    # ... add a capability, add a line
)
```

Three problems:

1. **Switching projects requires rewriting the entire prompt**: nothing marks which parts are project-specific and which can stay
2. **One change can break others**: a newly added tool description might conflict with earlier instructions
3. **Every request carries everything**: sections the current conversation doesn't need still cost tokens

The system prompt should be a configuration assembled at runtime from the current state: which tools are enabled, which context is visible, which memories are relevant, and which content must remain stable to hit the prompt cache.

---

## The Solution

s10 focuses on prompt assembly. It builds on the s08-s09 capabilities but doesn't re-implement compression or memory. The core change: split the hardcoded `SYSTEM` into independent sections, assemble them at runtime based on real state, and cache the result.

Four sections, two loading strategies:

| Section | Strategy | Content | Condition |
|---------|----------|---------|-----------|
| identity | always | who you are, how to work | always present |
| tools | always | available tool list | `enabled_tools` |
| workspace | always | working directory | always present |
| memory | on-demand | relevant memory content | `.memory/MEMORY.md` exists and is non-empty |

Key design: whether a section loads depends on real state (tools exist, files exist), not on keywords in messages.

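The table can also be read as data. The sketch below is one hypothetical way to encode it, with each row carrying a render function and a load condition that both look only at the context. `SectionSpec`, `SECTION_SPECS`, `always`, and `loaded_sections` are illustrative names and are not taken from `code.py`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SectionSpec:
    """Hypothetical registry entry: one row of the section table."""
    name: str
    render: Callable[[dict], str]      # builds the section text from context
    condition: Callable[[dict], bool]  # decides whether the section loads


def always(context: dict) -> bool:
    return True


SECTION_SPECS = [
    SectionSpec("identity", lambda c: "You are a coding agent. Act, don't explain.", always),
    SectionSpec("tools", lambda c: "Available tools: " + ", ".join(c["enabled_tools"]) + ".", always),
    SectionSpec("workspace", lambda c: f"Working directory: {c['workspace']}", always),
    SectionSpec("memory", lambda c: f"Relevant memories:\n{c['memories']}",
                lambda c: bool(c.get("memories"))),
]


def loaded_sections(context: dict) -> list[str]:
    """Return the names of the sections that would load for this context."""
    return [spec.name for spec in SECTION_SPECS if spec.condition(context)]


context = {"enabled_tools": ["bash", "read_file", "write_file"],
           "workspace": "/tmp/project", "memories": ""}
print(loaded_sections(context))   # memory is skipped: no memory content
context["memories"] = "- [test](test.md)"
print(loaded_sections(context))   # memory now loads
```
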

---

## How It Works

### PROMPT_SECTIONS: Topic-Keyed Fragments

Split the monolithic string into a dictionary in which each key is a topic:

```python
PROMPT_SECTIONS = {
    "identity": "You are a coding agent. Act, don't explain.",
    "tools": "Available tools: bash, read_file, write_file.",
    "workspace": f"Working directory: {WORKDIR}",
    "memory": "Relevant memories are injected below when available.",
}
```

Each section is maintained independently. Changing `tools` doesn't affect `identity`; adding `memory` doesn't touch `workspace`.

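A small illustration of that independence: switching projects can become a one-key override instead of a rewrite. This is a sketch under assumptions, not code from `code.py`; `sections_for` is a hypothetical helper and the `WORKDIR` value is a placeholder.

```python
from pathlib import PurePosixPath

WORKDIR = PurePosixPath("/tmp/project")  # placeholder path for the example

PROMPT_SECTIONS = {
    "identity": "You are a coding agent. Act, don't explain.",
    "tools": "Available tools: bash, read_file, write_file.",
    "workspace": f"Working directory: {WORKDIR}",
    "memory": "Relevant memories are injected below when available.",
}


def sections_for(overrides: dict) -> dict:
    """Return a copy of PROMPT_SECTIONS with some topics replaced (hypothetical helper)."""
    unknown = set(overrides) - set(PROMPT_SECTIONS)
    if unknown:
        # Fail loudly on typos instead of silently adding a new topic.
        raise KeyError(f"unknown sections: {sorted(unknown)}")
    return {**PROMPT_SECTIONS, **overrides}


docs_sections = sections_for({"identity": "You are a documentation agent. Prefer short edits."})
print(docs_sections["identity"])                            # overridden
print(docs_sections["tools"] == PROMPT_SECTIONS["tools"])   # True: untouched
```

Because the helper returns a new dictionary, the default sections stay available for other projects in the same process.
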
### assemble_system_prompt: On-Demand Assembly

Not every section is needed on every turn. If there are no memory files, loading the memory section only wastes tokens. Assembly is driven by the real state carried in `context`:

```python
def assemble_system_prompt(context: dict) -> str:
    sections = []

    # Always loaded
    sections.append(PROMPT_SECTIONS["identity"])
    sections.append(PROMPT_SECTIONS["tools"])
    sections.append(PROMPT_SECTIONS["workspace"])

    # On-demand: based on real state, not keywords
    memories = context.get("memories", "")
    if memories:
        sections.append(f"Relevant memories:\n{memories}")

    return "\n\n".join(sections)
```

"Always loaded" sections (identity, tools, workspace) are needed on every turn. "On-demand" sections are only useful under specific conditions.

Why not load everything? Tokens cost money (the system prompt is sent and billed on every request), and fewer instructions tend to produce more focused output, since irrelevant instructions are noise.

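To see the on-demand branch in action, the function can be called with two contexts. The snippet below repeats the definitions so it runs on its own; the workspace path is a placeholder and the unused `memory` entry is omitted.

```python
PROMPT_SECTIONS = {
    "identity": "You are a coding agent. Act, don't explain.",
    "tools": "Available tools: bash, read_file, write_file.",
    "workspace": "Working directory: /tmp/project",  # placeholder path
}


def assemble_system_prompt(context: dict) -> str:
    sections = [
        PROMPT_SECTIONS["identity"],
        PROMPT_SECTIONS["tools"],
        PROMPT_SECTIONS["workspace"],
    ]
    memories = context.get("memories", "")
    if memories:  # empty string means "no memory section"
        sections.append(f"Relevant memories:\n{memories}")
    return "\n\n".join(sections)


without_memory = assemble_system_prompt({"memories": ""})
with_memory = assemble_system_prompt({"memories": "- [test](test.md)"})

print(len(without_memory.split("\n\n")))        # 3
print(len(with_memory.split("\n\n")))           # 4
print(with_memory.startswith(without_memory))   # True
```

The last line shows that the on-demand section is appended after the always-loaded ones, so the always-loaded prefix is identical in both prompts.
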
### get_system_prompt: Cache to Avoid Re-Assembly

When the context hasn't changed (for example, several LLM calls in the same turn), re-assembling is wasted work. Serialize the context deterministically to detect changes, and return the cached result when the key matches:

```python
import json

# Module-level cache state (initial values assumed here).
_last_context_key = None
_last_prompt = None

def get_system_prompt(context: dict) -> str:
    global _last_context_key, _last_prompt
    # sort_keys makes the key independent of dict insertion order.
    key = json.dumps(context, sort_keys=True, ensure_ascii=False, default=str)
    if key == _last_context_key and _last_prompt:
        return _last_prompt
    _last_context_key = key
    _last_prompt = assemble_system_prompt(context)
    return _last_prompt
```

Why `json.dumps` instead of `hash()`: Python's built-in `hash()` is randomized per process for strings, which makes its values unsuitable as stable cache keys, and it raises `TypeError: unhashable type` for dicts and lists.

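A short check of both claims, as a sketch: `context_key` is an illustrative wrapper around the same `json.dumps` call. One caveat worth knowing is that `sort_keys` only sorts dictionary keys, so two lists with the same tools in a different order still produce different keys.

```python
import json
from pathlib import PurePosixPath


def context_key(context: dict) -> str:
    """Illustrative wrapper around the json.dumps call used for the cache key."""
    return json.dumps(context, sort_keys=True, ensure_ascii=False, default=str)


# Same content, different insertion order: the keys match.
context_a = {"workspace": "/tmp/project", "enabled_tools": ["bash", "read_file"], "memories": ""}
context_b = {"memories": "", "enabled_tools": ["bash", "read_file"], "workspace": "/tmp/project"}
print(context_key(context_a) == context_key(context_b))   # True

# hash() cannot handle a dict at all.
try:
    hash(context_a)
except TypeError as exc:
    print(f"hash() failed: {exc}")

# default=str converts values json cannot serialize, such as a path object.
print(context_key({"workspace": PurePosixPath("/tmp/project")}))

# List order is part of the key: sort_keys does not reorder list elements.
print(context_key({"t": ["a", "b"]}) == context_key({"t": ["b", "a"]}))   # False
```
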

Note: this cache only avoids redundant string assembly within a process. It is not the same as CC's API prompt cache, which uses `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` to separate static and dynamic parts: the static parts hit the global cache and are not invalidated when dynamic content changes.

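The idea of a static/dynamic boundary can be sketched in a few lines. This is not CC's implementation: `BOUNDARY`, `assemble_with_boundary`, `split_at_boundary`, and `to_system_blocks` are hypothetical names, and the block shape assumes the `cache_control` field of type `ephemeral` that the Anthropic Messages API documents for prompt caching. No API call is made.

```python
BOUNDARY = "__DYNAMIC_BOUNDARY__"  # hypothetical marker, not CC's constant value


def assemble_with_boundary(context: dict) -> list[str]:
    """Static sections first, then the marker, then state-dependent sections."""
    parts = [
        "You are a coding agent. Act, don't explain.",
        "Available tools: " + ", ".join(context["enabled_tools"]) + ".",
        BOUNDARY,
        f"Working directory: {context['workspace']}",
    ]
    if context.get("memories"):
        parts.append(f"Relevant memories:\n{context['memories']}")
    return parts


def split_at_boundary(parts: list[str]) -> tuple[str, str]:
    index = parts.index(BOUNDARY)
    return "\n\n".join(parts[:index]), "\n\n".join(parts[index + 1:])


def to_system_blocks(parts: list[str]) -> list[dict]:
    """Build system blocks; only the static prefix is marked cacheable."""
    static, dynamic = split_at_boundary(parts)
    blocks = [{"type": "text", "text": static, "cache_control": {"type": "ephemeral"}}]
    if dynamic:
        blocks.append({"type": "text", "text": dynamic})
    return blocks


base = {"enabled_tools": ["bash", "read_file", "write_file"], "workspace": "/tmp/project"}
no_memory = assemble_with_boundary({**base, "memories": ""})
has_memory = assemble_with_boundary({**base, "memories": "- [test](test.md)"})

# The static prefix is byte-identical even though the memory content changed.
print(split_at_boundary(no_memory)[0] == split_at_boundary(has_memory)[0])   # True
print(len(to_system_blocks(has_memory)))                                     # 2
```

The design point is ordering: anything that changes between turns goes after the marker, so edits to memory never alter the text that precedes it.
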
### context: Real State, Not Keyword Guessing

`context` reflects the actual runtime state:

```python
def update_context(context: dict, messages: list) -> dict:
    # Rebuilt from real state on every call; neither argument is read in this snippet.
    memories = ""
    if MEMORY_INDEX.exists():
        content = MEMORY_INDEX.read_text().strip()
        if content:
            memories = content
    return {
        "enabled_tools": list(TOOL_HANDLERS.keys()),
        "workspace": str(WORKDIR),
        "memories": memories,
    }
```

`enabled_tools` lists the tools that are actually registered. `memories` holds the contents of `.memory/MEMORY.md` when that file exists and is non-empty, and is an empty string otherwise. Section loading is based on this real state, not on searching messages for keywords.

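The same logic can be exercised against a temporary directory to watch the context change when the memory index appears. This is a self-contained sketch: `read_memories` and `build_context` are illustrative stand-ins for `update_context`, and the handler table holds placeholder values.

```python
import tempfile
from pathlib import Path


def read_memories(memory_index: Path) -> str:
    """Return the stripped file contents, or an empty string if the file is missing."""
    if memory_index.exists():
        return memory_index.read_text(encoding="utf-8").strip()
    return ""


def build_context(workdir: Path, tool_handlers: dict) -> dict:
    return {
        "enabled_tools": list(tool_handlers.keys()),
        "workspace": str(workdir),
        "memories": read_memories(workdir / ".memory" / "MEMORY.md"),
    }


handlers = {"bash": None, "read_file": None, "write_file": None}  # placeholder handlers

with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    before = build_context(workdir, handlers)

    # Simulate the agent writing a memory index during the session.
    (workdir / ".memory").mkdir()
    (workdir / ".memory" / "MEMORY.md").write_text("- [test](test.md)\n", encoding="utf-8")
    after = build_context(workdir, handlers)

print(repr(before["memories"]))   # ''
print(repr(after["memories"]))    # '- [test](test.md)'
```

Only the `memories` value differs between the two snapshots, which is exactly the change that invalidates the cache key and triggers re-assembly.
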
### Putting It Together

```python
def agent_loop(messages: list, context: dict):
    system = get_system_prompt(context)
    while True:
        response = client.messages.create(
            model=MODEL, system=system, messages=messages,
            tools=TOOLS, max_tokens=8000)
        # ... tool execution ...
        context = update_context(context, messages)
        system = get_system_prompt(context)
```

The prompt is fetched once before the loop and refreshed at the end of every iteration, so each API call sees the current state. If the context changed, the prompt is re-assembled; if not, the cached version is returned.

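The caching behavior of the loop can be simulated without an API client. In the sketch below, a list of contexts stands in for successive loop iterations, and a counter records how often assembly actually runs; `assemble`, `assembly_count`, and the reduced section set are simplifications for the example.

```python
import json

assembly_count = 0   # how many times the prompt was really assembled
_last_key = None
_last_prompt = None


def assemble(context: dict) -> str:
    global assembly_count
    assembly_count += 1
    parts = ["You are a coding agent.", f"Working directory: {context['workspace']}"]
    if context["memories"]:
        parts.append(f"Relevant memories:\n{context['memories']}")
    return "\n\n".join(parts)


def get_system_prompt(context: dict) -> str:
    global _last_key, _last_prompt
    key = json.dumps(context, sort_keys=True, ensure_ascii=False, default=str)
    if key != _last_key:
        _last_key, _last_prompt = key, assemble(context)
    return _last_prompt


# One context per simulated loop iteration; memory appears at iteration 3.
contexts = [
    {"workspace": "/tmp/project", "memories": ""},
    {"workspace": "/tmp/project", "memories": ""},
    {"workspace": "/tmp/project", "memories": "- [test](test.md)"},
    {"workspace": "/tmp/project", "memories": "- [test](test.md)"},
]
prompts = [get_system_prompt(context) for context in contexts]

print(assembly_count)                        # 2: four calls, two assemblies
print("Relevant memories" in prompts[0])     # False
print("Relevant memories" in prompts[2])     # True
```
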

---

## Changes From s09

| Component | Before (s09) | After (s10) |
|-----------|-------------|-------------|
| prompt | Hardcoded SYSTEM string | PROMPT_SECTIONS + assemble_system_prompt |
| caching | None | get_system_prompt (json.dumps detection + cache) |
| new functions | None | assemble_system_prompt, get_system_prompt, update_context |
| tools | bash, read_file, write_file (3) | bash, read_file, write_file (3), unchanged |
| loop | Uses fixed SYSTEM | Uses get_system_prompt(context) |

---

## Try It

```sh
cd learn-claude-code
python s10_system_prompt/code.py
```

What to watch for:

1. Output shows which sections were loaded (the `[assembled] sections: ...` label)
2. Cache hits print `[cache hit]` as the conversation continues
3. Creating `.memory/MEMORY.md` makes the memory section appear on the next turn

Try these prompts:

1. `Read the file README.md` (observe the three always-loaded sections)
2. `Create a file called .memory/MEMORY.md with content "- [test](test.md) — test memory"` (write a memory index)
3. `Read the file code.py` (observe whether the memory section appears)

---

## What's Next

System prompts can now be assembled at runtime. But the agent still crashes on errors. Network hiccups, API rate limits, truncated output, and context overflow aren't bugs; they are normal operating conditions.

s11 Error Recovery → four recovery paths: upgrade tokens, compress context, exponential backoff, switch models.

<details>
<summary>Deep Dive Into CC Source Code</summary>

> The following is based on analysis of CC source code `constants/prompts.ts` (914 lines), `constants/systemPromptSections.ts` (68 lines), `context.ts` (189 lines), `utils/api.ts` (718 lines), `utils/systemPrompt.ts` (123 lines), and `bootstrap/state.ts`.

### How many sections does CC's system prompt have?

The count varies with feature flags, output style, KAIROS/Proactive mode, user type, token budget, and so on. The sections fall roughly into two categories:

**Static sections** (always loaded): identity, system, doing_tasks, actions, using_tools, tone_style, output_efficiency, etc.

**Dynamic sections** (loaded by state): session_guidance, memory, ant_model_override, env_info_simple, language, output_style, mcp_instructions, scratchpad, frc, summarize_tool_results, numeric_length_anchors, token_budget, brief, etc.

`mcp_instructions` is the only volatile section (created via `DANGEROUS_uncachedSystemPromptSection()`), because MCP servers can connect and disconnect between turns.

### Assembly Function

```typescript
getSystemPrompt(tools, model, additionalWorkingDirs?, mcpClients?): Promise<string[]>
```

Resolves to a `string[]` in which each element is a section, with `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` separating the static and dynamic parts.

### Cache scope

When the global cache boundary is enabled, static sections are merged into one global cache block, and dynamic sections don't use the global cache (`cacheScope: null`). Only paths that have no boundary, or that skip the global cache, fall back to org scope.

The teaching version's cache only avoids redundant string assembly. CC has three cache layers:

1. **lodash memoize**: `getSystemContext` and `getUserContext` are cached per session (`context.ts`)
2. **Section registry cache**: `STATE.systemPromptSectionCache` caches dynamic section results and is cleared on `/clear` or `/compact`
3. **API-level cache**: `splitSysPromptPrefix()` (`api.ts`) uses the boundary to split the prompt into blocks with different cache scopes

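The second layer, together with the volatile `mcp_instructions` section, can be approximated in Python as follows. This is a hypothetical analogue for illustration only: `SectionCache`, `resolve`, and the placeholder section texts are assumptions, and CC's actual TypeScript code is not reproduced here.

```python
from typing import Callable


class SectionCache:
    """Hypothetical per-session cache for computed prompt sections."""

    def __init__(self) -> None:
        self._values: dict[str, str] = {}

    def resolve(self, name: str, compute: Callable[[], str], volatile: bool = False) -> str:
        if volatile:
            return compute()  # never cached: recomputed on every call
        if name not in self._values:
            self._values[name] = compute()
        return self._values[name]

    def clear(self) -> None:
        """Drop cached sections, as a /clear or /compact command would."""
        self._values.clear()


calls = {"memory": 0, "mcp_instructions": 0}


def compute_memory() -> str:
    calls["memory"] += 1
    return "Relevant memories: (placeholder)"


def compute_mcp() -> str:
    calls["mcp_instructions"] += 1
    return "MCP servers: (placeholder)"


cache = SectionCache()
for _ in range(3):  # three turns in one session
    cache.resolve("memory", compute_memory)
    cache.resolve("mcp_instructions", compute_mcp, volatile=True)
print(calls)            # {'memory': 1, 'mcp_instructions': 3}

cache.clear()           # session reset
cache.resolve("memory", compute_memory)
print(calls["memory"])  # 2
```

A cached section is computed once per session and again only after a reset, while the volatile one is recomputed on every turn, which matches the reason given above for MCP servers that may connect or disconnect between turns.
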
### getUserContext vs getSystemContext

| Aspect | getSystemContext | getUserContext |
|---|---|---|
| Content | gitStatus, cacheBreaker | CLAUDE.md content, currentDate |
| Injection | appended to system prompt array | prepended as `<system-reminder>` user message |
| When skipped | custom system prompt | always runs |

### How modes change the prompt

- **CLAUDE_CODE_SIMPLE**: the entire prompt is 2 lines
- **Proactive/KAIROS**: a compact prompt replaces all standard sections
- **Coordinator**: a coordinator-specific prompt fully replaces the default
- **Agent mode**: an agent-defined prompt replaces or is appended to the default

### Total size

In standard interactive mode, the core system prompt is roughly 20-30 KB of text. `CLAUDE_CODE_SIMPLE` is about 150 characters. User context (CLAUDE.md) and system context (git status) are added on top.

</details>

<!-- translation-sync: zh@v1, en@v1, ja@v1 -->