# s06: Context Compact
`s01 > s02 > s03 > s04 > s05 > [ s06 ] | s07 > s08 > s09 > s10 > s11 > s12`
> *"Context will fill up; you need a way to make room"* -- three-layer compression strategy for infinite sessions.
>
> **Harness layer**: Compression -- clean memory for infinite sessions.
## Problem
The context window is finite. A single `read_file` on a 1000-line file costs ~4000 tokens. After reading 30 files and running 20 bash commands, you hit 100,000+ tokens. The agent cannot work on large codebases without compression.
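Sizes like these can be estimated without calling a tokenizer. A minimal sketch of the `estimate_tokens` helper the loop uses later, built on the common ~4-characters-per-token heuristic (the heuristic is an approximation, not an exact count):

```python
import json

def estimate_tokens(messages: list) -> int:
    """Rough token estimate: ~4 characters per token of serialized history."""
    return len(json.dumps(messages, default=str)) // 4

# A ~4000-character tool result lands near the ~1000-token mark
msgs = [{"role": "user", "content": "x" * 4000}]
print(estimate_tokens(msgs))
```

This is cheap enough to run before every LLM call; an exact count would require the provider's tokenizer or token-counting endpoint.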
## Solution
Three layers, increasing in aggressiveness:
```
Every turn:
+------------------+
| Tool call result |
+------------------+
|
v
[Layer 1: micro_compact] (silent, every turn)
Replace tool_result > 3 turns old
with "[Previous: used {tool_name}]"
|
v
[Check: tokens > 50000?]
| |
no yes
| |
v v
continue [Layer 2: auto_compact]
Save transcript to .transcripts/
LLM summarizes conversation.
Replace all messages with [summary].
|
v
[Layer 3: compact tool]
Model calls compact explicitly.
Same summarization as auto_compact.
```
## How It Works
1. **Layer 1 -- micro_compact**: Before each LLM call, replace old tool results with placeholders.
```python
KEEP_RECENT = 3  # the most recent tool results are kept verbatim

def micro_compact(messages: list) -> list:
    # Collect every tool_result block along with its position in the history
    tool_results = []
    for i, msg in enumerate(messages):
        if msg["role"] == "user" and isinstance(msg.get("content"), list):
            for j, part in enumerate(msg["content"]):
                if isinstance(part, dict) and part.get("type") == "tool_result":
                    tool_results.append((i, j, part))
    if len(tool_results) <= KEEP_RECENT:
        return messages
    # Replace everything except the KEEP_RECENT newest results with a placeholder
    for _, _, part in tool_results[:-KEEP_RECENT]:
        content = part.get("content", "")
        if isinstance(content, str) and len(content) > 100:
            tool_name = part.get("tool_name", "a tool")
            part["content"] = f"[Previous: used {tool_name}]"
    return messages
```
2. **Layer 2 -- auto_compact**: When tokens exceed threshold, save full transcript to disk, then ask the LLM to summarize.
```python
import json
import time

# client, MODEL, and TRANSCRIPT_DIR are module-level constants of the agent.

def auto_compact(messages: list) -> list:
    # Save the full transcript to disk first, so nothing is lost
    TRANSCRIPT_DIR.mkdir(exist_ok=True)
    transcript_path = TRANSCRIPT_DIR / f"transcript_{int(time.time())}.jsonl"
    with open(transcript_path, "w") as f:
        for msg in messages:
            f.write(json.dumps(msg, default=str) + "\n")
    # Ask the LLM for a continuity summary of the whole conversation
    response = client.messages.create(
        model=MODEL,
        messages=[{"role": "user", "content":
            "Summarize this conversation for continuity..."
            + json.dumps(messages, default=str)[:80000]}],
        max_tokens=2000,
    )
    # The summary replaces the entire history as a two-message seed
    return [
        {"role": "user", "content": f"[Compressed]\n\n{response.content[0].text}"},
        {"role": "assistant", "content": "Understood. Continuing."},
    ]
```
3. **Layer 3 -- manual compact**: The `compact` tool triggers the same summarization on demand.
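Exposing compaction as a tool means the model itself can decide when its context is stale. A sketch of what the tool definition handed to the API could look like (the exact description text is an assumption; the Anthropic tool schema fields are `name`, `description`, `input_schema`):

```python
# Hypothetical schema for the `compact` tool -- it takes no arguments,
# since it operates on the conversation history itself.
COMPACT_TOOL = {
    "name": "compact",
    "description": (
        "Compress the conversation history into a summary. "
        "Call this when the context is long and old details "
        "are no longer needed verbatim."
    ),
    "input_schema": {"type": "object", "properties": {}, "required": []},
}
```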
4. The loop integrates all three:
```python
def agent_loop(messages: list):
    while True:
        micro_compact(messages)                   # Layer 1: mutates in place
        if estimate_tokens(messages) > THRESHOLD:
            messages[:] = auto_compact(messages)  # Layer 2: threshold trigger
        response = client.messages.create(...)
        # ... tool execution ...
        if manual_compact:                        # set when the model called `compact`
            messages[:] = auto_compact(messages)  # Layer 3: on demand
```
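The `messages[:] =` slice assignment above is deliberate: it replaces the list's contents in place, so every other reference to the same list object sees the compacted history. A minimal illustration of that aliasing behavior:

```python
def replace_in_place(messages: list, new: list) -> None:
    messages[:] = new  # mutate the existing list object, not rebind a name

history = [{"role": "user", "content": "hello"}]
alias = history  # e.g. a reference held elsewhere in the harness
replace_in_place(history, [{"role": "user", "content": "[Compressed]"}])
assert alias[0]["content"] == "[Compressed]"  # the alias sees the new contents
```

A plain `messages = auto_compact(messages)` inside a helper would only rebind the local name and leave the caller's history untouched.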
Transcripts preserve full history on disk. Nothing is truly lost -- just moved out of active context.
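Because every auto-compact writes a JSONL transcript before summarizing, a session can be reconstructed from disk. A minimal recovery sketch (the helper name is hypothetical; it relies on the timestamped filenames sorting chronologically):

```python
import json
from pathlib import Path

def load_latest_transcript(transcript_dir: str = ".transcripts") -> list:
    """Reload the most recent pre-compaction transcript as a message list."""
    paths = sorted(Path(transcript_dir).glob("transcript_*.jsonl"))
    if not paths:
        return []
    with open(paths[-1]) as f:
        return [json.loads(line) for line in f]
```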
## What Changed From s05
| Component | Before (s05) | After (s06) |
|----------------|------------------|----------------------------|
| Tools          | 5                | 6 (5 base + compact)       |
| Context mgmt | None | Three-layer compression |
| Micro-compact | None | Old results -> placeholders|
| Auto-compact | None | Token threshold trigger |
| Transcripts | None | Saved to .transcripts/ |
## Try It
```sh
cd learn-claude-code
python agents/s06_context_compact.py
```
1. `Read every Python file in the agents/ directory one by one` (watch micro-compact replace old results)
2. `Keep reading files until compression triggers automatically`
3. `Use the compact tool to manually compress the conversation`