Follow up PR #265: refine chapters, diagrams, and add S20 (#283)

* feat: s01-s14 docs quality overhaul - tool pipeline, single-agent, knowledge & resilience

  Rewrite code.py and README (zh/en/ja) for s01-s14, each chapter building incrementally on the previous. Key fixes across chapters:
  - s01-s04: agent loop, tool dispatch, permission pipeline, hooks
  - s05-s08: todo write, subagent, skill loading, context compact
  - s09-s11: memory system, system prompt assembly, error recovery
  - s12-s14: task graph, background tasks, cron scheduler

  All chapters CC source-verified. Code inherits fixes forward (PROMPT_SECTIONS, json.dumps cache, real-state context, can_start dep protection, etc.).

* feat: s15-s19 docs quality overhaul - multi-agent platform: teams, protocols, autonomy, worktree, MCP tools

  Rewrite code.py and README (zh/en/ja) for s15-s19, the multi-agent platform chapters. Each chapter inherits all previous fixes and adds one mechanism:
  - s15: agent teams (TeamCreate, teammate threads, shared task list)
  - s16: team protocols (plan approval, shutdown handshake, consume_inbox)
  - s17: autonomous agents (idle polling, auto-claim, consume_lead_inbox)
  - s18: worktree isolation (git worktree, bind_task, cwd switching, safety)
  - s19: MCP tools (MCPClient, normalize_mcp_name, assemble_tool_pool, no cache)

  All appendix source code references verified against CC source. Config priority corrected: claude.ai < plugin < user < project < local.

* fix: 5 regressions across s05-s19 - glob safety, todo validation, memory extraction, protocol types, dep crash
  - s05-s09: glob results now filter with is_relative_to(WORKDIR) (inherited from s02)
  - s06-s08: todo_write validates content/status required fields (inherited from s05)
  - s09: extract_memories uses pre-compression snapshot instead of compacted messages
  - s16: submit_plan docstring clarifies protocol-only (not code-level gate)
  - s17-s19: match_response restores type mismatch validation (from s16)
  - s17-s19: claim_task deps list handles missing dep files without crashing

* fix: s12 Todo V2 logic reversal, s14/s15 cron range validation, s18/s19 worktree name validation
  - s12 README (zh/en/ja): fix Todo V2 direction - interactive defaults to Task, non-interactive/SDK defaults to TodoWrite. Fix env var name to CLAUDE_CODE_ENABLE_TASKS (not TODO_V2).
  - s14/s15: add _validate_cron_field with per-field range checks (minute 0-59, hour 0-23, dom 1-31, month 1-12, dow 0-6), step > 0, range lo <= hi. Replace old try/except validation that only caught exceptions.
  - s18/s19: add validate_worktree_name() to remove_worktree and keep_worktree, not just create_worktree.

* fix: align s16-s19 teaching tool consistency
* fix pr265 chapter diagrams
* Add comprehensive s20 harness chapter
* Fix chapter smoke test regressions
* Clarify README tutorial track transition

---------

Co-authored-by: Haoran <bill-billion@outlook.com>
2026-06-21 04:33:36 +08:00 · 2026-05-20 21:45:38 +08:00
parent c354cf7721
commit 1baf1aca5a
174 changed files with 35833 additions and 353 deletions
--- a/s08_context_compact/README.en.md
+++ b/s08_context_compact/README.en.md
@@ -0,0 +1,293 @@
+# s08: Context Compact — Context Will Fill Up, Have a Way to Make Room
+
+[中文](README.md) · [English](README.en.md) · [日本語](README.ja.md)
+
+s01 → s02 → s03 → s04 → s05 → s06 → s07 → `s08` → [s09](../s09_memory/) → s10 → ... → s20
+> *"Context will fill up — have a way to make room"* — Four-layer compression pipeline: cheap first, expensive last.
+>
+> **Harness Layer**: Compression — clean memory, unlimited sessions.
+
+---
+
+## The Problem
+
+The agent is running along, then freezes.
+
+It has bash, read, and write: all the capabilities it needs. But it read a 1000-line file (~4000 tokens), then 30 more files, and ran 20 commands. Every command's output and every file's contents pile up in the `messages` list.
+
+The context window is finite. Once full, the API outright rejects the call: `prompt_too_long`.
+
+Without compression, an agent simply cannot work on large projects.
+
+---
+
+## The Solution
+
+![Compact Overview](images/compact-overview.en.svg)
+
+The hook structure, skill loading, and sub-Agent from s07 are preserved, with some tools omitted to focus on compaction. The core change: insert three pre-processors (0 API calls) before each LLM call, trigger an LLM summary (1 API call) when tokens still exceed the threshold, and emergency-trim if the API throws an error.
+
+Core design: cheap first, expensive last.
+
+---
+
+## How It Works
+
+![Four-layer compression pipeline](images/compaction-layers.en.svg)
+
+### L1: snip_compact — Trim Irrelevant Old Conversation
+
+The agent ran 80 turns of conversation, accumulating 160 `messages`. The very first "help me create hello.py" is barely relevant to current work, yet it still occupies space.
+
+When the message count exceeds 50, keep the first 3 (initial context) and the last 47 (current work), and replace the middle with a single placeholder message:
+
+```python
+def snip_compact(messages, max_messages=50):
+    if len(messages) <= max_messages:
+        return messages
+    keep_head, keep_tail = 3, max_messages - 3
+    snipped = len(messages) - keep_head - keep_tail
+    placeholder = {"role": "user",
+                   "content": f"[snipped {snipped} messages from conversation middle]"}
+    return messages[:keep_head] + [placeholder] + messages[-keep_tail:]
+```
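As a quick check of the arithmetic, the sketch below reruns the function on the 160-message scenario described above. The function body is repeated only so the example runs on its own, and the synthetic message contents are invented for illustration.

```python
def snip_compact(messages, max_messages=50):
    """Same logic as the snippet above, repeated so this example is standalone."""
    if len(messages) <= max_messages:
        return messages
    keep_head, keep_tail = 3, max_messages - 3
    snipped = len(messages) - keep_head - keep_tail
    placeholder = {"role": "user",
                   "content": f"[snipped {snipped} messages from conversation middle]"}
    return messages[:keep_head] + [placeholder] + messages[-keep_tail:]


# 160 synthetic messages with alternating roles (contents are made up for the demo)
history = [{"role": "user" if i % 2 == 0 else "assistant", "content": f"msg {i}"}
           for i in range(160)]
compacted = snip_compact(history)

print(len(compacted))           # 51 = 3 head + 1 placeholder + 47 tail
print(compacted[3]["content"])  # [snipped 110 messages from conversation middle]
```

Note that the result holds 51 entries, one more than `max_messages`, because the placeholder is added on top of the 3 + 47 kept messages. This sketch also makes no attempt to avoid cutting between a `tool_use` block and its matching `tool_result`.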
+
+Entire messages are trimmed, but `tool_result` content within remaining messages keeps accumulating — message #34 may still hold 30KB of old file contents. → L2.
+
+### L2: micro_compact — Placeholder for Old Tool Results
+
+![Old results placeholder](images/micro-compact.en.svg)
+
+The agent read 10 files consecutively. The full contents of reads 1–7 are still sitting in context, no longer needed, but hogging large amounts of space.
+
+Keep only the 3 most recent `tool_result` entries intact; replace older ones with a one-line placeholder:
+
+```python
+KEEP_RECENT_TOOL_RESULTS = 3
+
+def micro_compact(messages):
+    tool_results = collect_tool_result_blocks(messages)
+    if len(tool_results) <= KEEP_RECENT_TOOL_RESULTS:
+        return messages
+    for _, _, block in tool_results[:-KEEP_RECENT_TOOL_RESULTS]:
+        # str() so list-valued content is measured by characters, not item count
+        if len(str(block.get("content", ""))) > 120:
+            block["content"] = "[Earlier tool result compacted. Re-run if needed.]"
+    return messages
+```
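`collect_tool_result_blocks` is not shown in the snippet above. A minimal sketch, assuming it returns `(message_index, block_index, block)` tuples in conversation order; the three-element shape is inferred from the `for _, _, block in ...` unpacking, and the real helper in `code.py` may differ.

```python
def collect_tool_result_blocks(messages):
    """Return (message_index, block_index, block) for every tool_result, oldest first.

    Assumed shape: user messages carry tool results as a list of dict blocks
    with "type": "tool_result". Plain-string contents are skipped.
    """
    found = []
    for mi, message in enumerate(messages):
        content = message.get("content")
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        for bi, block in enumerate(content):
            if isinstance(block, dict) and block.get("type") == "tool_result":
                found.append((mi, bi, block))
    return found


demo = [
    {"role": "user", "content": "read two files"},
    {"role": "assistant",
     "content": [{"type": "tool_use", "id": "t1", "name": "read_file", "input": {}}]},
    {"role": "user",
     "content": [{"type": "tool_result", "tool_use_id": "t1", "content": "aaa"}]},
    {"role": "user",
     "content": [{"type": "text", "text": "note"},
                 {"type": "tool_result", "tool_use_id": "t2", "content": "bbb"}]},
]
positions = [(mi, bi) for mi, bi, _ in collect_tool_result_blocks(demo)]
print(positions)  # [(2, 0), (3, 1)]
```

Because the blocks are returned by reference, `micro_compact` can overwrite `block["content"]` in place and the change is visible in `messages`.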
+
+Old results are cleared, but a single new result can be 500KB — one `cat` of a large file can max out the context. → L3.
+
+### L3: tool_result_budget — Persist Large Results to Disk
+
+![Large results to disk](images/layer1-budget.en.svg)
+
+The model read 5 large files in one go; all `tool_result` blocks in the last user message total 500KB.
+
+Sum the size of all `tool_result` blocks in the last user message. If the total exceeds 200,000 characters (roughly 200KB), sort the blocks by size and, starting from the largest, persist them to `.task_outputs/tool-results/`, keeping only a `<persisted-output>` marker plus a 2000-character preview in context. The model sees the marker, knows the full content is on disk, and can re-read it when needed.
+
+```python
+def tool_result_budget(messages, max_bytes=200_000):
+    # Sizes are measured with len(str(...)), i.e. in characters (a byte approximation)
+    last = messages[-1]
+    if not isinstance(last.get("content"), list):
+        return messages  # plain-text message: nothing to budget
+    blocks = [b for b in last["content"]
+              if isinstance(b, dict) and b.get("type") == "tool_result"]
+
+    def size(block):
+        return len(str(block.get("content", "")))
+
+    total = sum(size(b) for b in blocks)
+    # Persist the largest results first until the message fits the budget
+    for block in sorted(blocks, key=size, reverse=True):
+        if total <= max_bytes:
+            break
+        before = size(block)
+        block["content"] = persist_large_output(block["tool_use_id"], str(block.get("content", "")))
+        total -= before - size(block)
+    return messages
+```
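`persist_large_output` is also not shown. A minimal sketch under stated assumptions: the target directory and the 2000-character preview come from the text above, while the file naming, the exact marker layout, and the `output_dir` parameter are illustrative guesses rather than the format used by `code.py` or Claude Code.

```python
import tempfile
from pathlib import Path

PREVIEW_CHARS = 2000  # preview length stated in the text above


def persist_large_output(tool_use_id, content,
                         output_dir=".task_outputs/tool-results"):
    """Write the full content to disk and return a short in-context stand-in.

    The <persisted-output> layout below is hypothetical; only the idea
    (marker + path + preview) is taken from the chapter.
    """
    directory = Path(output_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # Keep only filename-safe characters from the id
    safe_id = "".join(c for c in tool_use_id if c.isalnum() or c in "-_") or "result"
    path = directory / f"{safe_id}.txt"
    path.write_text(content, encoding="utf-8")
    preview = content[:PREVIEW_CHARS]
    return (f'<persisted-output path="{path.as_posix()}" chars="{len(content)}">\n'
            f"{preview}\n"
            f"</persisted-output>")


# Demo in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    stub = persist_large_output("toolu_demo", "x" * 10_000, output_dir=tmp)
    saved = Path(tmp, "toolu_demo.txt").read_text(encoding="utf-8")

print(len(saved))  # 10000: the full output is on disk
```

Including the path inside the marker is one way to let the model re-read the file with its ordinary read tool; whether the real marker does this is not confirmed by the text.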
+
+The first three layers are all plain-text / structural operations — 0 API calls — but they cannot "understand" conversation content. Context may still be too large. → L4.
+
+### L4: compact_history — Full LLM Summary
+
+![Full LLM summary](images/auto-compact.en.svg)
+
+All three previous layers have run, but after 30 minutes of continuous work on a huge project, tokens still exceed the threshold.
+
+Three-step process:
+
+1. **Save transcript**: Write the full conversation to `.transcripts/` in JSONL format. The transcript is a recoverable record on disk, but the model's active context holds only the summary, so those details are no longer available to its reasoning. The teaching code does not provide a transcript retrieval tool.
+2. **LLM generates summary**: Send conversation history to the LLM, asking it to preserve key information: current goals, important findings, modified files, remaining work, user constraints, etc.
+3. **Replace message list**: All old messages are replaced with a single summary. The teaching version only keeps the summary; the real Claude Code re-attaches some recent files, plans, agent/skill/tool context after compaction.
+
+```python
+def compact_history(messages):
+    transcript_path = write_transcript(messages)  # Save full conversation first
+    summary = summarize_history(messages)          # LLM generates summary
+    return [{"role": "user",
+             "content": f"[Compacted]\n\n{summary}"}]
+```
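Step 1 can be sketched as follows, assuming one JSON object per line and a timestamp-based file name. The naming scheme, the `directory` parameter, and the `read_transcript` helper are assumptions for illustration; the text only specifies the `.transcripts/` directory and the JSONL format.

```python
import json
import tempfile
import time
from pathlib import Path


def write_transcript(messages, directory=".transcripts"):
    """Dump one message per line (JSONL) and return the file path."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"transcript_{time.time_ns()}.jsonl"  # hypothetical naming
    with path.open("w", encoding="utf-8") as fh:
        for message in messages:
            # default=str keeps the dump from failing on non-JSON values
            fh.write(json.dumps(message, ensure_ascii=False, default=str) + "\n")
    return path


def read_transcript(path):
    """Inverse of write_transcript, useful for manual recovery."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Round trip in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    saved = write_transcript([{"role": "user", "content": "hi"}], directory=tmp)
    restored = read_transcript(saved)

print(restored)  # [{'role': 'user', 'content': 'hi'}]
```

JSONL keeps each message on its own line, so a partially written file can still be read up to the last complete line.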
+
+**Circuit breaker**: After 3 consecutive failures, stop retrying to prevent an infinite loop wasting API calls.
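A minimal sketch of the circuit breaker, assuming a small counter object wrapped around the compaction call. `CompactBreaker` and its method names are hypothetical; only the limit of 3 consecutive failures comes from the text.

```python
MAX_CONSECUTIVE_FAILURES = 3  # limit stated in the text


class CompactBreaker:
    """Skips compaction after too many consecutive failures (hypothetical helper)."""

    def __init__(self, limit=MAX_CONSECUTIVE_FAILURES):
        self.limit = limit
        self.failures = 0

    @property
    def tripped(self):
        return self.failures >= self.limit

    def run(self, compact_fn, messages):
        """Return compacted messages, or the originals if compaction fails or is disabled."""
        if self.tripped:
            return messages          # breaker tripped: do not spend another API call
        try:
            result = compact_fn(messages)
        except Exception:            # broad on purpose in this sketch
            self.failures += 1       # count consecutive failures only
            return messages
        self.failures = 0            # any success resets the counter
        return result


calls = []


def counting_fail(messages):
    """Stand-in for a summary call that always fails."""
    calls.append(1)
    raise RuntimeError("summary call failed")


breaker = CompactBreaker()
history = [{"role": "user", "content": "hi"}]
for _ in range(5):
    breaker.run(counting_fail, history)

print(len(calls))  # 3: the fourth and fifth attempts are skipped
```

Resetting the counter on success is what makes the limit apply to consecutive failures rather than to the total over a session.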
+
+### Reactive: reactive_compact
+
+Sometimes the API still returns `prompt_too_long` (413), typically when context grows faster than compaction is triggered.
+
+This triggers **reactive_compact**, the emergency fallback: it saves the transcript, summarizes the history, and keeps only the summary plus the last 5 messages. The simplified snippet below shows just that step; the byte-level trimming from the tail down to an API-acceptable size is not shown.
+
+```python
+def reactive_compact(messages):
+    transcript = write_transcript(messages)
+    summary = summarize_history(messages)
+    tail = messages[-5:]
+    return [{"role": "user",
+             "content": f"[Reactive compact]\n\n{summary}"}, *tail]
+```
+
+Reactive compact has a retry limit (default 1). If it still fails, an exception is raised instead of looping forever. Full error recovery is deferred to s11.
+
+### Putting It All Together
+
+```python
+def agent_loop(messages):
+    reactive_retries = 0
+    while True:
+        # Three pre-processors (0 API calls)
+        # Order: budget first, so large content is persisted before placeholders
+        messages[:] = tool_result_budget(messages)    # L3: persist large results
+        messages[:] = snip_compact(messages)          # L1: trim middle
+        messages[:] = micro_compact(messages)         # L2: old result placeholders
+
+        # Still too much? LLM summary (1 API call)
+        if estimate_token_count(messages) > THRESHOLD:
+            messages[:] = compact_history(messages)
+
+        try:
+            response = client.messages.create(...)
+        except PromptTooLongError:
+            if reactive_retries < MAX_REACTIVE_RETRIES:
+                messages[:] = reactive_compact(messages)  # Emergency
+                reactive_retries += 1
+                continue
+            raise  # retry limit exceeded, raise exception
+        # ... tool execution ...
+
+        # compact tool: when the model actively calls it, triggers compact_history
+        if block.name == "compact":
+            messages[:] = compact_history(messages)
+            results.append({..., "content": "[Compacted. History summarized.]"})
+            messages.append({"role": "user", "content": results})
+            break  # end current turn, start fresh with compacted context
+```
+
+**The order must not be swapped.** L3 (budget) runs before L2 (micro) because micro replaces old large tool_results with one-line placeholders — budget must persist the full content before that happens. This is why CC source puts `applyToolResultBudget` first.
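The ordering argument can be checked with a toy run. The sketch below uses simplified stand-ins for both layers: a dict named `disk` replaces real file writes, and the persisted marker has no preview. It illustrates the interaction rather than reproducing `code.py`.

```python
import copy

PLACEHOLDER = "[Earlier tool result compacted. Re-run if needed.]"


def micro_compact(messages, keep_recent=3):
    """Replace all but the most recent tool_result contents with a placeholder."""
    blocks = [b for m in messages if isinstance(m["content"], list)
              for b in m["content"] if b.get("type") == "tool_result"]
    for block in blocks[:-keep_recent]:
        if len(str(block["content"])) > 120:
            block["content"] = PLACEHOLDER
    return messages


def tool_result_budget(messages, disk, max_chars=200_000):
    """Move the largest tool_result contents of the last message into `disk`."""
    blocks = [b for b in messages[-1]["content"] if b.get("type") == "tool_result"]

    def size(block):
        return len(str(block["content"]))

    total = sum(size(b) for b in blocks)
    for block in sorted(blocks, key=size, reverse=True):
        if total <= max_chars:
            break
        before = size(block)
        disk[block["tool_use_id"]] = block["content"]  # stand-in for a file write
        block["content"] = f"<persisted-output id={block['tool_use_id']}>"
        total -= before - size(block)
    return messages


# One user message carrying five 60,000-character results (300,000 in total)
last = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": f"t{i}", "content": "x" * 60_000}
    for i in range(5)]}
original = [{"role": "user", "content": "read five big files"}, last]

# Order used in the chapter: budget, then micro
disk_a, msgs_a = {}, copy.deepcopy(original)
micro_compact(tool_result_budget(msgs_a, disk_a))

# Swapped order: micro, then budget
disk_b, msgs_b = {}, copy.deepcopy(original)
tool_result_budget(micro_compact(msgs_b), disk_b)

print(sorted(disk_a))  # ['t0', 't1']
print(sorted(disk_b))  # []
```

With the swapped order, the two oldest results are reduced to placeholders before the budget check runs. The total then fits, nothing is persisted, and their full contents are gone from both context and disk.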
+
+---
+
+## Changes From s07
+
+| Component | Before (s07) | After (s08) |
+|-----------|-------------|-------------|
+| Context management | None (context grows unbounded) | Four-layer compression pipeline + emergency |
+| New functions | — | snip_compact, micro_compact, tool_result_budget, compact_history, reactive_compact |
+| Tools | bash, read_file, write_file, edit_file, glob, todo_write, task, load_skill (8) | 8 + compact (9) |
+| Loop | LLM call → tool execution | Three pre-processors before each turn + threshold-triggered compact_history |
+| Design principle | — | Cheap first, expensive last |
+
+---
+
+## Try It
+
+```sh
+cd learn-claude-code
+python s08_context_compact/code.py
+```
+
+Try these prompts:
+
+1. `Read the file README.md, then read code.py, then read s01_agent_loop/README.md` (read multiple files consecutively, observe L2 compressing old results)
+2. `Read every file in s08_context_compact/` (read a large amount of content at once, observe L3 persisting to disk)
+3. Chat for 20+ turns, observe whether `[auto compact]` or `[reactive compact]` appears
+
+What to watch for: After each tool execution, are old `tool_result` entries compressed? When tokens exceed the threshold after extended conversation, is summarization triggered automatically?
+
+---
+
+## What's Next
+
+Context compression lets an agent run for a long time without crashing. But each compression can also drop preferences and constraints the user stated earlier. Can we let the agent selectively remember important things?
+
+s09 Memory → three subsystems: choosing what to remember, extracting key information, consolidating and organizing. Across compressions, across sessions.
+
+<details>
+<summary>Deep Dive Into CC Source Code</summary>
+
+> The following is based on analysis of CC source code `compact.ts`, `autoCompact.ts`, `microCompact.ts`, and `query.ts`.
+
+### Execution Order Comparison
+
+The teaching version labels layers L1/L2/L3/L4 for pedagogical clarity, but actual execution order does not match the numbering:
+
+| Dimension | Teaching Version | Claude Code |
+|-----------|-----------------|-------------|
+| Execution order | budget → snip → micro → auto | budget → snip → micro → collapse → auto (`query.ts:379-468`) |
+| snip_compact | Keep head 3 + tail 47 | CC only enables on main thread; implementation not in open-source repo (`HISTORY_SNIP` feature gate), but interface is visible: `snipCompactIfNeeded(messages)` → `{ messages, tokensFreed, boundaryMessage? }`, also exposes `SnipTool` for model-initiated snipping. Teaching version's 3/47 are simplified parameters |
+| micro_compact | Text placeholder replacement | Two paths: time-based clears content directly, cached uses API `cache_edits` (legacy path removed) |
+| micro_compact whitelist | By position (most recent 3) | time-based triggers by time threshold; cached triggers by count (`microCompact.ts`) |
+| tool_result_budget | 200,000 characters (about 200KB) | 200,000 characters (`toolLimits.ts:49`) |
+| compact_history threshold | Character count estimate | Precise tokens: `contextWindow - maxOutputTokens - 13_000` |
+| Summary requirements | 5 categories of info | 9 sections + `<analysis>`/`<summary>` dual tags |
+| Compression prompt | Simple prompt | Double-ended hard guardrails forbidding tool calls |
+| PTL retry | Yes (simplified) | `truncateHeadForPTLRetry()` retreats by message groups (`compact.ts:243-290`) |
+| Post-compaction recovery | None (teaching version only keeps summary) | Auto re-read recent files, plans, agent/skill/tool context |
+| Circuit breaker | 3 times | 3 times (`autoCompact.ts:70`) |
+| Reactive retry | 1 time | CC has more granular tiered retries |
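The two sides of the `compact_history threshold` row can be put next to each other in a small sketch. The 4-characters-per-token ratio is a rough heuristic assumed here (the table only says the teaching code estimates from character count), and the window and output sizes in the usage lines are example numbers, not values taken from CC.

```python
import json

CHARS_PER_TOKEN = 4                 # assumption: rough heuristic, not a real tokenizer
AUTOCOMPACT_BUFFER_TOKENS = 13_000  # the "- 13_000" term in the table above


def estimate_token_count(messages):
    """Approximate tokens from the serialized character count (teaching-style estimate)."""
    chars = sum(len(json.dumps(m, ensure_ascii=False, default=str)) for m in messages)
    return chars // CHARS_PER_TOKEN


def autocompact_threshold(context_window, max_output_tokens):
    """contextWindow - maxOutputTokens - 13_000, as given in the comparison table."""
    return context_window - max_output_tokens - AUTOCOMPACT_BUFFER_TOKENS


# Example numbers only (hypothetical model limits)
threshold = autocompact_threshold(context_window=200_000, max_output_tokens=20_000)
print(threshold)  # 167000

messages = [{"role": "user", "content": "x" * 800_000}]
print(estimate_token_count(messages) > threshold)  # True
```

A character-based estimate can be off in either direction, which is one reason to keep a reactive fallback for the case where the API rejects the request anyway.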
+
+### Execution Order Details
+
+The real order in CC source `query.ts`:
+
+1. `applyToolResultBudget` (L379): persist large results first, ensuring full content is saved
+2. `snipCompact` (L403): trim middle messages
+3. `microcompact` (L414): old result placeholders
+4. `contextCollapse` (L441): independent context management system (not in teaching version)
+5. `autoCompact` (L454): LLM full summary
+
+The teaching version's budget → snip → micro order matches this. The teaching version does not have the contextCollapse mechanism.
+
+### Full Constant Reference
+
+| Constant | Value | Source File |
+|----------|-------|-------------|
+| `AUTOCOMPACT_BUFFER_TOKENS` | 13,000 | `autoCompact.ts:62` |
+| `MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES` | 3 | `autoCompact.ts:70` |
+| `MAX_OUTPUT_TOKENS_FOR_SUMMARY` | 20,000 | `autoCompact.ts:30` |
+| `POST_COMPACT_TOKEN_BUDGET` | 50,000 | `compact.ts:123` |
+| `POST_COMPACT_MAX_FILES_TO_RESTORE` | 5 | `compact.ts:122` |
+| `POST_COMPACT_MAX_TOKENS_PER_FILE` | 5,000 | `compact.ts:124` |
+| Time micro_compact interval | 60 minutes | `timeBasedMCConfig.ts` |
+| `MAX_COMPACT_STREAMING_RETRIES` | 2 | `compact.ts:131` |
+
+### contextCollapse and sessionMemoryCompact
+
+CC source code has two additional mechanisms not covered in this teaching version:
+
+- **contextCollapse**: An independent context management system that, when enabled, suppresses proactive autocompact (`autoCompact.ts:215-222`), with collapse's commit/blocking flow taking over context management. Manual `/compact` and reactive fallback remain independent paths, unaffected by contextCollapse.
+- **sessionMemoryCompact**: Before compact_history, CC first attempts a lightweight summary using existing session memory (covered in s09) without calling the LLM. This mechanism becomes clearer after learning s09.
+
+### What Does the Compression Prompt Look Like?
+
+CC's compression prompt has two hard requirements:
+
+1. **Absolutely no tool calls**: It begins with `CRITICAL: Respond with TEXT ONLY. Do NOT call any tools.`, and appends another REMINDER at the end
+2. **Analyze first, then summarize**: The model must first reason in an `<analysis>` tag, then output the formal summary in a `<summary>` tag. The analysis is stripped during formatting
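The "analysis is stripped" step can be sketched with two regular expressions. This is an illustration of the idea only: `format_compact_summary` is a hypothetical name, and the fallback for a missing `<summary>` tag is an assumption, not behavior confirmed from the CC source.

```python
import re

ANALYSIS_RE = re.compile(r"<analysis>.*?</analysis>", re.DOTALL)
SUMMARY_RE = re.compile(r"<summary>(.*?)</summary>", re.DOTALL)


def format_compact_summary(raw):
    """Drop the <analysis> scratchpad and return the <summary> body (sketch)."""
    without_analysis = ANALYSIS_RE.sub("", raw)
    match = SUMMARY_RE.search(without_analysis)
    # Fall back to the remaining text if the model omitted the <summary> tag
    return (match.group(1) if match else without_analysis).strip()


raw = """<analysis>
User wants a parser; tests fail on empty input.
</analysis>
<summary>
Goal: fix parser. Remaining: handle empty input.
</summary>"""

print(format_compact_summary(raw))  # Goal: fix parser. Remaining: handle empty input.
```

Letting the model reason in a scratchpad and then discarding it keeps the reasoning out of the compacted context while still benefiting the summary.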
+
+### Teaching Version Simplifications Are Intentional
+
+- micro_compact uses text placeholders → we don't have API-level `cache_edits` access
+- Tokens estimated via character count → precise tokenizers are out of scope
+- Post-compaction recovery omitted → teaching version only keeps summary, does not auto re-attach files
+- Two auxiliary mechanisms (contextCollapse, sessionMemoryCompact) not covered → they belong to the remaining ~10% of detail, outside the core pipeline
+
+The core design principle, cheap first, expensive last, is fully preserved.
+
+</details>
+
+<!-- translation-sync: zh@v1, en@v1, ja@v1 -->
--- a/s08_context_compact/README.ja.md
+++ b/s08_context_compact/README.ja.md
@@ -0,0 +1,293 @@
+# s08: Context Compact — コンテキストはいつか満杯になる、場所を空ける方法が必要
+
+[中文](README.md) · [English](README.en.md) · [日本語](README.ja.md)
+
+s01 → s02 → s03 → s04 → s05 → s06 → s07 → `s08` → [s09](../s09_memory/) → s10 → ... → s20
+> *"Context will fill up — have a way to make room"* — 4層圧縮戦略、安価なものを先に、高価なものを後に実行。
+>
+> **Harness レイヤー**: 圧縮 — クリーンな記憶、無限のセッション。
+
+---
+
+## 課題
+
+Agent が動いている途中で、止まってしまう。
+
+bash、read、write は揃っており、能力は十分。しかし 1000 行のファイル（~4000 token）を読み、さらに 30 のファイルを読み、20 のコマンドを実行したとします。各コマンドの出力、各ファイルの内容がすべて `messages` リストに蓄積されます。
+
+コンテキストウィンドウには上限があります。満杯になると、API は即座に拒否します：`prompt_too_long`。
+
+圧縮しなければ、Agent は大規模プロジェクトではまともに動けません。
+
+---
+
+## ソリューション
+
+![Compact Overview](images/compact-overview.ja.svg)
+
+s07 のフック構造、スキルロード、サブ Agent の骨格を維持し、圧縮に焦点を当てるため一部のツールは省略。コアの変更点：各 LLM 呼び出し前に 3 層のプリプロセッサ（0 API）を挿入し、token が閾値を超えた場合は LLM 要約（1 API）をトリガー、API エラー時には緊急トリムを実行。
+
+コア設計：安価なものを先に、高価なものを後に。
+
+---
+
+## 仕組み
+
+![4層圧縮パイプライン](images/compaction-layers.ja.svg)
+
+### L1: snip_compact — 無関係な古い会話を切り捨て
+
+Agent が 80 ラウンドの会話を実行し、`messages` が 160 件まで溜まった。先頭の「hello.py を作って」は現在の作業とほぼ無関係だが、スペースを占有し続けている。
+
+メッセージ数が 50 を超えた場合 → 先頭 3 件（初期コンテキスト）と末尾 47 件（現在の作業）を保持し、中間を切り捨て：
+
+```python
+def snip_compact(messages, max_messages=50):
+    if len(messages) <= max_messages:
+        return messages
+    keep_head, keep_tail = 3, max_messages - 3
+    snipped = len(messages) - keep_head - keep_tail
+    placeholder = {"role": "user",
+                   "content": f"[snipped {snipped} messages from conversation middle]"}
+    return messages[:keep_head] + [placeholder] + messages[-keep_tail:]
+```
+
+メッセージ全体は切り捨てたが、残ったメッセージ内の `tool_result` 内容はまだ蓄積され続けている。34 番目のメッセージに 30KB の古いファイル内容が残っているかもしれない。→ L2。
+
+### L2: micro_compact — 古いツール結果をプレースホルダに置換
+
+![古い結果のプレースホルダ](images/micro-compact.ja.svg)
+
+Agent が連続して 10 個のファイルを読んだ。1〜7 回目の完全な内容はまだコンテキストに残っており、もう不要だが、大量のスペースを占有している。
+
+直近 3 件の `tool_result` の完全な内容のみを保持し、それより古いものは 1 行のプレースホルダに置換：
+
+```python
+KEEP_RECENT_TOOL_RESULTS = 3
+
+def micro_compact(messages):
+    tool_results = collect_tool_result_blocks(messages)
+    if len(tool_results) <= KEEP_RECENT_TOOL_RESULTS:
+        return messages
+    for _, _, block in tool_results[:-KEEP_RECENT_TOOL_RESULTS]:
+        # str() により、リスト型の content も要素数ではなく文字数で計測する
+        if len(str(block.get("content", ""))) > 120:
+            block["content"] = "[Earlier tool result compacted. Re-run if needed.]"
+    return messages
+```
+
+古い結果はクリーンアップされたが、1 件の新しい結果だけで 500KB の可能性がある。大きなファイルを `cat` するだけでコンテキストがいっぱいになる。→ L3。
+
+### L3: tool_result_budget — 大きな結果をディスクに退避
+
+![大きな結果のディスク退避](images/layer1-budget.ja.svg)
+
+モデルが一度に 5 つの大きなファイルを読み、1 つの user メッセージ内の全 `tool_result` の合計が 500KB に達した。
+
+最後の user メッセージ内のすべての `tool_result` の合計サイズを集計。200KB を超えた場合 → サイズ順にソートし、最大のものから順に `.task_outputs/tool-results/` に退避。コンテキストには `<persisted-output>` マーカー + 先頭 2000 文字のプレビューのみを残す。モデルはマーカーを見て完全な内容がディスク上にあることを認識し、必要に応じて再読み込みできる。
+
+```python
+def tool_result_budget(messages, max_bytes=200_000):
+    # サイズは len(str(...)) による文字数で計測（バイト数の近似）
+    last = messages[-1]
+    if not isinstance(last.get("content"), list):
+        return messages  # プレーンテキストのメッセージは対象外
+    blocks = [b for b in last["content"]
+              if isinstance(b, dict) and b.get("type") == "tool_result"]
+
+    def size(block):
+        return len(str(block.get("content", "")))
+
+    total = sum(size(b) for b in blocks)
+    # 予算内に収まるまで、最大の結果から順に退避
+    for block in sorted(blocks, key=size, reverse=True):
+        if total <= max_bytes:
+            break
+        before = size(block)
+        block["content"] = persist_large_output(block["tool_use_id"], str(block.get("content", "")))
+        total -= before - size(block)
+    return messages
+```
+
+最初の 3 層はすべて純粋なテキスト/構造操作（0 API 呼び出し）だが、会話内容を「理解」することはできない。コンテキストがまだ大きすぎる可能性がある。→ L4。
+
+### L4: compact_history — LLM 全量要約
+
+![LLM 全量要約](images/auto-compact.ja.svg)
+
+最初の 3 層がすべて実行されたが、超大規模プロジェクトで 30 分間連続作業すると、token がまだ閾値を超えている。
+
+3 ステップのフロー：
+
+1. **transcript を保存**：完全な会話を `.transcripts/` に JSONL 形式で書き出す。transcript は回復可能な記録として保存されるが、モデルのアクティブなコンテキストには要約しか残らない。モデルの現在の推論にとって、詳細はすでにコンテキストにない。教学コードは transcript 検索ツールを提供しない。
+2. **LLM で要約を生成**：会話履歴を LLM に送り、現在の目標、重要な発見、変更済みファイル、残りの作業、ユーザーの制約などの重要な情報を保持するよう指示。
+3. **メッセージリストを置換**：すべての古いメッセージが 1 件の要約に置き換えられる。教学版は要約のみを保持する。実際の Claude Code は compact 後に直近のファイル、計画、agent/skill/tool などのコンテキストを再付加する。
+
+```python
+def compact_history(messages):
+    transcript_path = write_transcript(messages)  # 先に完全な会話を保存
+    summary = summarize_history(messages)          # LLM で要約を生成
+    return [{"role": "user",
+             "content": f"[Compacted]\n\n{summary}"}]
+```
+
+**サーキットブレーカー**：連続 3 回失敗したらリトライを停止し、無限ループによる API 呼び出しの浪費を防止。
+
+### 緊急: reactive_compact
+
+API がまだ `prompt_too_long`（413）を返すことがある。コンテキストの増加速度が圧縮のトリガー速度を上回る場合。
+
+この時 **reactive_compact** がトリガーされる：compact_history よりもさらに積極的で、末尾からバイト単位の精度で API が受け入れ可能なサイズまで切り詰め、最後の 5 件のメッセージ + 要約のみを保持。
+
+```python
+def reactive_compact(messages):
+    transcript = write_transcript(messages)
+    summary = summarize_history(messages)
+    tail = messages[-5:]
+    return [{"role": "user",
+             "content": f"[Reactive compact]\n\n{summary}"}, *tail]
+```
+
+reactive compact にはリトライ上限がある（デフォルト 1 回）。さらに失敗した場合は例外をスローし、無限ループしない。完全なエラー回復ロジックは s11 に委ねる。
+
+### 合わせて実行
+
+```python
+def agent_loop(messages):
+    reactive_retries = 0
+    while True:
+        # 3 つのプリプロセッサ（0 API 呼び出し）
+        # 順序：budget を先に実行し、大きな内容をプレースホルダ化する前に退避
+        messages[:] = tool_result_budget(messages)    # L3: 大きな結果を退避
+        messages[:] = snip_compact(messages)          # L1: 中間を切り捨て
+        messages[:] = micro_compact(messages)         # L2: 古い結果をプレースホルダに
+
+        # まだ足りない？LLM 要約（1 API 呼び出し）
+        if estimate_token_count(messages) > THRESHOLD:
+            messages[:] = compact_history(messages)
+
+        try:
+            response = client.messages.create(...)
+        except PromptTooLongError:
+            if reactive_retries < MAX_REACTIVE_RETRIES:
+                messages[:] = reactive_compact(messages)  # 緊急対応
+                reactive_retries += 1
+                continue
+            raise  # リトライ上限超過、例外をスロー
+        # ... ツール実行 ...
+
+        # compact ツール：モデルが能動的に呼び出した場合、compact_history をトリガー
+        if block.name == "compact":
+            messages[:] = compact_history(messages)
+            results.append({..., "content": "[Compacted. History summarized.]"})
+            messages.append({"role": "user", "content": results})
+            break  # 現在のターンを終了し、圧縮後のコンテキストで新しく開始
+```
+
+**順序は変えられない。** L3（budget）が L2（micro）の前に実行される理由：micro は古い大きな tool_result を 1 行のプレースホルダに置換するため、budget はその前に完全な内容を退避させる必要がある。CC ソースが `applyToolResultBudget` を最初に配置する理由も同じ。
+
+---
+
+## s07 からの変更点
+
+| コンポーネント | 変更前 (s07) | 変更後 (s08) |
+|------|-----------|-----------|
+| コンテキスト管理 | なし（コンテキストが無限に膨張） | 4 層圧縮パイプライン + 緊急対応 |
+| 新規関数 | — | snip_compact, micro_compact, tool_result_budget, compact_history, reactive_compact |
+| ツール | bash, read_file, write_file, edit_file, glob, todo_write, task, load_skill (8) | 8 + compact (9) |
+| ループ | LLM 呼び出し → ツール実行 | 各ラウンド前に 3 層プリプロセッサを実行 + 閾値で compact_history をトリガー |
+| 設計原則 | — | 安価なものを先に、高価なものを後に |
+
+---
+
+## 試してみよう
+
+```sh
+cd learn-claude-code
+python s08_context_compact/code.py
+```
+
+以下のプロンプトを試してみてください：
+
+1. `Read the file README.md, then read code.py, then read s01_agent_loop/README.md`（連続して複数のファイルを読み、L2 の古い結果圧縮を観察）
+2. `Read every file in s08_context_compact/`（一度に大量の内容を読み込み、L3 のディスク退避を観察）
+3. 20+ ラウンドの対話を繰り返し、`[auto compact]` または `[reactive compact]` が表示されるか観察
+
+観察のポイント：ツール実行のたびに、古い tool_result は圧縮されているか？連続対話で token が閾値を超えたとき、要約が自動的にトリガーされたか？
+
+---
+
+## 次へ
+
+コンテキスト圧縮により、Agent は長時間クラッシュせずに動けるようになった。しかし、圧縮のたびにユーザーが以前に伝えた好みや制約も一緒に失われてしまう。Agent が重要なことを選択的に記憶できるようにできないか？
+
+s09 Memory → 3 つのサブシステム：何を記憶するかの選択、重要情報の抽出、整理と統合。圧縮を越え、セッションを越えて。
+
+<details>
+<summary>CC ソースコードの詳細</summary>
+
+> 以下は CC ソースコード `compact.ts`、`autoCompact.ts`、`microCompact.ts`、`query.ts` の分析に基づく。
+
+### 実行順序の対応
+
+教学版は説明の便宜上 L1/L2/L3/L4 と番号を振っているが、実際の実行順序は番号と完全には一致しない：
+
+| 項目 | 教学版 | Claude Code |
+|------|--------|-------------|
+| 実行順序 | budget → snip → micro → auto | budget → snip → micro → collapse → auto（`query.ts:379-468`） |
+| snip_compact | 先頭 3 + 末尾 47 を保持 | CC はメインスレッドのみ有効；実装はオープンソースリポジトリにない（`HISTORY_SNIP` feature gate）、インターフェースは確認可能：`snipCompactIfNeeded(messages)` → `{ messages, tokensFreed, boundaryMessage? }`、`SnipTool` もモデルが能動的に呼び出し可能。教学版の 3/47 は簡略パラメータ |
+| micro_compact | テキストプレースホルダで置換 | 2 つのパス：time-based は直接内容をクリア、cached は API の `cache_edits` を使用（legacy パスは削除済み） |
+| micro_compact ホワイトリスト | 位置による（直近 3 件） | time-based は時間閾値でトリガー、cached はカウントでトリガー（`microCompact.ts`） |
+| tool_result_budget | 200KB 文字 | 200,000 文字（`toolLimits.ts:49`） |
+| compact_history 閾値 | 文字数で推定 | 精密な token 数：`contextWindow - maxOutputTokens - 13_000` |
+| 要約の要求 | 5 種類の情報 | 9 つのセクション + `<analysis>`/`<summary>` デュアルタグ |
+| 圧縮プロンプト | シンプルなプロンプト | 先頭と末尾に二重の安全ガードでツール呼び出しを禁止 |
+| PTL retry | あり（簡略版） | `truncateHeadForPTLRetry()` がメッセージグループ単位でロールバック（`compact.ts:243-290`） |
+| 圧縮後のリカバリ | なし（教学版は要約のみ保持） | 直近のファイル、計画、agent/skill/tool などの自動再付加 |
+| サーキットブレーカー | 3 回 | 3 回（`autoCompact.ts:70`） |
+| reactive リトライ | 1 回 | CC にはより精緻な段階別リトライがある |
+
+### 実行順序の詳細
+
+CC ソース `query.ts` での実際の順序：
+
+1. `applyToolResultBudget`（L379）：まず大きな結果を処理し、完全な内容を退避
+2. `snipCompact`（L403）：中間メッセージを切り捨て
+3. `microcompact`（L414）：古い結果のプレースホルダ化
+4. `contextCollapse`（L441）：独立したコンテキスト管理システム（教学版にはなし）
+5. `autoCompact`（L454）：LLM 全量要約
+
+教学版の budget → snip → micro の順序はこれと一致する。教学版には contextCollapse メカニズムがない。
+
+### 完全な定数リファレンス
+
+| 定数 | 値 | ソースファイル |
+|------|-----|--------|
+| `AUTOCOMPACT_BUFFER_TOKENS` | 13,000 | `autoCompact.ts:62` |
+| `MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES` | 3 | `autoCompact.ts:70` |
+| `MAX_OUTPUT_TOKENS_FOR_SUMMARY` | 20,000 | `autoCompact.ts:30` |
+| `POST_COMPACT_TOKEN_BUDGET` | 50,000 | `compact.ts:123` |
+| `POST_COMPACT_MAX_FILES_TO_RESTORE` | 5 | `compact.ts:122` |
+| `POST_COMPACT_MAX_TOKENS_PER_FILE` | 5,000 | `compact.ts:124` |
+| 時間ベース micro_compact 間隔 | 60 分 | `timeBasedMCConfig.ts` |
+| `MAX_COMPACT_STREAMING_RETRIES` | 2 | `compact.ts:131` |
+
+### contextCollapse と sessionMemoryCompact
+
+CC ソースコードには、この教学版では展開していない 2 つのメカニズムが存在する：
+
+- **contextCollapse**：独立したコンテキスト管理システム。有効時には proactive autocompact を抑制し（`autoCompact.ts:215-222`）、collapse の commit/blocking フローがコンテキスト管理を引き継ぐ。ただし manual `/compact` と reactive fallback は独立パスのままで、contextCollapse の影響を受けない。
+- **sessionMemoryCompact**：compact_history の前に、CC は既存の session memory（s09 で解説）を使った軽量要約を先に試みる。LLM を呼び出さない。このメカニズムは s09 を学んだ後に振り返るとより理解しやすい。
+
+### 圧縮プロンプトの中身
+
+CC の圧縮プロンプトには 2 つの厳格な要件がある：
+
+1. **ツール呼び出しの絶対禁止**：冒頭が `CRITICAL: Respond with TEXT ONLY. Do NOT call any tools.` で、末尾にも再度 REMINDER がある
+2. **先に分析してから要約**：モデルはまず `<analysis>` タグで思考を整理し、その後 `<summary>` タグで正式な要約を出力する。analysis はフォーマット時に除去される
+
+### 教学版の簡略化は意図的
+
+- micro_compact でテキストプレースホルダを使用 → API 層の `cache_edits` 権限がないため
+- token を文字数で推定 → 精密な tokenizer は教学の対象外
+- 圧縮後のリカバリを省略 → 教学版は要約のみを保持し、ファイルの自動再付加を行わない
+- 2 つの補助メカニズムを展開しない → 10% の細部に属する
+
+コア設計思想、安価なものを先に高価なものを後に、は完全に保持されている。
+
+</details>
+
+<!-- translation-sync: zh@v1, en@v1, ja@v1 -->
--- a/s08_context_compact/README.md
+++ b/s08_context_compact/README.md
@@ -0,0 +1,293 @@
+# s08: Context Compact — 上下文总会满，要有办法腾地方
+
+[中文](README.md) · [English](README.en.md) · [日本語](README.ja.md)
+
+s01 → s02 → s03 → s04 → s05 → s06 → s07 → `s08` → [s09](../s09_memory/) → s10 → ... → s20
+> *"上下文总会满, 要有办法腾地方"* — 四层压缩策略, 便宜的先跑贵的后跑。
+>
+> **Harness 层**: 压缩 — 干净的记忆, 无限的会话。
+
+---
+
+## 问题
+
+Agent 跑着跑着，不动了。
+
+手里有 bash、有 read、有 write，能力是够的。但它读了一个 1000 行的文件（~4000 token），又读了 30 个文件，跑了 20 条命令。每条命令的输出、每个文件的内容，全都堆在 `messages` 列表里。
+
+上下文窗口是有限的。满了之后，API 直接拒绝：`prompt_too_long`。
+
+不压缩，Agent 根本没法在大项目里干活。
+
+---
+
+## 解决方案
+
+![Compact Overview](images/compact-overview.svg)
+
+保留 s07 的 hook 结构、技能加载、子 Agent 等骨架，省略部分工具细节以聚焦压缩。核心变动：每轮 LLM 调用前插入三层预处理器（0 API），token 仍超阈值时触发 LLM 摘要（1 API），API 报错时应急裁剪。
+
+核心设计：便宜的先跑，贵的后跑。
+
+---
+
+## 工作原理
+
+![四层压缩管线](images/compaction-layers.svg)
+
+### L1: snip_compact — 裁掉无关的旧对话
+
+Agent 跑了 80 轮对话，`messages` 攒了 160 条。最前面的"帮我创建 hello.py"和当前工作几乎无关了，但全占着位置。
+
+消息数超过 50 条 → 保留头部 3 条（初始上下文）和尾部 47 条（当前工作），中间裁掉：
+
+```python
+def snip_compact(messages, max_messages=50):
+    if len(messages) <= max_messages:
+        return messages
+    keep_head, keep_tail = 3, max_messages - 3
+    snipped = len(messages) - keep_head - keep_tail
+    placeholder = {"role": "user",
+                   "content": f"[snipped {snipped} messages from conversation middle]"}
+    return messages[:keep_head] + [placeholder] + messages[-keep_tail:]
+```
+
+裁掉了整条消息，但剩下的消息里 `tool_result` 内容仍在累积——第 34 条消息里可能躺着 30KB 的旧文件内容。→ L2。
+
+### L2: micro_compact — 旧工具结果占位
+
+![旧结果占位](images/micro-compact.svg)
+
+Agent 连续读了 10 个文件。第 1-7 次的完整内容还躺在上下文里，早就不需要了，但占着大量空间。
+
+只保留最近 3 条 `tool_result` 的完整内容，更旧的替换为一行占位符：
+
+```python
+KEEP_RECENT_TOOL_RESULTS = 3
+
+def micro_compact(messages):
+    tool_results = collect_tool_result_blocks(messages)
+    if len(tool_results) <= KEEP_RECENT_TOOL_RESULTS:
+        return messages
+    for _, _, block in tool_results[:-KEEP_RECENT_TOOL_RESULTS]:
+        # 用 str() 保证列表类型的 content 按字符数而不是元素个数统计
+        if len(str(block.get("content", ""))) > 120:
+            block["content"] = "[Earlier tool result compacted. Re-run if needed.]"
+    return messages
+```
+
+旧结果清掉了，但单条新结果可能就有 500KB——一个 `cat` 大文件的输出就能打满上下文。→ L3。
+
+### L3: tool_result_budget — 大结果落盘
+
+![大结果落盘](images/layer1-budget.svg)
+
+模型一次读了 5 个大文件，单条 user 消息里所有 `tool_result` 加起来 500KB。
+
+统计最后一条 user 消息里所有 `tool_result` 的总大小。超过 200KB → 按大小排序，从最大的开始落盘到 `.task_outputs/tool-results/`，上下文里只留 `<persisted-output>` 标记 + 前 2000 字符预览。模型看到标记后知道完整内容在磁盘上，需要时可以重新读。
+
+```python
+def tool_result_budget(messages, max_bytes=200_000):
+    # 大小用 len(str(...)) 按字符数统计（近似字节数）
+    last = messages[-1]
+    if not isinstance(last.get("content"), list):
+        return messages  # 纯文本消息，没有需要处理的 tool_result
+    blocks = [b for b in last["content"]
+              if isinstance(b, dict) and b.get("type") == "tool_result"]
+
+    def size(block):
+        return len(str(block.get("content", "")))
+
+    total = sum(size(b) for b in blocks)
+    # 从最大的结果开始落盘，直到总量回到预算内
+    for block in sorted(blocks, key=size, reverse=True):
+        if total <= max_bytes:
+            break
+        before = size(block)
+        block["content"] = persist_large_output(block["tool_use_id"], str(block.get("content", "")))
+        total -= before - size(block)
+    return messages
+```
+
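+The first three layers are pure text and structure operations with 0 API calls, but they cannot "understand" the conversation. The context may still be too large. → L4.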
+
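+### L4: compact_history - full LLM summary
+
+![Full LLM summary](images/auto-compact.svg)
+
+All three layers have run, but after 30 minutes of continuous work in a very large project the token count is still over the threshold.
+
+Three steps:
+
+1. **Save the transcript**: write the full conversation to `.transcripts/` in JSONL format. The transcript is a recoverable record, but the model's active context holds only the summary, so for the model's reasoning at that moment the details are no longer in context. The teaching code does not provide a transcript retrieval tool.
+2. **LLM generates a summary**: send the conversation history to the LLM and ask it to preserve the current goal, key findings, changed files, remaining work, and user constraints.
+3. **Replace the message list**: all old messages are replaced by a single summary. The teaching version keeps only the summary; real Claude Code re-attaches some context after compaction, such as recent files, plans, and agent/skill/tool information.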
+
+```python
+def compact_history(messages):
+    transcript_path = write_transcript(messages)  # save the full conversation first
+    summary = summarize_history(messages)          # LLM-generated summary
+    return [{"role": "user",
+             "content": f"[Compacted]\n\n{summary}"}]
+```
+
+**Circuit breaker**: after 3 consecutive failures, stop retrying, so a failing compaction cannot loop forever and waste API calls.
+
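The circuit breaker described above can be sketched as a small wrapper. This is a minimal illustration under stated assumptions: `CompactBreaker`, `MAX_COMPACT_FAILURES`, and the `compact_fn` parameter are hypothetical names, and the `code.py` listing in this chapter calls `compact_history` directly without such a counter. Only the limit of 3 comes from the text.

```python
MAX_COMPACT_FAILURES = 3  # limit taken from the text; the constant name is hypothetical


class CompactBreaker:
    """Counts consecutive compaction failures and stops trying once the limit is hit."""

    def __init__(self, limit=MAX_COMPACT_FAILURES):
        self.limit = limit
        self.failures = 0

    @property
    def tripped(self):
        # Once tripped, compaction is skipped instead of retried.
        return self.failures >= self.limit

    def run(self, compact_fn, messages):
        """Try compact_fn(messages); return the original messages on failure or when tripped."""
        if self.tripped:
            return messages
        try:
            result = compact_fn(messages)
        except Exception:
            self.failures += 1
            return messages
        self.failures = 0  # any success resets the streak
        return result
```

In a loop like `agent_loop`, one would presumably create the breaker once and call `breaker.run(compact_history, messages)` where `compact_history(messages)` is called today, so that a summary request that keeps failing is attempted at most three times in a row.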
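+### Emergency: reactive_compact
+
+Sometimes the API still returns `prompt_too_long` (413), which happens when the context grows faster than compaction is triggered.
+
+That is when **reactive_compact** runs. In the teaching version it saves the transcript, summarizes the history, and keeps only that summary plus the last 5 messages. CC's own retry path is finer-grained and backs off by message group (see the comparison table in the appendix).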
+
+```python
+def reactive_compact(messages):
+    transcript = write_transcript(messages)
+    summary = summarize_history(messages)
+    tail = messages[-5:]
+    return [{"role": "user",
+             "content": f"[Reactive compact]\n\n{summary}"}, *tail]
+```
+
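+reactive compact has a retry cap (1 by default). If the call fails again, the exception is raised instead of looping forever. The full error-recovery logic is left to s11.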
+
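One caveat with a fixed `messages[-5:]` tail: in a tool loop the slice can open on a user message that carries `tool_result` blocks whose matching `tool_use` turn was just cut, and the Messages API is expected to reject a `tool_result` that has no preceding `tool_use`. A minimal sketch of choosing a safer tail boundary follows. `has_tool_result` and `safe_tail` are hypothetical helpers for illustration, not part of `code.py`.

```python
def has_tool_result(message):
    """True if a message is a user turn carrying at least one tool_result block."""
    content = message.get("content")
    return (message.get("role") == "user" and isinstance(content, list)
            and any(isinstance(b, dict) and b.get("type") == "tool_result" for b in content))


def safe_tail(messages, keep=5):
    """Return up to `keep` trailing messages, skipping leading orphaned tool_results."""
    start = max(0, len(messages) - keep)
    # Move the boundary forward until the tail no longer opens with a tool_result
    # whose tool_use partner would be left behind in the discarded part.
    while start < len(messages) and has_tool_result(messages[start]):
        start += 1
    return messages[start:]
```

With this helper, the return value of `reactive_compact` could use `*safe_tail(messages)` in place of `*messages[-5:]`. The tail may then be shorter than 5 messages, which seems an acceptable trade for a request the API will accept.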
+### Putting It Together
+
+```python
+def agent_loop(messages):
+    reactive_retries = 0
+    while True:
+        # three preprocessors (0 API calls)
+        # order: budget runs first so large content is persisted before placeholders and snipping
+        messages[:] = tool_result_budget(messages)    # L3: persist large results
+        messages[:] = snip_compact(messages)          # L1: trim the middle
+        messages[:] = micro_compact(messages)         # L2: placeholders for old results
+
+        # still too big? LLM summary (1 API call)
+        if estimate_size(messages) > CONTEXT_LIMIT:
+            messages[:] = compact_history(messages)
+
+        try:
+            response = client.messages.create(...)
+            reactive_retries = 0  # reset after a successful call
+        except Exception as e:
+            too_long = "prompt_too_long" in str(e).lower()
+            if too_long and reactive_retries < MAX_REACTIVE_RETRIES:
+                messages[:] = reactive_compact(messages)  # emergency path
+                reactive_retries += 1
+                continue
+            raise  # over the retry limit (or a different error): re-raise
+        # ... tool execution ...
+
+        # compact tool: compact_history runs when the model calls it
+        if block.name == "compact":
+            messages[:] = compact_history(messages)
+            # the tool_use turn was summarized away, so send a plain note, not a tool_result
+            messages.append({"role": "user", "content": "[Compacted. History summarized.]"})
+            break  # end the current turn; the next one starts from the compacted context
+```
+
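+**The order matters.** L3 (budget) runs before L2 (micro) because micro replaces old, large tool_result content with a one-line placeholder, so budget has to persist the full content to disk before that happens. This is also why the CC source puts `applyToolResultBudget` first.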
+
+---
+
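+## Changes Relative to s07
+
+| Component | Before (s07) | After (s08) |
+|------|-----------|-----------|
+| Context management | None (context grows without bound) | Four-layer compaction pipeline + emergency path |
+| New functions | (none) | snip_compact, micro_compact, tool_result_budget, compact_history, reactive_compact |
+| Tools | bash, read_file, write_file, edit_file, glob, todo_write, task, load_skill (8) | 8 + compact (9) |
+| Loop | LLM call → tool execution | Three preprocessing layers before each call + threshold-triggered compact_history |
+| Design principle | (none) | Cheap first, expensive last |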
+
+---
+
+## Try It
+
+```sh
+cd learn-claude-code
+python s08_context_compact/code.py
+```
+
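+Try these prompts:
+
+1. `Read the file README.md, then read code.py, then read s01_agent_loop/README.md` (reads several files in a row; watch L2 compact the older results)
+2. `Read every file in s08_context_compact/` (reads a lot of content at once; watch L3 persist to disk)
+3. Keep the conversation going for 20+ turns and watch for `[auto compact]` or `[reactive compact]`
+
+What to watch: after each tool execution, are old tool_results compacted? Once a long conversation pushes the token count over the threshold, does the summary trigger automatically?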
+
+---
+
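+## What's Next
+
+Context compaction lets the agent run for a long time without falling over. But every compaction also discards the preferences and constraints the user gave earlier. Can the agent selectively remember what matters?
+
+s09 Memory → three subsystems: choosing what to remember, extracting key information, and consolidating it. Persists across compactions and across sessions.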
+
+<details>
+<summary>Deep dive into the CC source</summary>
+
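+> The following is based on an analysis of the CC source files `compact.ts`, `autoCompact.ts`, `microCompact.ts`, and `query.ts`.
+
+### Execution Order Comparison
+
+The teaching version numbers the layers L1/L2/L3/L4 for ease of explanation, but the actual execution order does not match the numbering exactly:
+
+| Dimension | Teaching version | Claude Code |
+|------|--------|-------------|
+| Execution order | budget → snip → micro → auto | budget → snip → micro → collapse → auto (`query.ts:379-468`) |
+| snip_compact | Keeps first 3 + last 47 | Enabled on the main thread only; the implementation is not in the open-source repository (`HISTORY_SNIP` feature gate), but the interface is visible: `snipCompactIfNeeded(messages)` → `{ messages, tokensFreed, boundaryMessage? }`. CC also exposes a `SnipTool` that the model can call itself. The 3/47 split in the teaching version is a simplified parameter |
+| micro_compact | Text placeholder replacement | Two paths: time-based clears content directly, cached goes through the API's `cache_edits` (the legacy path has been removed) |
+| micro_compact whitelist | By position (3 most recent) | time-based triggers on a time threshold; cached triggers on a count (`microCompact.ts`) |
+| tool_result_budget | 200,000 characters | 200,000 characters (`toolLimits.ts:49`) |
+| compact_history threshold | Character-count estimate | Exact tokens: `contextWindow - maxOutputTokens - 13_000` |
+| Summary requirements | 5 categories of information | 9 sections + paired `<analysis>`/`<summary>` tags |
+| Compaction prompt | Simple prompt | Guards at both the start and the end that forbid tool calls |
+| PTL retry | Yes (simplified) | `truncateHeadForPTLRetry()` backs off by message group (`compact.ts:243-290`) |
+| Post-compaction restore | None (teaching version keeps only the summary) | Automatically re-reads recent files, plans, agent/skill/tool context, etc. |
+| Circuit breaker | 3 attempts | 3 attempts (`autoCompact.ts:70`) |
+| reactive retries | 1 | CC has finer-grained, tiered retries |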
+
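The `contextWindow - maxOutputTokens - 13_000` threshold from the comparison table can be worked through with a small sketch. The function names and the example window sizes are assumptions for illustration; only the formula and the 13,000-token buffer come from the table.

```python
AUTOCOMPACT_BUFFER_TOKENS = 13_000  # buffer value quoted in the table


def autocompact_threshold(context_window, max_output_tokens,
                          buffer_tokens=AUTOCOMPACT_BUFFER_TOKENS):
    """Token count above which a proactive summary would be triggered."""
    return context_window - max_output_tokens - buffer_tokens


def should_autocompact(estimated_tokens, context_window, max_output_tokens):
    """Compare an estimated prompt size against the threshold."""
    return estimated_tokens > autocompact_threshold(context_window, max_output_tokens)


# Assumed numbers: a 200,000-token window with 8,000 tokens reserved for output.
print(autocompact_threshold(200_000, 8_000))        # 179000
print(should_autocompact(180_000, 200_000, 8_000))  # True
```

The point of subtracting both terms is that the prompt has to leave room for the model's reply plus a safety margin; the teaching version sidesteps this by comparing a character count against a fixed `CONTEXT_LIMIT`.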
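+### Execution Order in Detail
+
+The real order in the CC source `query.ts`:
+
+1. `applyToolResultBudget` (line 379): handles large results first, making sure the full content is persisted to disk
+2. `snipCompact` (line 403): trims middle messages
+3. `microcompact` (line 414): placeholders for old results
+4. `contextCollapse` (line 441): an independent context-management system (not in the teaching version)
+5. `autoCompact` (line 454): full LLM summary
+
+The teaching version's budget → snip → micro order matches this. The teaching version has no contextCollapse mechanism.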
+
+### Full Constants Reference
+
+| Constant | Value | Source file |
+|------|-----|--------|
+| `AUTOCOMPACT_BUFFER_TOKENS` | 13,000 | `autoCompact.ts:62` |
+| `MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES` | 3 | `autoCompact.ts:70` |
+| `MAX_OUTPUT_TOKENS_FOR_SUMMARY` | 20,000 | `autoCompact.ts:30` |
+| `POST_COMPACT_TOKEN_BUDGET` | 50,000 | `compact.ts:123` |
+| `POST_COMPACT_MAX_FILES_TO_RESTORE` | 5 | `compact.ts:122` |
+| `POST_COMPACT_MAX_TOKENS_PER_FILE` | 5,000 | `compact.ts:124` |
+| time-based micro_compact interval | 60 minutes | `timeBasedMCConfig.ts` |
+| `MAX_COMPACT_STREAMING_RETRIES` | 2 | `compact.ts:131` |
+
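The three `POST_COMPACT_*` constants in the table suggest a budgeted restore step after compaction. How CC actually combines them is not shown in this chapter, so the sketch below is only one plausible reading: `plan_restore` is a hypothetical function, and the 4-characters-per-token ratio is a rough assumption rather than a real tokenizer.

```python
POST_COMPACT_TOKEN_BUDGET = 50_000
POST_COMPACT_MAX_FILES_TO_RESTORE = 5
POST_COMPACT_MAX_TOKENS_PER_FILE = 5_000
CHARS_PER_TOKEN = 4  # rough assumption, not a real tokenizer


def plan_restore(recent_files):
    """recent_files: list of (path, text), most recent first.

    Returns (path, truncated_text) pairs that fit the per-file and total budgets.
    """
    restored, budget = [], POST_COMPACT_TOKEN_BUDGET
    for path, text in recent_files[:POST_COMPACT_MAX_FILES_TO_RESTORE]:
        allowed_tokens = min(POST_COMPACT_MAX_TOKENS_PER_FILE, budget)
        if allowed_tokens <= 0:
            break
        snippet = text[:allowed_tokens * CHARS_PER_TOKEN]
        restored.append((path, snippet))
        budget -= -(-len(snippet) // CHARS_PER_TOKEN)  # ceiling division
    return restored
```

The teaching version skips this step entirely and keeps only the summary, which is why the model has to re-read files after a compaction.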
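+### contextCollapse and sessionMemoryCompact
+
+The CC source has two more mechanisms that this teaching version does not cover:
+
+- **contextCollapse**: an independent context-management system. When enabled it suppresses proactive autocompact (`autoCompact.ts:215-222`), and collapse's commit/blocking flow takes over context management. Manual `/compact` and the reactive fallback remain separate paths and are not affected by contextCollapse.
+- **sessionMemoryCompact**: before compact_history, CC first tries a lightweight summary built from existing session memory (covered in s09), without calling the LLM. This mechanism will be clearer when revisited after s09.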
+
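+### What Does the Compaction Prompt Look Like?
+
+CC's compaction prompt has two hard requirements:
+
+1. **Tool calls are strictly forbidden**: it opens with `CRITICAL: Respond with TEXT ONLY. Do NOT call any tools.` and repeats a REMINDER at the end
+2. **Analyze first, then summarize**: the model first works through its reasoning inside `<analysis>` tags, then writes the formal summary inside `<summary>` tags. The analysis is stripped during formatting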
+
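The "analysis is stripped during formatting" step can be illustrated with a short sketch. This is not CC's formatter, which is not shown here: `format_compact_summary` is a hypothetical name, and the fallback for a missing `<summary>` tag is an assumption.

```python
import re


def format_compact_summary(raw):
    """Drop the <analysis> scratchpad and return the text inside <summary>."""
    without_analysis = re.sub(r"<analysis>.*?</analysis>", "", raw, flags=re.DOTALL)
    match = re.search(r"<summary>(.*?)</summary>", without_analysis, flags=re.DOTALL)
    # Fall back to the remaining text if the model omitted the <summary> tags.
    return (match.group(1) if match else without_analysis).strip()


raw_reply = (
    "<analysis>The user asked for hello.py; tests still fail.</analysis>\n"
    "<summary>\nGoal: create hello.py\nRemaining: fix failing test\n</summary>"
)
print(format_compact_summary(raw_reply))
```

Separating the two tags lets the model reason at length without that reasoning occupying space in the compacted context.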
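+### The Teaching Version's Simplifications Are Deliberate
+
+- micro_compact uses text placeholders → we do not have access to the API-level `cache_edits`
+- tokens are estimated from character count → an exact tokenizer is outside the scope of this tutorial
+- post-compaction restore is omitted → the teaching version keeps only the summary and does not re-attach files automatically
+- the two auxiliary mechanisms are not covered → they belong to the last 10% of detail
+
+The core design idea, cheap first and expensive last, is fully preserved.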
+
+</details>
+
+<!-- translation-sync: zh@v1, en@v1, ja@v1 -->
--- a/s08_context_compact/code.py
+++ b/s08_context_compact/code.py
@@ -0,0 +1,469 @@
+#!/usr/bin/env python3
+"""
+s08_context_compact/code.py - Context Compact
+
+Four-layer compaction pipeline inserted before LLM calls:
+
+    L1: snip_compact      — trim middle messages when count > 50
+    L2: micro_compact     — replace old tool_results with placeholders
+    L3: tool_result_budget — persist large results to disk
+    L4: compact_history   — LLM full summary (1 API call)
+
+    Emergency: reactive_compact — when API still returns prompt_too_long
+
+    ┌─────────────────────────────────────────────────────────────┐
+    │  messages[]                                                 │
+    │    ↓                                                        │
+    │  L3 budget ─→ L1 snip ─→ L2 micro ─→ [token > threshold?]  │
+    │                                      ├─ No  → LLM          │
+    │                                      └─ Yes → L4 summary   │
+    │                                              ↓              │
+    │                                          LLM call           │
+    │                                    [prompt_too_long?]        │
+    │                                      └─ Yes → reactive      │
+    └─────────────────────────────────────────────────────────────┘
+
+Core principle: cheap first, expensive last.
+Execution order matches CC source: budget → snip → micro → auto.
+
+Builds on s07 (skill loading). Usage:
+
+    python s08_context_compact/code.py
+    Needs: pip install anthropic python-dotenv + ANTHROPIC_API_KEY in .env
+"""
+
+import os, subprocess, json, time
+from pathlib import Path
+
+try:
+    import readline
+    readline.parse_and_bind('set bind-tty-special-chars off')
+except ImportError:
+    pass
+
+from anthropic import Anthropic
+from dotenv import load_dotenv
+
+load_dotenv(override=True)
+if os.getenv("ANTHROPIC_BASE_URL"): os.environ.pop("ANTHROPIC_AUTH_TOKEN", None)
+
+WORKDIR = Path.cwd()
+SKILLS_DIR = WORKDIR / "skills"
+TRANSCRIPT_DIR = WORKDIR / ".transcripts"
+TOOL_RESULTS_DIR = WORKDIR / ".task_outputs" / "tool-results"
+TASKS_DIR = WORKDIR / ".tasks"; TASKS_DIR.mkdir(exist_ok=True)
+client = Anthropic(base_url=os.getenv("ANTHROPIC_BASE_URL"))
+MODEL = os.environ["MODEL_ID"]
+
+# s07: Skill catalog scan (inherited from s07)
+def _parse_frontmatter(text: str) -> tuple[dict, str]:
+    if not text.startswith("---"):
+        return {}, text
+    parts = text.split("---", 2)
+    if len(parts) < 3:
+        return {}, text
+    meta = {}
+    for line in parts[1].strip().splitlines():
+        if ":" in line:
+            k, v = line.split(":", 1)
+            meta[k.strip()] = v.strip().strip('"').strip("'")
+    return meta, parts[2].strip()
+
+SKILL_REGISTRY: dict[str, dict] = {}
+
+def _scan_skills():
+    if not SKILLS_DIR.exists():
+        return
+    for d in sorted(SKILLS_DIR.iterdir()):
+        if not d.is_dir():
+            continue
+        manifest = d / "SKILL.md"
+        if manifest.exists():
+            raw = manifest.read_text()
+            meta, body = _parse_frontmatter(raw)
+            name = meta.get("name", d.name)
+            desc = meta.get("description", raw.split("\n")[0].lstrip("#").strip())
+            SKILL_REGISTRY[name] = {"name": name, "description": desc, "content": raw}
+
+_scan_skills()
+
+def list_skills() -> str:
+    if not SKILL_REGISTRY:
+        return "(no skills found)"
+    return "\n".join(f"- **{s['name']}**: {s['description']}" for s in SKILL_REGISTRY.values())
+
+def load_skill(name: str) -> str:
+    skill = SKILL_REGISTRY.get(name)
+    if not skill:
+        return f"Skill not found: {name}"
+    return skill["content"]
+
+# s08: SYSTEM includes skill catalog (inherited from s07 build_system)
+def build_system() -> str:
+    catalog = list_skills()
+    return (
+        f"You are a coding agent at {WORKDIR}. "
+        f"Skills available:\n{catalog}\n"
+        "Use load_skill to get full details when needed."
+    )
+
+SYSTEM = build_system()
+
+# s08: subagent gets its own system prompt — no compact, no skill loading
+SUB_SYSTEM = (
+    f"You are a coding agent at {WORKDIR}. "
+    "Complete the task you were given, then return a concise summary. "
+    "Do not delegate further."
+)
+
+
+# ═══════════════════════════════════════════════════════════
+#  FROM s02-s07 (unchanged): Basic Tools
+# ═══════════════════════════════════════════════════════════
+
+def safe_path(p: str) -> Path:
+    path = (WORKDIR / p).resolve()
+    if not path.is_relative_to(WORKDIR): raise ValueError(f"Path escapes workspace: {p}")
+    return path
+
+def run_bash(command: str) -> str:
+    try:
+        r = subprocess.run(command, shell=True, cwd=WORKDIR, capture_output=True, text=True, timeout=120)
+        out = (r.stdout + r.stderr).strip()
+        return out[:50000] if out else "(no output)"
+    except subprocess.TimeoutExpired: return "Error: Timeout (120s)"
+
+def run_read(path: str, limit: int | None = None) -> str:
+    try:
+        lines = safe_path(path).read_text().splitlines()
+        if limit and limit < len(lines): lines = lines[:limit] + [f"... ({len(lines) - limit} more lines)"]
+        return "\n".join(lines)
+    except Exception as e: return f"Error: {e}"
+
+def run_write(path: str, content: str) -> str:
+    try:
+        file_path = safe_path(path); file_path.parent.mkdir(parents=True, exist_ok=True)
+        file_path.write_text(content); return f"Wrote {len(content)} bytes to {path}"
+    except Exception as e: return f"Error: {e}"
+
+def run_edit(path: str, old_text: str, new_text: str) -> str:
+    try:
+        file_path = safe_path(path)
+        text = file_path.read_text()
+        if old_text not in text: return f"Error: text not found in {path}"
+        file_path.write_text(text.replace(old_text, new_text, 1))
+        return f"Edited {path}"
+    except Exception as e: return f"Error: {e}"
+
+def run_glob(pattern: str) -> str:
+    import glob as g
+    try:
+        results = []
+        for match in g.glob(pattern, root_dir=WORKDIR):
+            if (WORKDIR / match).resolve().is_relative_to(WORKDIR):
+                results.append(match)
+        return "\n".join(results) if results else "(no matches)"
+    except Exception as e: return f"Error: {e}"
+
+def run_todo_write(todos: list) -> str:
+    for i, t in enumerate(todos):
+        if "content" not in t or "status" not in t:
+            return f"Error: todos[{i}] missing 'content' or 'status'"
+        if t["status"] not in ("pending", "in_progress", "completed"):
+            return f"Error: todos[{i}] has invalid status '{t['status']}'"
+    tasks_file = TASKS_DIR / "current_todos.json"
+    tasks_file.write_text(json.dumps(todos, indent=2, ensure_ascii=False))
+    lines = ["\n\033[33m## Current Tasks\033[0m"]
+    for t in todos:
+        icon = {"pending": " ", "in_progress": "\033[36m▸\033[0m", "completed": "\033[32m✓\033[0m"}[t["status"]]
+        lines.append(f"  [{icon}] {t['content']}")
+    print("\n".join(lines))
+    return f"Updated {len(todos)} tasks"
+
+def extract_text(content) -> str:
+    if not isinstance(content, list): return str(content)
+    return "\n".join(getattr(b, "text", "") for b in content if getattr(b, "type", None) == "text")
+
+
+# ═══════════════════════════════════════════════════════════
+#  FROM s06-s07 (unchanged): Subagent
+# ═══════════════════════════════════════════════════════════
+
+SUB_TOOLS = [
+    {"name": "bash", "description": "Run a shell command.",
+     "input_schema": {"type": "object", "properties": {"command": {"type": "string"}}, "required": ["command"]}},
+    {"name": "read_file", "description": "Read file contents.",
+     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}}, "required": ["path"]}},
+    {"name": "write_file", "description": "Write content to a file.",
+     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}, "required": ["path", "content"]}},
+    {"name": "edit_file", "description": "Replace exact text in a file once.",
+     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "old_text": {"type": "string"}, "new_text": {"type": "string"}}, "required": ["path", "old_text", "new_text"]}},
+    {"name": "glob", "description": "Find files matching a glob pattern.",
+     "input_schema": {"type": "object", "properties": {"pattern": {"type": "string"}}, "required": ["pattern"]}},
+]
+SUB_HANDLERS = {"bash": run_bash, "read_file": run_read, "write_file": run_write,
+                "edit_file": run_edit, "glob": run_glob}
+
+def spawn_subagent(task: str) -> str:
+    print(f"\n\033[35m[Subagent spawned]\033[0m")
+    messages = [{"role": "user", "content": task}]
+    for _ in range(30):
+        response = client.messages.create(model=MODEL, system=SUB_SYSTEM,
+            messages=messages, tools=SUB_TOOLS, max_tokens=8000)
+        messages.append({"role": "assistant", "content": response.content})
+        if response.stop_reason != "tool_use":
+            break
+        results = []
+        for block in response.content:
+            if block.type == "tool_use":
+                blocked = trigger_hooks("PreToolUse", block)
+                if blocked:
+                    results.append({"type": "tool_result", "tool_use_id": block.id,
+                                    "content": str(blocked)})
+                    continue
+                handler = SUB_HANDLERS.get(block.name)
+                output = handler(**block.input) if handler else f"Unknown: {block.name}"
+                trigger_hooks("PostToolUse", block, output)
+                print(f"  \033[90m[sub] {block.name}: {str(output)[:100]}\033[0m")
+                results.append({"type": "tool_result", "tool_use_id": block.id, "content": output})
+        messages.append({"role": "user", "content": results})
+    result = extract_text(messages[-1]["content"])
+    if not result:
+        for msg in reversed(messages):
+            if msg["role"] == "assistant":
+                result = extract_text(msg["content"])
+                if result:
+                    break
+        if not result:
+            result = "Subagent stopped after 30 turns without final answer."
+    print(f"\033[35m[Subagent done]\033[0m")
+    return result
+
+
+# ═══════════════════════════════════════════════════════════
+#  NEW in s08: Four-Layer Compaction Pipeline
+# ═══════════════════════════════════════════════════════════
+
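+CONTEXT_LIMIT = 50000      # auto-compact threshold, in characters of str(messages), not tokens
+KEEP_RECENT = 3            # tool_results kept verbatim by micro_compact
+PERSIST_THRESHOLD = 30000  # only results longer than this (characters) are persisted to disk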
+
+def estimate_size(msgs): return len(str(msgs))
+
+
+# L1: snipCompact — trim middle messages
+def snip_compact(messages, max_messages=50):
+    if len(messages) <= max_messages: return messages
+    keep_head, keep_tail = 3, max_messages - 3
+    snipped = len(messages) - keep_head - keep_tail
+    return messages[:keep_head] + [{"role": "user", "content": f"[snipped {snipped} messages]"}] + messages[-keep_tail:]
+
+
+# L2: microCompact — old result placeholders
+def collect_tool_results(messages):
+    blocks = []
+    for mi, msg in enumerate(messages):
+        if msg.get("role") != "user" or not isinstance(msg.get("content"), list): continue
+        for bi, block in enumerate(msg["content"]):
+            if isinstance(block, dict) and block.get("type") == "tool_result":
+                blocks.append((mi, bi, block))
+    return blocks
+
+def micro_compact(messages):
+    tool_results = collect_tool_results(messages)
+    if len(tool_results) <= KEEP_RECENT: return messages
+    for _, _, block in tool_results[:-KEEP_RECENT]:
+        if len(block.get("content", "")) > 120:
+            block["content"] = "[Earlier tool result compacted. Re-run if needed.]"
+    return messages
+
+
+# L3: toolResultBudget — persist large results to disk
+def persist_large_output(tool_use_id, output):
+    if len(output) <= PERSIST_THRESHOLD: return output
+    TOOL_RESULTS_DIR.mkdir(parents=True, exist_ok=True)
+    path = TOOL_RESULTS_DIR / f"{tool_use_id}.txt"
+    if not path.exists(): path.write_text(output)
+    return f"<persisted-output>\nFull output: {path}\nPreview:\n{output[:2000]}\n</persisted-output>"
+
+def tool_result_budget(messages, max_bytes=200_000):
+    last = messages[-1] if messages else None
+    if not last or last.get("role") != "user" or not isinstance(last.get("content"), list): return messages
+    blocks = [(i, b) for i, b in enumerate(last["content"]) if isinstance(b, dict) and b.get("type") == "tool_result"]
+    total = sum(len(str(b.get("content", ""))) for _, b in blocks)
+    if total <= max_bytes: return messages
+    ranked = sorted(blocks, key=lambda p: len(str(p[1].get("content", ""))), reverse=True)
+    for _, block in ranked:
+        if total <= max_bytes: break
+        content = str(block.get("content", ""))
+        if len(content) <= PERSIST_THRESHOLD: continue
+        tid = block.get("tool_use_id", "unknown")
+        block["content"] = persist_large_output(tid, content)
+        total = sum(len(str(b.get("content", ""))) for _, b in blocks)
+    return messages
+
+
+# L4: autoCompact — LLM full summary
+def write_transcript(messages):
+    TRANSCRIPT_DIR.mkdir(parents=True, exist_ok=True)
+    path = TRANSCRIPT_DIR / f"transcript_{int(time.time())}.jsonl"
+    with path.open("w") as f:
+        for msg in messages: f.write(json.dumps(msg, default=str) + "\n")
+    return path
+
+def summarize_history(messages):
+    conversation = json.dumps(messages, default=str)[:80000]
+    prompt = ("Summarize this coding-agent conversation so work can continue.\n"
+              "Preserve: 1. current goal, 2. key findings/decisions, 3. files read/changed, "
+              "4. remaining work, 5. user constraints.\nBe compact but concrete.\n\n" + conversation)
+    response = client.messages.create(model=MODEL, messages=[{"role": "user", "content": prompt}], max_tokens=2000)
+    return "\n".join(
+        getattr(block, "text", "")
+        for block in response.content
+        if getattr(block, "type", None) == "text").strip() or "(empty summary)"
+
+def compact_history(messages):
+    transcript_path = write_transcript(messages)
+    print(f"[transcript saved: {transcript_path}]")
+    summary = summarize_history(messages)
+    return [{"role": "user", "content": f"[Compacted]\n\n{summary}"}]
+
+
+# Emergency: reactiveCompact — on API error
+def reactive_compact(messages):
+    transcript = write_transcript(messages)
+    summary = summarize_history(messages)
+    return [{"role": "user", "content": f"[Reactive compact]\n\n{summary}"}, *messages[-5:]]
+
+
+# ═══════════════════════════════════════════════════════════
+#  FROM s07: Tool Definitions
+# ═══════════════════════════════════════════════════════════
+
+TOOLS = [
+    {"name": "bash", "description": "Run a shell command.",
+     "input_schema": {"type": "object", "properties": {"command": {"type": "string"}}, "required": ["command"]}},
+    {"name": "read_file", "description": "Read file contents.",
+     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "limit": {"type": "integer"}}, "required": ["path"]}},
+    {"name": "write_file", "description": "Write content to a file.",
+     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}, "required": ["path", "content"]}},
+    {"name": "edit_file", "description": "Replace exact text in a file once.",
+     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "old_text": {"type": "string"}, "new_text": {"type": "string"}}, "required": ["path", "old_text", "new_text"]}},
+    {"name": "glob", "description": "Find files matching a glob pattern.",
+     "input_schema": {"type": "object", "properties": {"pattern": {"type": "string"}}, "required": ["pattern"]}},
+    {"name": "todo_write", "description": "Create and manage a task list for your current coding session.",
+     "input_schema": {"type": "object", "properties": {"todos": {"type": "array", "items": {"type": "object", "properties": {"content": {"type": "string"}, "status": {"type": "string", "enum": ["pending", "in_progress", "completed"]}}, "required": ["content", "status"]}}}, "required": ["todos"]}},
+    {"name": "task", "description": "Launch a subagent to handle a complex subtask. Returns only the final conclusion.",
+     "input_schema": {"type": "object", "properties": {"description": {"type": "string"}}, "required": ["description"]}},
+    {"name": "load_skill", "description": "Load the full content of a skill by name.",
+     "input_schema": {"type": "object", "properties": {"name": {"type": "string"}}, "required": ["name"]}},
+    # s08 change: new compact tool — triggers compact_history, not a no-op
+    {"name": "compact", "description": "Summarize earlier conversation to free context space.",
+     "input_schema": {"type": "object", "properties": {"focus": {"type": "string"}}}},
+]
+
+TOOL_HANDLERS = {
+    "bash": run_bash, "read_file": run_read, "write_file": run_write,
+    "edit_file": run_edit, "glob": run_glob, "todo_write": run_todo_write,
+    "task": spawn_subagent, "load_skill": load_skill,
+}
+
+# FROM s04 (unchanged): Hooks
+HOOKS = {"PreToolUse": [], "PostToolUse": []}
+def trigger_hooks(event, *args):
+    for cb in HOOKS[event]:
+        r = cb(*args)
+        if r is not None: return r
+    return None
+
+DENY_LIST = ["rm -rf /", "sudo", "shutdown"]
+def permission_hook(block):
+    if block.name == "bash":
+        for p in DENY_LIST:
+            if p in block.input.get("command", ""): return "Permission denied"
+    return None
+def log_hook(block):
+    print(f"\033[90m[HOOK] {block.name}\033[0m")
+    return None
+
+HOOKS["PreToolUse"].append(permission_hook)
+HOOKS["PreToolUse"].append(log_hook)
+
+
+# ═══════════════════════════════════════════════════════════
+#  agent_loop — s08 core: run compaction pipeline before LLM
+# ═══════════════════════════════════════════════════════════
+
+MAX_REACTIVE_RETRIES = 1  # retry limit for reactive compact
+
+def agent_loop(messages: list):
+    reactive_retries = 0
+    while True:
+        # s08 change: three preprocessors (0 API calls, cheap first)
+        # Order matches CC source: budget → snip → micro
+        messages[:] = tool_result_budget(messages)    # L3: persist large results first
+        messages[:] = snip_compact(messages)          # L1: trim middle
+        messages[:] = micro_compact(messages)         # L2: old result placeholders
+
+        # s08 change: tokens still over threshold → LLM summary (1 API call)
+        if estimate_size(messages) > CONTEXT_LIMIT:
+            print("[auto compact]")
+            messages[:] = compact_history(messages)
+
+        try:
+            response = client.messages.create(model=MODEL, system=SYSTEM, messages=messages, tools=TOOLS, max_tokens=8000)
+            reactive_retries = 0  # reset on successful API call
+        except Exception as e:
+            if ("prompt_too_long" in str(e).lower() or "too many tokens" in str(e).lower()) and reactive_retries < MAX_REACTIVE_RETRIES:
+                print("[reactive compact]")
+                messages[:] = reactive_compact(messages)
+                reactive_retries += 1
+                continue
+            raise
+
+        messages.append({"role": "assistant", "content": response.content})
+        if response.stop_reason != "tool_use": return
+
+        results = []
+        for block in response.content:
+            if block.type != "tool_use": continue
+            print(f"\033[36m> {block.name}\033[0m")
+
+            # s08: compact tool triggers compact_history, not a no-op string
+            if block.name == "compact":
+                messages[:] = compact_history(messages)
+                # The assistant tool_use turn is gone after compaction, so a tool_result
+                # here would have no matching tool_use; append a plain-text note instead.
+                messages.append({"role": "user", "content": "[Compacted. Conversation history has been summarized.]"})
+                break  # end current turn, start fresh with compacted context
+
+            blocked = trigger_hooks("PreToolUse", block)
+            if blocked:
+                results.append({"type": "tool_result", "tool_use_id": block.id, "content": str(blocked)})
+                continue
+            handler = TOOL_HANDLERS.get(block.name)
+            output = handler(**block.input) if handler else f"Unknown: {block.name}"
+            trigger_hooks("PostToolUse", block, output)
+            print(str(output)[:200])
+            results.append({"type": "tool_result", "tool_use_id": block.id, "content": str(output)})
+        else:
+            # normal path: no compact was called
+            messages.append({"role": "user", "content": results})
+            continue
+        # compact was called: results already appended above
+        continue
+
+
+if __name__ == "__main__":
+    print("s08: Context Compact — four-layer compaction pipeline")
+    print("Type a question and press Enter. Type q to quit.\n")
+    history = []
+    while True:
+        try: query = input("\033[36ms08 >> \033[0m")
+        except (EOFError, KeyboardInterrupt): break
+        if query.strip().lower() in ("q", "exit", ""): break
+        history.append({"role": "user", "content": query})
+        agent_loop(history)
+        for block in history[-1]["content"]:
+            if getattr(block, "type", None) == "text": print(block.text)
+        print()
--- a/s08_context_compact/images/auto-compact.en.svg
+++ b/s08_context_compact/images/auto-compact.en.svg
@@ -0,0 +1,72 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 400" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#991b1b"/><stop offset="100%" stop-color="#dc2626"/>
+    </linearGradient>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#dc2626"/>
+    </marker>
+  </defs>
+
+  <rect width="720" height="400" fill="#fafbfc" rx="8"/>
+  <rect x="0" y="0" width="720" height="38" fill="url(#header)" rx="8"/>
+  <rect x="0" y="30" width="720" height="8" fill="url(#header)"/>
+  <text x="360" y="25" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">L4: autoCompact — LLM Full Summary</text>
+
+  <!-- Trigger Condition -->
+  <rect x="20" y="54" width="680" height="44" rx="6" fill="#fef2f2" stroke="#fca5a5" stroke-width="1"/>
+  <text x="35" y="70" fill="#991b1b" font-size="11" font-weight="600">Trigger Condition</text>
+  <text x="140" y="70" fill="#991b1b" font-size="11">All three preprocessing layers have run, estimated tokens &gt; contextWindow - maxOutputTokens - 13_000.</text>
+  <text x="140" y="86" fill="#991b1b" font-size="10">Tries sessionMemoryCompact first (lightweight summary from existing memory), only calls LLM if insufficient.</text>
+
+  <!-- Steps -->
+  <rect x="20" y="106" width="200" height="110" rx="8" fill="#fff" stroke="#94a3b8" stroke-width="1.5"/>
+  <text x="120" y="130" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">Step 1: Save transcript</text>
+  <text x="40" y="152" fill="#475569" font-size="10">Write full conversation to .transcripts/</text>
+  <text x="40" y="168" fill="#475569" font-size="10">JSONL format, one message per line</text>
+  <text x="40" y="184" fill="#475569" font-size="10">Filename: transcript_{timestamp}.jsonl</text>
+  <text x="40" y="200" fill="#94a3b8" font-size="9">No data lost, just moved out of active area</text>
+
+  <line x1="225" y1="161" x2="265" y2="161" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <rect x="270" y="106" width="200" height="110" rx="8" fill="#fff" stroke="#94a3b8" stroke-width="1.5"/>
+  <text x="370" y="130" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">Step 2: LLM generates summary</text>
+  <text x="290" y="152" fill="#475569" font-size="10">Send conversation history to LLM</text>
+  <text x="290" y="166" fill="#475569" font-size="9">Summary must include 9 sections:</text>
+  <text x="290" y="180" fill="#94a3b8" font-size="8">request · concepts · files · errors · resolutions</text>
+  <text x="290" y="192" fill="#94a3b8" font-size="8">user messages · todos · current state · next steps</text>
+  <text x="290" y="206" fill="#94a3b8" font-size="9">Generated only once</text>
+
+  <line x1="475" y1="161" x2="515" y2="161" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <rect x="520" y="106" width="180" height="110" rx="8" fill="#fef2f2" stroke="#dc2626" stroke-width="2"/>
+  <text x="610" y="130" fill="#991b1b" font-size="12" font-weight="700" text-anchor="middle">Step 3: Replace message list</text>
+  <text x="540" y="152" fill="#991b1b" font-size="10">All old messages → 1 summary</text>
+  <text x="540" y="168" fill="#991b1b" font-size="10">Model continues from summary</text>
+  <text x="540" y="184" fill="#991b1b" font-size="10">Includes recently_read file list</text>
+  <text x="540" y="200" fill="#ef4444" font-size="9">⚠ This is an irreversible operation</text>
+
+  <!-- Before/After comparison -->
+  <rect x="20" y="234" width="320" height="94" rx="6" fill="#fff" stroke="#94a3b8" stroke-width="1"/>
+  <text x="180" y="256" fill="#64748b" font-size="11" font-weight="600" text-anchor="middle">Before messages</text>
+  <rect x="35" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="40" y="276" fill="#475569" font-size="8">user</text>
+  <rect x="92" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="97" y="276" fill="#475569" font-size="8">assistant</text>
+  <rect x="149" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="154" y="276" fill="#475569" font-size="8">user</text>
+  <rect x="206" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="211" y="276" fill="#475569" font-size="8">assistant</text>
+  <rect x="263" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="268" y="276" fill="#475569" font-size="8">user</text>
+  <text x="180" y="318" fill="#94a3b8" font-size="9" text-anchor="middle">~180 messages, occupying 62K tokens</text>
+
+  <line x1="345" y1="281" x2="375" y2="281" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <rect x="380" y="234" width="320" height="94" rx="6" fill="#fef2f2" stroke="#dc2626" stroke-width="1"/>
+  <text x="540" y="256" fill="#991b1b" font-size="11" font-weight="600" text-anchor="middle">After messages</text>
+  <rect x="395" y="264" width="290" height="32" rx="4" fill="#fee2e2" stroke="#fca5a5" stroke-width="0.5"/>
+  <text x="540" y="276" fill="#991b1b" font-size="9" text-anchor="middle">[Compacted] Summary: goal → create hello.py ...</text>
+  <text x="540" y="290" fill="#991b1b" font-size="9" text-anchor="middle">Recent files: hello.py, README.md ...</text>
+  <text x="540" y="318" fill="#94a3b8" font-size="9" text-anchor="middle">1 message, occupying ~1K tokens</text>
+
+  <!-- Circuit breaker -->
+  <rect x="20" y="340" width="680" height="36" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="35" y="362" fill="#475569" font-size="11" font-weight="600">Circuit breaker:</text>
+  <text x="130" y="362" fill="#475569" font-size="10">3 consecutive autoCompact failures → stop retrying. Prevents wasting API calls when context is unrecoverable.</text>
+</svg>
--- a/s08_context_compact/images/auto-compact.ja.svg
+++ b/s08_context_compact/images/auto-compact.ja.svg
@@ -0,0 +1,72 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 400" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#991b1b"/><stop offset="100%" stop-color="#dc2626"/>
+    </linearGradient>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#dc2626"/>
+    </marker>
+  </defs>
+
+  <rect width="720" height="400" fill="#fafbfc" rx="8"/>
+  <rect x="0" y="0" width="720" height="38" fill="url(#header)" rx="8"/>
+  <rect x="0" y="30" width="720" height="8" fill="url(#header)"/>
+  <text x="360" y="25" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">L4: autoCompact — LLM 完全要約</text>
+
+  <!-- トリガー条件 -->
+  <rect x="20" y="54" width="680" height="44" rx="6" fill="#fef2f2" stroke="#fca5a5" stroke-width="1"/>
+  <text x="35" y="70" fill="#991b1b" font-size="11" font-weight="600">トリガー条件</text>
+  <text x="115" y="70" fill="#991b1b" font-size="11">前 3 層の前処理を全て実行後、推定 token &gt; contextWindow - maxOutputTokens - 13_000。</text>
+  <text x="115" y="86" fill="#991b1b" font-size="10">まず sessionMemoryCompact を試行（既存のメモリで軽量要約）、不足時のみ LLM を呼び出し。</text>
+
+  <!-- ステップ -->
+  <rect x="20" y="106" width="200" height="110" rx="8" fill="#fff" stroke="#94a3b8" stroke-width="1.5"/>
+  <text x="120" y="130" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">ステップ 1：transcript 保存</text>
+  <text x="40" y="152" fill="#475569" font-size="10">完全な対話を .transcripts/ に書き込み</text>
+  <text x="40" y="168" fill="#475569" font-size="10">JSONL 形式、1 行 1 メッセージ</text>
+  <text x="40" y="184" fill="#475569" font-size="10">ファイル名：transcript_{timestamp}.jsonl</text>
+  <text x="40" y="200" fill="#94a3b8" font-size="9">情報は失われていない、アクティブ領域から移動のみ</text>
+
+  <line x1="225" y1="161" x2="265" y2="161" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <rect x="270" y="106" width="200" height="110" rx="8" fill="#fff" stroke="#94a3b8" stroke-width="1.5"/>
+  <text x="370" y="130" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">ステップ 2：LLM 要約生成</text>
+  <text x="290" y="152" fill="#475569" font-size="10">対話履歴を LLM に送信</text>
+  <text x="290" y="166" fill="#475569" font-size="9">要約は 9 つのセクションを含む：</text>
+  <text x="290" y="180" fill="#94a3b8" font-size="8">リクエスト・概念・ファイル・エラー・解決</text>
+  <text x="290" y="192" fill="#94a3b8" font-size="8">ユーザーメッセージ・TODO・現在・次ステップ</text>
+  <text x="290" y="206" fill="#94a3b8" font-size="9">1 回のみ生成</text>
+
+  <line x1="475" y1="161" x2="515" y2="161" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <rect x="520" y="106" width="180" height="110" rx="8" fill="#fef2f2" stroke="#dc2626" stroke-width="2"/>
+  <text x="610" y="130" fill="#991b1b" font-size="12" font-weight="700" text-anchor="middle">ステップ 3：メッセージリスト置換</text>
+  <text x="540" y="152" fill="#991b1b" font-size="10">全旧メッセージ → 1 件の要約に</text>
+  <text x="540" y="168" fill="#991b1b" font-size="10">モデルは要約から作業を継続</text>
+  <text x="540" y="184" fill="#991b1b" font-size="10">recently_read ファイルリストを付与</text>
+  <text x="540" y="200" fill="#ef4444" font-size="9">⚠ これは復元不可能な操作</text>
+
+  <!-- 圧縮前/後 比較 -->
+  <rect x="20" y="234" width="320" height="94" rx="6" fill="#fff" stroke="#94a3b8" stroke-width="1"/>
+  <text x="180" y="256" fill="#64748b" font-size="11" font-weight="600" text-anchor="middle">圧縮前 messages</text>
+  <rect x="35" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="40" y="276" fill="#475569" font-size="8">user</text>
+  <rect x="92" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="97" y="276" fill="#475569" font-size="8">assistant</text>
+  <rect x="149" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="154" y="276" fill="#475569" font-size="8">user</text>
+  <rect x="206" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="211" y="276" fill="#475569" font-size="8">assistant</text>
+  <rect x="263" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="268" y="276" fill="#475569" font-size="8">user</text>
+  <text x="180" y="318" fill="#94a3b8" font-size="9" text-anchor="middle">~180 件のメッセージ、62K トークンを占有</text>
+
+  <line x1="345" y1="281" x2="375" y2="281" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <rect x="380" y="234" width="320" height="94" rx="6" fill="#fef2f2" stroke="#dc2626" stroke-width="1"/>
+  <text x="540" y="256" fill="#991b1b" font-size="11" font-weight="600" text-anchor="middle">圧縮後 messages</text>
+  <rect x="395" y="264" width="290" height="32" rx="4" fill="#fee2e2" stroke="#fca5a5" stroke-width="0.5"/>
+  <text x="540" y="276" fill="#991b1b" font-size="9" text-anchor="middle">[Compacted] 要約：目標 → hello.py を作成 ...</text>
+  <text x="540" y="290" fill="#991b1b" font-size="9" text-anchor="middle">最近のファイル：hello.py, README.md ...</text>
+  <text x="540" y="318" fill="#94a3b8" font-size="9" text-anchor="middle">1 件のメッセージ、~1K トークンを占有</text>
+
+  <!-- サーキットブレーカー -->
+  <rect x="20" y="340" width="680" height="36" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="35" y="362" fill="#475569" font-size="11" font-weight="600">サーキットブレーカー：</text>
+  <text x="145" y="362" fill="#475569" font-size="10">autocompact が連続 3 回失敗 → リトライ停止。コンテキストが復元不可能な場合の API 呼び出しの無駄な反復を防止。</text>
+</svg>
--- a/s08_context_compact/images/auto-compact.svg
+++ b/s08_context_compact/images/auto-compact.svg
@@ -0,0 +1,72 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 400" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#991b1b"/><stop offset="100%" stop-color="#dc2626"/>
+    </linearGradient>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#dc2626"/>
+    </marker>
+  </defs>
+
+  <rect width="720" height="400" fill="#fafbfc" rx="8"/>
+  <rect x="0" y="0" width="720" height="38" fill="url(#header)" rx="8"/>
+  <rect x="0" y="30" width="720" height="8" fill="url(#header)"/>
+  <text x="360" y="25" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">L4: autoCompact — LLM 全量摘要</text>
+
+  <!-- 触发条件 -->
+  <rect x="20" y="54" width="680" height="44" rx="6" fill="#fef2f2" stroke="#fca5a5" stroke-width="1"/>
+  <text x="35" y="70" fill="#991b1b" font-size="11" font-weight="600">触发条件</text>
+  <text x="105" y="70" fill="#991b1b" font-size="11">前三层预处理全跑完，估算 token &gt; contextWindow - maxOutputTokens - 13_000。</text>
+  <text x="105" y="86" fill="#991b1b" font-size="10">先尝试 sessionMemoryCompact（用已有记忆做轻量摘要），不足才调 LLM。</text>
+
+  <!-- 步骤 -->
+  <rect x="20" y="106" width="200" height="110" rx="8" fill="#fff" stroke="#94a3b8" stroke-width="1.5"/>
+  <text x="120" y="130" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">步骤 1：保存 transcript</text>
+  <text x="40" y="152" fill="#475569" font-size="10">完整对话写入 .transcripts/</text>
+  <text x="40" y="168" fill="#475569" font-size="10">JSONL 格式，一行一条消息</text>
+  <text x="40" y="184" fill="#475569" font-size="10">文件名：transcript_{timestamp}.jsonl</text>
+  <text x="40" y="200" fill="#94a3b8" font-size="9">信息没有丢失，只是移出活跃区</text>
+
+  <line x1="225" y1="161" x2="265" y2="161" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <rect x="270" y="106" width="200" height="110" rx="8" fill="#fff" stroke="#94a3b8" stroke-width="1.5"/>
+  <text x="370" y="130" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">步骤 2：LLM 生成摘要</text>
+  <text x="290" y="152" fill="#475569" font-size="10">把对话历史发给 LLM</text>
+  <text x="290" y="166" fill="#475569" font-size="9">摘要需包含 9 个部分：</text>
+  <text x="290" y="180" fill="#94a3b8" font-size="8">请求·概念·文件·错误·解决</text>
+  <text x="290" y="192" fill="#94a3b8" font-size="8">用户消息·待办·当前·下一步</text>
+  <text x="290" y="206" fill="#94a3b8" font-size="9">只生成一次</text>
+
+  <line x1="475" y1="161" x2="515" y2="161" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <rect x="520" y="106" width="180" height="110" rx="8" fill="#fef2f2" stroke="#dc2626" stroke-width="2"/>
+  <text x="610" y="130" fill="#991b1b" font-size="12" font-weight="700" text-anchor="middle">步骤 3：替换消息列表</text>
+  <text x="540" y="152" fill="#991b1b" font-size="10">所有旧消息 → 1 条摘要</text>
+  <text x="540" y="168" fill="#991b1b" font-size="10">模型从摘要继续工作</text>
+  <text x="540" y="184" fill="#991b1b" font-size="10">附带 recently_read 文件列表</text>
+  <text x="540" y="200" fill="#ef4444" font-size="9">⚠ 这是无法恢复的操作</text>
+
+  <!-- Before/After 对比 -->
+  <rect x="20" y="234" width="320" height="94" rx="6" fill="#fff" stroke="#94a3b8" stroke-width="1"/>
+  <text x="180" y="256" fill="#64748b" font-size="11" font-weight="600" text-anchor="middle">压缩前 messages</text>
+  <rect x="35" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="40" y="276" fill="#475569" font-size="8">user</text>
+  <rect x="92" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="97" y="276" fill="#475569" font-size="8">assistant</text>
+  <rect x="149" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="154" y="276" fill="#475569" font-size="8">user</text>
+  <rect x="206" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="211" y="276" fill="#475569" font-size="8">assistant</text>
+  <rect x="263" y="264" width="52" height="16" rx="3" fill="#e2e8f0"/><text x="268" y="276" fill="#475569" font-size="8">user</text>
+  <text x="180" y="318" fill="#94a3b8" font-size="9" text-anchor="middle">~180 条消息，占 62K token</text>
+
+  <line x1="345" y1="281" x2="375" y2="281" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <rect x="380" y="234" width="320" height="94" rx="6" fill="#fef2f2" stroke="#dc2626" stroke-width="1"/>
+  <text x="540" y="256" fill="#991b1b" font-size="11" font-weight="600" text-anchor="middle">压缩后 messages</text>
+  <rect x="395" y="264" width="290" height="32" rx="4" fill="#fee2e2" stroke="#fca5a5" stroke-width="0.5"/>
+  <text x="540" y="276" fill="#991b1b" font-size="9" text-anchor="middle">[Compacted] 摘要：目标 → 创建 hello.py ...</text>
+  <text x="540" y="290" fill="#991b1b" font-size="9" text-anchor="middle">最近文件：hello.py, README.md ...</text>
+  <text x="540" y="318" fill="#94a3b8" font-size="9" text-anchor="middle">1 条消息，占 ~1K token</text>
+
+  <!-- 熔断器 -->
+  <rect x="20" y="340" width="680" height="36" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="35" y="362" fill="#475569" font-size="11" font-weight="600">熔断器：</text>
+  <text x="95" y="362" fill="#475569" font-size="10">连续 autocompact 失败 3 次 → 停止重试。防止上下文不可恢复时反复浪费 API 调用。</text>
+</svg>
--- a/s08_context_compact/images/compact-overview.en.svg
+++ b/s08_context_compact/images/compact-overview.en.svg
@@ -0,0 +1,138 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 820 520" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#555"/>
+    </marker>
+    <marker id="arrow-blue" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#2563eb"/>
+    </marker>
+    <marker id="arrow-amber" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#d97706"/>
+    </marker>
+    <marker id="arrow-green" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#16a34a"/>
+    </marker>
+    <marker id="arrow-red" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#dc2626"/>
+    </marker>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/>
+      <stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+  </defs>
+
+  <!-- Background -->
+  <rect width="820" height="520" fill="#fafbfc" rx="8"/>
+
+  <!-- Title -->
+  <rect x="0" y="0" width="820" height="48" fill="url(#header)" rx="8"/>
+  <rect x="0" y="40" width="820" height="8" fill="url(#header)"/>
+  <text x="410" y="31" fill="#fff" font-size="16" font-weight="700" text-anchor="middle">Context Compact — Compression Before LLM Call, Three Trigger Modes</text>
+
+  <!-- Labels -->
+  <text x="50" y="74" fill="#94a3b8" font-size="11" font-weight="600">s07 Preserved</text>
+  <text x="180" y="74" fill="#d97706" font-size="11" font-weight="600">s08 New</text>
+
+  <!-- ===== ① messages[] ===== -->
+  <rect x="40" y="132" width="100" height="52" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="90" y="155" fill="#1e3a5f" font-size="12" font-weight="600" text-anchor="middle">messages[]</text>
+  <text x="90" y="172" fill="#64748b" font-size="9" text-anchor="middle">(s07 preserved)</text>
+
+  <!-- messages → pipeline entry -->
+  <line x1="140" y1="158" x2="168" y2="158" stroke="#d97706" stroke-width="2" marker-end="url(#arrow-amber)"/>
+
+  <!-- ===== ② Compression Pipeline ===== -->
+  <rect x="170" y="82" width="200" height="252" rx="10" fill="#fffbeb" stroke="#d97706" stroke-width="2"/>
+  <text x="270" y="102" fill="#92400e" font-size="11" font-weight="700" text-anchor="middle">Compression Pipeline</text>
+
+  <!-- ── ① Every Turn Auto ── -->
+  <rect x="186" y="110" width="168" height="16" rx="3" fill="#fde68a" stroke="#d97706" stroke-width="0.8"/>
+  <text x="270" y="122" fill="#92400e" font-size="8" font-weight="700" text-anchor="middle">① Every Turn · Unconditional · 0 API</text>
+
+  <rect x="186" y="130" width="168" height="24" rx="4" fill="#fef3c7" stroke="#d97706" stroke-width="1"/>
+  <text x="270" y="146" fill="#92400e" font-size="10" font-weight="600" text-anchor="middle">L3 tool_result_budget</text>
+
+  <rect x="186" y="158" width="168" height="24" rx="4" fill="#fef3c7" stroke="#d97706" stroke-width="1"/>
+  <text x="270" y="174" fill="#92400e" font-size="10" font-weight="600" text-anchor="middle">L1 snip_compact</text>
+
+  <rect x="186" y="186" width="168" height="24" rx="4" fill="#fef3c7" stroke="#d97706" stroke-width="1"/>
+  <text x="270" y="202" fill="#92400e" font-size="10" font-weight="600" text-anchor="middle">L2 micro_compact</text>
+
+  <!-- ↓ → ◇ -->
+  <line x1="270" y1="210" x2="270" y2="222" stroke="#555" stroke-width="1.2" marker-end="url(#arrow)"/>
+
+  <!-- ◇ Decision Diamond -->
+  <polygon points="270,226 300,244 270,262 240,244" fill="#f0f4ff" stroke="#ea580c" stroke-width="1.5"/>
+  <text x="270" y="247" fill="#9a3412" font-size="7" font-weight="600" text-anchor="middle">Over threshold?</text>
+
+  <!-- No: right annotation -->
+  <text x="306" y="240" fill="#16a34a" font-size="9" font-weight="700">No → Pass</text>
+  <text x="306" y="252" fill="#94a3b8" font-size="7">Straight to LLM</text>
+
+  <!-- Yes: below annotation -->
+  <text x="284" y="260" fill="#ea580c" font-size="8" font-weight="600">Yes↓</text>
+
+  <!-- ── ② Conditional Trigger ── -->
+  <rect x="186" y="268" width="168" height="16" rx="3" fill="#fed7aa" stroke="#ea580c" stroke-width="0.8"/>
+  <text x="270" y="280" fill="#9a3412" font-size="8" font-weight="700" text-anchor="middle">② Conditional · Token Over Threshold · 1 API</text>
+
+  <rect x="186" y="288" width="168" height="24" rx="4" fill="#fed7aa" stroke="#ea580c" stroke-width="1"/>
+  <text x="270" y="304" fill="#9a3412" font-size="10" font-weight="600" text-anchor="middle">L4 compact_history</text>
+
+  <!-- Pipeline exit → LLM -->
+  <line x1="370" y1="158" x2="438" y2="158" stroke="#2563eb" stroke-width="2" marker-end="url(#arrow-blue)"/>
+
+  <!-- ===== ③ LLM ===== -->
+  <rect x="440" y="132" width="100" height="52" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="490" y="155" fill="#1e3a5f" font-size="14" font-weight="700" text-anchor="middle">LLM</text>
+  <text x="490" y="172" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
+
+  <!-- LLM No → Return -->
+  <line x1="490" y1="184" x2="490" y2="278" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow-green)"/>
+  <text x="502" y="262" fill="#16a34a" font-size="10" font-weight="600">No</text>
+  <rect x="435" y="280" width="110" height="26" rx="13" fill="#dcfce7" stroke="#16a34a" stroke-width="1.5"/>
+  <text x="490" y="297" fill="#166534" font-size="11" font-weight="600" text-anchor="middle">Return Result</text>
+
+  <!-- LLM Yes → TOOL_HANDLERS -->
+  <line x1="540" y1="158" x2="578" y2="158" stroke="#555" stroke-width="2" marker-end="url(#arrow)"/>
+  <text x="554" y="150" fill="#64748b" font-size="10" font-weight="600">Yes</text>
+
+  <!-- ④ TOOL_HANDLERS -->
+  <rect x="580" y="126" width="130" height="64" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="645" y="150" fill="#1e3a5f" font-size="10" font-weight="600" text-anchor="middle">TOOL_HANDLERS</text>
+  <text x="645" y="166" fill="#64748b" font-size="9" text-anchor="middle">bash · read · write</text>
+  <text x="645" y="180" fill="#64748b" font-size="9" text-anchor="middle">task · load_skill · ...</text>
+
+  <!-- LLM API error → emergency compact → retry next turn -->
+  <path d="M 535 184 L 570 216 L 580 228" fill="none" stroke="#dc2626" stroke-width="1.5" stroke-dasharray="4,3" marker-end="url(#arrow-red)"/>
+  <text x="552" y="204" fill="#991b1b" font-size="8" font-weight="600">API error</text>
+  <path d="M 665 266 L 665 340 L 160 340 L 160 142 L 186 142" fill="none" stroke="#dc2626" stroke-width="1.5" stroke-dasharray="4,3" marker-end="url(#arrow-red)"/>
+  <text x="530" y="328" fill="#991b1b" font-size="8" font-weight="600">retry via compression pipeline</text>
+
+  <!-- ===== ③ Emergency Trigger (after LLM API failure) ===== -->
+  <rect x="580" y="210" width="170" height="56" rx="6" fill="#fef2f2" stroke="#dc2626" stroke-width="1" stroke-dasharray="4,2"/>
+  <text x="665" y="228" fill="#991b1b" font-size="9" font-weight="700" text-anchor="middle">③ Emergency Trigger</text>
+  <text x="665" y="242" fill="#991b1b" font-size="8" text-anchor="middle">API returns prompt_too_long</text>
+  <text x="665" y="256" fill="#991b1b" font-size="8" text-anchor="middle">→ reactive_compact → retry</text>
+
+  <!-- ===== Loop Back ===== -->
+  <path d="M 710 158 L 760 158 L 760 348 L 90 348 L 90 184" fill="none" stroke="#555" stroke-width="2" marker-end="url(#arrow)" stroke-dasharray="6,3"/>
+  <text x="410" y="366" fill="#64748b" font-size="10" text-anchor="middle">Tool results appended to messages[] → next turn → compress again → LLM</text>
+
+  <!-- ===== Legend ===== -->
+  <rect x="50" y="390" width="720" height="116" rx="6" fill="#f8fafc" stroke="#e2e8f0" stroke-width="1"/>
+
+  <rect x="70" y="404" width="16" height="12" rx="3" fill="#f0f4ff" stroke="#2563eb" stroke-width="1"/>
+  <text x="94" y="414" fill="#334155" font-size="10">s07 Preserved: loop, hooks, skill loading, sub-agents</text>
+
+  <rect x="70" y="426" width="16" height="12" rx="3" fill="#fde68a" stroke="#d97706" stroke-width="1"/>
+  <text x="94" y="436" fill="#334155" font-size="10">① Every Turn Auto: L3→L1→L2 run unconditionally before each LLM call, 0 API calls</text>
+
+  <rect x="70" y="448" width="16" height="12" rx="3" fill="#fed7aa" stroke="#ea580c" stroke-width="1"/>
+  <text x="94" y="458" fill="#334155" font-size="10">② Conditional: after L3/L1/L2, tokens still over threshold → compact_history, 1 API call</text>
+
+  <rect x="70" y="470" width="16" height="12" rx="3" fill="#fef2f2" stroke="#dc2626" stroke-width="1" stroke-dasharray="3,2"/>
+  <text x="94" y="480" fill="#334155" font-size="10">③ Emergency: API returns prompt_too_long → reactive_compact → retry</text>
+
+  <text x="70" y="498" fill="#94a3b8" font-size="9">Three modes with increasing cost: 0 API calls → 1 API call → 1 API call + more aggressive trimming</text>
+</svg>
--- a/s08_context_compact/images/compact-overview.ja.svg
+++ b/s08_context_compact/images/compact-overview.ja.svg
@@ -0,0 +1,138 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 820 520" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#555"/>
+    </marker>
+    <marker id="arrow-blue" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#2563eb"/>
+    </marker>
+    <marker id="arrow-amber" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#d97706"/>
+    </marker>
+    <marker id="arrow-green" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#16a34a"/>
+    </marker>
+    <marker id="arrow-red" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#dc2626"/>
+    </marker>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/>
+      <stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+  </defs>
+
+  <!-- 背景 -->
+  <rect width="820" height="520" fill="#fafbfc" rx="8"/>
+
+  <!-- タイトル -->
+  <rect x="0" y="0" width="820" height="48" fill="url(#header)" rx="8"/>
+  <rect x="0" y="40" width="820" height="8" fill="url(#header)"/>
+  <text x="410" y="31" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">Context Compact — LLM 呼び出し前に圧縮、3 つのトリガーモード</text>
+
+  <!-- ラベル -->
+  <text x="50" y="74" fill="#94a3b8" font-size="11" font-weight="600">s07 保持</text>
+  <text x="180" y="74" fill="#d97706" font-size="11" font-weight="600">s08 新規</text>
+
+  <!-- ===== ① messages[] ===== -->
+  <rect x="40" y="132" width="100" height="52" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="90" y="155" fill="#1e3a5f" font-size="12" font-weight="600" text-anchor="middle">messages[]</text>
+  <text x="90" y="172" fill="#64748b" font-size="9" text-anchor="middle">(s07 保持)</text>
+
+  <!-- messages → パイプライン入口 -->
+  <line x1="140" y1="158" x2="168" y2="158" stroke="#d97706" stroke-width="2" marker-end="url(#arrow-amber)"/>
+
+  <!-- ===== ② 圧縮パイプライン ===== -->
+  <rect x="170" y="82" width="200" height="252" rx="10" fill="#fffbeb" stroke="#d97706" stroke-width="2"/>
+  <text x="270" y="102" fill="#92400e" font-size="11" font-weight="700" text-anchor="middle">圧縮パイプライン</text>
+
+  <!-- ── ① 毎ターン自動 ── -->
+  <rect x="186" y="110" width="168" height="16" rx="3" fill="#fde68a" stroke="#d97706" stroke-width="0.8"/>
+  <text x="270" y="122" fill="#92400e" font-size="8" font-weight="700" text-anchor="middle">① 毎ターン自動 · 無条件 · 0 API</text>
+
+  <rect x="186" y="130" width="168" height="24" rx="4" fill="#fef3c7" stroke="#d97706" stroke-width="1"/>
+  <text x="270" y="146" fill="#92400e" font-size="10" font-weight="600" text-anchor="middle">L3 tool_result_budget</text>
+
+  <rect x="186" y="158" width="168" height="24" rx="4" fill="#fef3c7" stroke="#d97706" stroke-width="1"/>
+  <text x="270" y="174" fill="#92400e" font-size="10" font-weight="600" text-anchor="middle">L1 snip_compact</text>
+
+  <rect x="186" y="186" width="168" height="24" rx="4" fill="#fef3c7" stroke="#d97706" stroke-width="1"/>
+  <text x="270" y="202" fill="#92400e" font-size="10" font-weight="600" text-anchor="middle">L2 micro_compact</text>
+
+  <!-- ↓ → ◇ -->
+  <line x1="270" y1="210" x2="270" y2="222" stroke="#555" stroke-width="1.2" marker-end="url(#arrow)"/>
+
+  <!-- ◇ 判定ダイヤモンド -->
+  <polygon points="270,226 300,244 270,262 240,244" fill="#f0f4ff" stroke="#ea580c" stroke-width="1.5"/>
+  <text x="270" y="247" fill="#9a3412" font-size="7" font-weight="600" text-anchor="middle">閾値超過?</text>
+
+  <!-- いいえ：右側注釈 -->
+  <text x="306" y="240" fill="#16a34a" font-size="9" font-weight="700">No → 通過</text>
+  <text x="306" y="252" fill="#94a3b8" font-size="7">直接 LLM へ</text>
+
+  <!-- はい：下注釈 -->
+  <text x="284" y="260" fill="#ea580c" font-size="8" font-weight="600">Yes↓</text>
+
+  <!-- ── ② 条件トリガー ── -->
+  <rect x="186" y="268" width="168" height="16" rx="3" fill="#fed7aa" stroke="#ea580c" stroke-width="0.8"/>
+  <text x="270" y="280" fill="#9a3412" font-size="8" font-weight="700" text-anchor="middle">② 条件 · トークン閾値超過 · 1 API</text>
+
+  <rect x="186" y="288" width="168" height="24" rx="4" fill="#fed7aa" stroke="#ea580c" stroke-width="1"/>
+  <text x="270" y="304" fill="#9a3412" font-size="10" font-weight="600" text-anchor="middle">L4 compact_history</text>
+
+  <!-- パイプライン出口 → LLM -->
+  <line x1="370" y1="158" x2="438" y2="158" stroke="#2563eb" stroke-width="2" marker-end="url(#arrow-blue)"/>
+
+  <!-- ===== ③ LLM ===== -->
+  <rect x="440" y="132" width="100" height="52" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="490" y="155" fill="#1e3a5f" font-size="14" font-weight="700" text-anchor="middle">LLM</text>
+  <text x="490" y="172" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
+
+  <!-- LLM No → 返却 -->
+  <line x1="490" y1="184" x2="490" y2="278" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow-green)"/>
+  <text x="502" y="262" fill="#16a34a" font-size="10" font-weight="600">No</text>
+  <rect x="435" y="280" width="110" height="26" rx="13" fill="#dcfce7" stroke="#16a34a" stroke-width="1.5"/>
+  <text x="490" y="297" fill="#166534" font-size="11" font-weight="600" text-anchor="middle">結果を返す</text>
+
+  <!-- LLM Yes → TOOL_HANDLERS -->
+  <line x1="540" y1="158" x2="578" y2="158" stroke="#555" stroke-width="2" marker-end="url(#arrow)"/>
+  <text x="554" y="150" fill="#64748b" font-size="10" font-weight="600">Yes</text>
+
+  <!-- ④ TOOL_HANDLERS -->
+  <rect x="580" y="126" width="130" height="64" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="645" y="150" fill="#1e3a5f" font-size="10" font-weight="600" text-anchor="middle">TOOL_HANDLERS</text>
+  <text x="645" y="166" fill="#64748b" font-size="9" text-anchor="middle">bash · read · write</text>
+  <text x="645" y="180" fill="#64748b" font-size="9" text-anchor="middle">task · load_skill · ...</text>
+
+  <!-- LLM API 例外 → 緊急圧縮 → 次ターンで再試行 -->
+  <path d="M 535 184 L 570 216 L 580 228" fill="none" stroke="#dc2626" stroke-width="1.5" stroke-dasharray="4,3" marker-end="url(#arrow-red)"/>
+  <text x="552" y="204" fill="#991b1b" font-size="8" font-weight="600">API 例外</text>
+  <path d="M 665 266 L 665 340 L 160 340 L 160 142 L 186 142" fill="none" stroke="#dc2626" stroke-width="1.5" stroke-dasharray="4,3" marker-end="url(#arrow-red)"/>
+  <text x="530" y="328" fill="#991b1b" font-size="8" font-weight="600">圧縮パイプラインへ再試行</text>
+
+  <!-- ===== ③ 緊急トリガー（LLM API 失敗後） ===== -->
+  <rect x="580" y="210" width="170" height="56" rx="6" fill="#fef2f2" stroke="#dc2626" stroke-width="1" stroke-dasharray="4,2"/>
+  <text x="665" y="228" fill="#991b1b" font-size="9" font-weight="700" text-anchor="middle">③ 緊急トリガー</text>
+  <text x="665" y="242" fill="#991b1b" font-size="8" text-anchor="middle">API が prompt_too_long を返す</text>
+  <text x="665" y="256" fill="#991b1b" font-size="8" text-anchor="middle">→ reactive_compact → リトライ</text>
+
+  <!-- ===== ループバック ===== -->
+  <path d="M 710 158 L 760 158 L 760 348 L 90 348 L 90 184" fill="none" stroke="#555" stroke-width="2" marker-end="url(#arrow)" stroke-dasharray="6,3"/>
+  <text x="410" y="366" fill="#64748b" font-size="10" text-anchor="middle">ツール結果を messages[] に追加 → 次ターン → 再圧縮 → LLM</text>
+
+  <!-- ===== 凡例 ===== -->
+  <rect x="50" y="390" width="720" height="116" rx="6" fill="#f8fafc" stroke="#e2e8f0" stroke-width="1"/>
+
+  <rect x="70" y="404" width="16" height="12" rx="3" fill="#f0f4ff" stroke="#2563eb" stroke-width="1"/>
+  <text x="94" y="414" fill="#334155" font-size="10">s07 保持：ループ、フック、スキルロード、サブエージェント</text>
+
+  <rect x="70" y="426" width="16" height="12" rx="3" fill="#fde68a" stroke="#d97706" stroke-width="1"/>
+  <text x="94" y="436" fill="#334155" font-size="10">① 毎ターン自動：L3→L1→L2 が各 LLM 呼び出し前に無条件実行、0 API</text>
+
+  <rect x="70" y="448" width="16" height="12" rx="3" fill="#fed7aa" stroke="#ea580c" stroke-width="1"/>
+  <text x="94" y="458" fill="#334155" font-size="10">② 条件トリガー：L3/L1/L2 後もトークン超過 → compact_history、1 API</text>
+
+  <rect x="70" y="470" width="16" height="12" rx="3" fill="#fef2f2" stroke="#dc2626" stroke-width="1" stroke-dasharray="3,2"/>
+  <text x="94" y="480" fill="#334155" font-size="10">③ 緊急トリガー：API が prompt_too_long を返す → reactive_compact → リトライ</text>
+
+  <text x="70" y="498" fill="#94a3b8" font-size="9">3 つのモードはコスト増加：0 API → 1 API → 1 API + より積極的なトリム</text>
+</svg>
--- a/s08_context_compact/images/compact-overview.svg
+++ b/s08_context_compact/images/compact-overview.svg
@@ -0,0 +1,138 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 820 520" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#555"/>
+    </marker>
+    <marker id="arrow-blue" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#2563eb"/>
+    </marker>
+    <marker id="arrow-amber" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#d97706"/>
+    </marker>
+    <marker id="arrow-green" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#16a34a"/>
+    </marker>
+    <marker id="arrow-red" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#dc2626"/>
+    </marker>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/>
+      <stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+  </defs>
+
+  <!-- 背景 -->
+  <rect width="820" height="520" fill="#fafbfc" rx="8"/>
+
+  <!-- 标题 -->
+  <rect x="0" y="0" width="820" height="48" fill="url(#header)" rx="8"/>
+  <rect x="0" y="40" width="820" height="8" fill="url(#header)"/>
+  <text x="410" y="31" fill="#fff" font-size="16" font-weight="700" text-anchor="middle">Context Compact — 压缩插在 LLM 调用前，三种触发模式</text>
+
+  <!-- 标签 -->
+  <text x="50" y="74" fill="#94a3b8" font-size="11" font-weight="600">s07 保留</text>
+  <text x="180" y="74" fill="#d97706" font-size="11" font-weight="600">s08 新增</text>
+
+  <!-- ===== ① messages[] ===== -->
+  <rect x="40" y="132" width="100" height="52" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="90" y="155" fill="#1e3a5f" font-size="12" font-weight="600" text-anchor="middle">messages[]</text>
+  <text x="90" y="172" fill="#64748b" font-size="9" text-anchor="middle">(s07 保留)</text>
+
+  <!-- messages → 管线入口 -->
+  <line x1="140" y1="158" x2="168" y2="158" stroke="#d97706" stroke-width="2" marker-end="url(#arrow-amber)"/>
+
+  <!-- ===== ② 压缩管线（内部只放标签，不画路径线） ===== -->
+  <rect x="170" y="82" width="200" height="252" rx="10" fill="#fffbeb" stroke="#d97706" stroke-width="2"/>
+  <text x="270" y="102" fill="#92400e" font-size="11" font-weight="700" text-anchor="middle">压缩管线</text>
+
+  <!-- ── ① 每轮自动 ── -->
+  <rect x="186" y="110" width="168" height="16" rx="3" fill="#fde68a" stroke="#d97706" stroke-width="0.8"/>
+  <text x="270" y="122" fill="#92400e" font-size="8" font-weight="700" text-anchor="middle">① 每轮自动 · 无条件 · 0 API</text>
+
+  <rect x="186" y="130" width="168" height="24" rx="4" fill="#fef3c7" stroke="#d97706" stroke-width="1"/>
+  <text x="270" y="146" fill="#92400e" font-size="10" font-weight="600" text-anchor="middle">L3 tool_result_budget</text>
+
+  <rect x="186" y="158" width="168" height="24" rx="4" fill="#fef3c7" stroke="#d97706" stroke-width="1"/>
+  <text x="270" y="174" fill="#92400e" font-size="10" font-weight="600" text-anchor="middle">L1 snip_compact</text>
+
+  <rect x="186" y="186" width="168" height="24" rx="4" fill="#fef3c7" stroke="#d97706" stroke-width="1"/>
+  <text x="270" y="202" fill="#92400e" font-size="10" font-weight="600" text-anchor="middle">L2 micro_compact</text>
+
+  <!-- ↓ → ◇ -->
+  <line x1="270" y1="210" x2="270" y2="222" stroke="#555" stroke-width="1.2" marker-end="url(#arrow)"/>
+
+  <!-- ◇ 判断菱形（紧凑） -->
+  <polygon points="270,226 300,244 270,262 240,244" fill="#f0f4ff" stroke="#ea580c" stroke-width="1.5"/>
+  <text x="270" y="247" fill="#9a3412" font-size="7" font-weight="600" text-anchor="middle">超阈值?</text>
+
+  <!-- 否：右侧文字标注 -->
+  <text x="306" y="240" fill="#16a34a" font-size="9" font-weight="700">否 → 通过</text>
+  <text x="306" y="252" fill="#94a3b8" font-size="7">直接进 LLM</text>
+
+  <!-- 是：下方文字标注 -->
+  <text x="284" y="260" fill="#ea580c" font-size="8" font-weight="600">是↓</text>
+
+  <!-- ── ② 条件触发 ── -->
+  <rect x="186" y="268" width="168" height="16" rx="3" fill="#fed7aa" stroke="#ea580c" stroke-width="0.8"/>
+  <text x="270" y="280" fill="#9a3412" font-size="8" font-weight="700" text-anchor="middle">② 条件触发 · token 超阈值 · 1 API</text>
+
+  <rect x="186" y="288" width="168" height="24" rx="4" fill="#fed7aa" stroke="#ea580c" stroke-width="1"/>
+  <text x="270" y="304" fill="#9a3412" font-size="10" font-weight="600" text-anchor="middle">L4 compact_history</text>
+
+  <!-- 管线出口 → LLM -->
+  <line x1="370" y1="158" x2="438" y2="158" stroke="#2563eb" stroke-width="2" marker-end="url(#arrow-blue)"/>
+
+  <!-- ===== ③ LLM ===== -->
+  <rect x="440" y="132" width="100" height="52" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="490" y="155" fill="#1e3a5f" font-size="14" font-weight="700" text-anchor="middle">LLM</text>
+  <text x="490" y="172" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
+
+  <!-- LLM 否 → 返回 -->
+  <line x1="490" y1="184" x2="490" y2="278" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow-green)"/>
+  <text x="502" y="262" fill="#16a34a" font-size="10" font-weight="600">否</text>
+  <rect x="435" y="280" width="110" height="26" rx="13" fill="#dcfce7" stroke="#16a34a" stroke-width="1.5"/>
+  <text x="490" y="297" fill="#166534" font-size="11" font-weight="600" text-anchor="middle">返回结果</text>
+
+  <!-- LLM 是 → TOOL_HANDLERS -->
+  <line x1="540" y1="158" x2="578" y2="158" stroke="#555" stroke-width="2" marker-end="url(#arrow)"/>
+  <text x="554" y="150" fill="#64748b" font-size="10" font-weight="600">是</text>
+
+  <!-- ④ TOOL_HANDLERS -->
+  <rect x="580" y="126" width="130" height="64" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="645" y="150" fill="#1e3a5f" font-size="10" font-weight="600" text-anchor="middle">TOOL_HANDLERS</text>
+  <text x="645" y="166" fill="#64748b" font-size="9" text-anchor="middle">bash · read · write</text>
+  <text x="645" y="180" fill="#64748b" font-size="9" text-anchor="middle">task · load_skill · ...</text>
+
+  <!-- LLM API 异常 → 应急压缩 → 下一轮重试 -->
+  <path d="M 535 184 L 570 216 L 580 228" fill="none" stroke="#dc2626" stroke-width="1.5" stroke-dasharray="4,3" marker-end="url(#arrow-red)"/>
+  <text x="552" y="204" fill="#991b1b" font-size="8" font-weight="600">API 异常</text>
+  <path d="M 665 266 L 665 340 L 160 340 L 160 142 L 186 142" fill="none" stroke="#dc2626" stroke-width="1.5" stroke-dasharray="4,3" marker-end="url(#arrow-red)"/>
+  <text x="530" y="328" fill="#991b1b" font-size="8" font-weight="600">重试回到压缩管线</text>
+
+  <!-- ===== ③ 异常触发（LLM API 调用失败后） ===== -->
+  <rect x="580" y="210" width="170" height="56" rx="6" fill="#fef2f2" stroke="#dc2626" stroke-width="1" stroke-dasharray="4,2"/>
+  <text x="665" y="228" fill="#991b1b" font-size="9" font-weight="700" text-anchor="middle">③ 异常触发</text>
+  <text x="665" y="242" fill="#991b1b" font-size="8" text-anchor="middle">API 返回 prompt_too_long</text>
+  <text x="665" y="256" fill="#991b1b" font-size="8" text-anchor="middle">→ reactive_compact → 重试</text>
+
+  <!-- ===== 回环（y=348 在管线框底 y=334 下方，完全不穿过） ===== -->
+  <path d="M 710 158 L 760 158 L 760 348 L 90 348 L 90 184" fill="none" stroke="#555" stroke-width="2" marker-end="url(#arrow)" stroke-dasharray="6,3"/>
+  <text x="410" y="366" fill="#64748b" font-size="10" text-anchor="middle">工具结果追加到 messages[] → 下一轮 → 再次压缩 → LLM</text>
+
+  <!-- ===== 图例 ===== -->
+  <rect x="50" y="390" width="720" height="116" rx="6" fill="#f8fafc" stroke="#e2e8f0" stroke-width="1"/>
+
+  <rect x="70" y="404" width="16" height="12" rx="3" fill="#f0f4ff" stroke="#2563eb" stroke-width="1"/>
+  <text x="94" y="414" fill="#334155" font-size="10">s07 保留：循环、hook、技能加载、子 Agent</text>
+
+  <rect x="70" y="426" width="16" height="12" rx="3" fill="#fde68a" stroke="#d97706" stroke-width="1"/>
+  <text x="94" y="436" fill="#334155" font-size="10">① 每轮自动：L3→L1→L2 在每次 LLM 调用前无条件执行，0 API</text>
+
+  <rect x="70" y="448" width="16" height="12" rx="3" fill="#fed7aa" stroke="#ea580c" stroke-width="1"/>
+  <text x="94" y="458" fill="#334155" font-size="10">② 条件触发：L3/L1/L2 跑完 token 仍超阈值 → compact_history，1 API</text>
+
+  <rect x="70" y="470" width="16" height="12" rx="3" fill="#fef2f2" stroke="#dc2626" stroke-width="1" stroke-dasharray="3,2"/>
+  <text x="94" y="480" fill="#334155" font-size="10">③ 异常触发：API 返回 prompt_too_long → reactive_compact → 重试</text>
+
+  <text x="70" y="498" fill="#94a3b8" font-size="9">三种模式的代价递增：0 API → 1 API → 1 API + 更激进的裁剪</text>
+</svg>
--- a/s08_context_compact/images/compaction-layers.en.svg
+++ b/s08_context_compact/images/compaction-layers.en.svg
@@ -0,0 +1,98 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 760 590" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/><stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+    <linearGradient id="pre" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#dbeafe"/><stop offset="100%" stop-color="#bfdbfe"/>
+    </linearGradient>
+    <linearGradient id="auto" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#fecaca"/><stop offset="100%" stop-color="#fca5a5"/>
+    </linearGradient>
+    <linearGradient id="emergency" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#fed7aa"/><stop offset="100%" stop-color="#fdba74"/>
+    </linearGradient>
+    <marker id="arrow-d" viewBox="0 0 10 10" refX="5" refY="10" markerWidth="6" markerHeight="6" orient="auto">
+      <path d="M 0 0 L 5 10 L 10 0 z" fill="#94a3b8"/>
+    </marker>
+  </defs>
+
+  <rect width="760" height="590" fill="#fafbfc" rx="8"/>
+
+  <!-- Title bar -->
+  <rect x="0" y="0" width="760" height="44" fill="url(#header)" rx="8"/>
+  <rect x="0" y="36" width="760" height="8" fill="url(#header)"/>
+  <text x="380" y="28" fill="#fff" font-size="15" font-weight="700" text-anchor="middle">Context Compaction — Pre-processing Pipeline + Auto-compact + Emergency Fallback</text>
+
+  <!-- Design principles (left) -->
+  <rect x="20" y="62" width="220" height="80" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="130" y="82" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">Design Principles</text>
+  <text x="130" y="100" fill="#475569" font-size="10" text-anchor="middle">Cheap operations first, expensive later</text>
+  <text x="130" y="116" fill="#475569" font-size="10" text-anchor="middle">Trim text before dropping messages</text>
+  <text x="130" y="132" fill="#475569" font-size="10" text-anchor="middle">Drop messages before calling LLM</text>
+
+  <!-- Cost escalation (right) -->
+  <rect x="530" y="62" width="210" height="80" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="635" y="82" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">Increasing Cost</text>
+  <text x="635" y="104" fill="#475569" font-size="10" text-anchor="middle">Text ops → LLM summary → Emergency trim</text>
+  <text x="635" y="124" fill="#94a3b8" font-size="9" text-anchor="middle">0 API · 0 API · 0 API · 1 API · 1 API</text>
+
+  <!-- ===== Pre-processing pipeline title ===== -->
+  <rect x="20" y="146" width="720" height="24" rx="4" fill="#f1f5f9"/>
+  <text x="55" y="163" fill="#64748b" font-size="11" font-weight="600">Pre-processing Pipeline (execution order: L3 → L1 → L2, before every LLM call, 0 API)</text>
+
+  <!-- L3: toolResultBudget -->
+  <rect x="80" y="180" width="600" height="46" rx="7" fill="url(#pre)" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="100" y="200" fill="#1e40af" font-size="12" font-weight="600">L3</text>
+  <text x="135" y="200" fill="#1e40af" font-size="13" font-weight="700">toolResultBudget</text>
+  <text x="260" y="200" fill="#1e40af" font-size="11">tool_result total &gt; 200KB → persist largest to disk</text>
+  <text x="650" y="200" fill="#1e40af" font-size="10" text-anchor="end">keep full content</text>
+  <text x="135" y="218" fill="#2563eb" font-size="9">Trigger: every turn, before microCompact can replace full content</text>
+
+  <!-- Arrow L3→L1 -->
+  <line x1="380" y1="226" x2="380" y2="238" stroke="#94a3b8" stroke-width="1" marker-end="url(#arrow-d)"/>
+
+  <!-- L1: snipCompact -->
+  <rect x="80" y="240" width="600" height="46" rx="7" fill="url(#pre)" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="100" y="260" fill="#1e40af" font-size="12" font-weight="600">L1</text>
+  <text x="135" y="260" fill="#1e40af" font-size="13" font-weight="700">snipCompact</text>
+  <text x="260" y="260" fill="#1e40af" font-size="11">messages &gt; 50 → trim middle</text>
+  <text x="650" y="260" fill="#1e40af" font-size="10" text-anchor="end">keep head/tail</text>
+  <text x="135" y="278" fill="#2563eb" font-size="9">Trigger: message count exceeds threshold</text>
+
+  <!-- Arrow L1→L2 -->
+  <line x1="380" y1="286" x2="380" y2="298" stroke="#94a3b8" stroke-width="1" marker-end="url(#arrow-d)"/>
+
+  <!-- L2: microCompact -->
+  <rect x="80" y="300" width="600" height="46" rx="7" fill="url(#pre)" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="100" y="320" fill="#1e40af" font-size="12" font-weight="600">L2</text>
+  <text x="135" y="320" fill="#1e40af" font-size="13" font-weight="700">microCompact</text>
+  <text x="260" y="320" fill="#1e40af" font-size="11">old tool_result → placeholder (keep latest 3)</text>
+  <text x="650" y="320" fill="#1e40af" font-size="10" text-anchor="end">compact old</text>
+  <text x="135" y="338" fill="#2563eb" font-size="9">Trigger: automatic every turn; the tutorial version simulates it with a text placeholder</text>
+
+  <!-- ===== Auto-compact title ===== -->
+  <rect x="20" y="358" width="720" height="24" rx="4" fill="#f1f5f9"/>
+  <text x="70" y="375" fill="#64748b" font-size="11" font-weight="600">Auto-compact Decision (triggered when pre-processing is insufficient, 1 API call)</text>
+
+  <!-- L4: autoCompact -->
+  <rect x="80" y="390" width="600" height="58" rx="7" fill="url(#auto)" stroke="#dc2626" stroke-width="2"/>
+  <text x="100" y="412" fill="#991b1b" font-size="12" font-weight="600">L4</text>
+  <text x="135" y="412" fill="#991b1b" font-size="13" font-weight="700">autoCompact</text>
+  <text x="260" y="412" fill="#991b1b" font-size="11">tokens over threshold → LLM summary</text>
+  <text x="650" y="412" fill="#991b1b" font-size="10" text-anchor="end">1 API call</text>
+  <text x="135" y="428" fill="#dc2626" font-size="9">Threshold: contextWindow - maxOutputTokens - 13,000 · Try sessionMemoryCompact first, then LLM</text>
+  <text x="135" y="442" fill="#dc2626" font-size="9">Circuit breaker: stop retrying after 3 consecutive failures</text>
+
+  <!-- ===== Emergency fallback title ===== -->
+  <rect x="20" y="460" width="720" height="24" rx="4" fill="#f1f5f9"/>
+  <text x="55" y="477" fill="#64748b" font-size="11" font-weight="600">Emergency Fallback (triggered when API still returns prompt_too_long)</text>
+
+  <!-- Emergency: reactiveCompact -->
+  <rect x="80" y="492" width="600" height="62" rx="7" fill="url(#emergency)" stroke="#c2410c" stroke-width="1.5"/>
+  <text x="100" y="512" fill="#9a3412" font-size="12" font-weight="600">Emrg</text>
+  <text x="135" y="512" fill="#9a3412" font-size="13" font-weight="700">reactiveCompact</text>
+  <text x="135" y="528" fill="#9a3412" font-size="10">API returns 413 / prompt_too_long → byte-level trim</text>
+  <text x="135" y="544" fill="#c2410c" font-size="9">Keep last 5 messages + summary; more aggressive than autoCompact</text>
+
+</svg>
--- a/s08_context_compact/images/compaction-layers.ja.svg
+++ b/s08_context_compact/images/compaction-layers.ja.svg
@@ -0,0 +1,98 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 760 590" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/><stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+    <linearGradient id="pre" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#dbeafe"/><stop offset="100%" stop-color="#bfdbfe"/>
+    </linearGradient>
+    <linearGradient id="auto" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#fecaca"/><stop offset="100%" stop-color="#fca5a5"/>
+    </linearGradient>
+    <linearGradient id="emergency" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#fed7aa"/><stop offset="100%" stop-color="#fdba74"/>
+    </linearGradient>
+    <marker id="arrow-d" viewBox="0 0 10 10" refX="5" refY="10" markerWidth="6" markerHeight="6" orient="auto">
+      <path d="M 0 0 L 5 10 L 10 0 z" fill="#94a3b8"/>
+    </marker>
+  </defs>
+
+  <rect width="760" height="590" fill="#fafbfc" rx="8"/>
+
+  <!-- タイトルバー -->
+  <rect x="0" y="0" width="760" height="44" fill="url(#header)" rx="8"/>
+  <rect x="0" y="36" width="760" height="8" fill="url(#header)"/>
+  <text x="380" y="28" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">コンテキスト圧縮 — 前処理パイプライン + 自動圧縮 + 緊急フォールバック</text>
+
+  <!-- 設計原則（左側） -->
+  <rect x="20" y="62" width="220" height="80" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="130" y="82" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">設計原則</text>
+  <text x="130" y="100" fill="#475569" font-size="10" text-anchor="middle">安価な処理を先に、高価な処理を後に</text>
+  <text x="130" y="116" fill="#475569" font-size="10" text-anchor="middle">テキスト修正 → メッセージ削除の順</text>
+  <text x="130" y="132" fill="#475569" font-size="10" text-anchor="middle">メッセージ削除 → LLM 呼び出しの順</text>
+
+  <!-- コスト増加（右側） -->
+  <rect x="530" y="62" width="210" height="80" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="635" y="82" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">コスト増加</text>
+  <text x="635" y="104" fill="#475569" font-size="10" text-anchor="middle">テキスト操作 → LLM 要約 → 緊急トリム</text>
+  <text x="635" y="124" fill="#94a3b8" font-size="9" text-anchor="middle">0 API · 0 API · 0 API · 1 API · 1 API</text>
+
+  <!-- ===== 前処理パイプラインタイトル ===== -->
+  <rect x="20" y="146" width="720" height="24" rx="4" fill="#f1f5f9"/>
+  <text x="55" y="163" fill="#64748b" font-size="11" font-weight="600">前処理パイプライン（実行順：L3 → L1 → L2、各 LLM 呼び出し前に自動実行、0 API）</text>
+
+  <!-- L3: toolResultBudget -->
+  <rect x="80" y="180" width="600" height="46" rx="7" fill="url(#pre)" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="100" y="200" fill="#1e40af" font-size="12" font-weight="600">L3</text>
+  <text x="135" y="200" fill="#1e40af" font-size="13" font-weight="700">toolResultBudget</text>
+  <text x="260" y="200" fill="#1e40af" font-size="11">tool_result 合計 &gt; 200KB → 最大項目を退避</text>
+  <text x="650" y="200" fill="#1e40af" font-size="10" text-anchor="end">完全内容を保持</text>
+  <text x="135" y="218" fill="#2563eb" font-size="9">トリガー：毎ターン、microCompact が完全内容を置換する前に実行</text>
+
+  <!-- 矢印 L3→L1 -->
+  <line x1="380" y1="226" x2="380" y2="238" stroke="#94a3b8" stroke-width="1" marker-end="url(#arrow-d)"/>
+
+  <!-- L1: snipCompact -->
+  <rect x="80" y="240" width="600" height="46" rx="7" fill="url(#pre)" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="100" y="260" fill="#1e40af" font-size="12" font-weight="600">L1</text>
+  <text x="135" y="260" fill="#1e40af" font-size="13" font-weight="700">snipCompact</text>
+  <text x="260" y="260" fill="#1e40af" font-size="11">メッセージ &gt; 50 → 中間をトリム</text>
+  <text x="650" y="260" fill="#1e40af" font-size="10" text-anchor="end">先頭/末尾保持</text>
+  <text x="135" y="278" fill="#2563eb" font-size="9">トリガー：メッセージ数が閾値を超過</text>
+
+  <!-- 矢印 L1→L2 -->
+  <line x1="380" y1="286" x2="380" y2="298" stroke="#94a3b8" stroke-width="1" marker-end="url(#arrow-d)"/>
+
+  <!-- L2: microCompact -->
+  <rect x="80" y="300" width="600" height="46" rx="7" fill="url(#pre)" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="100" y="320" fill="#1e40af" font-size="12" font-weight="600">L2</text>
+  <text x="135" y="320" fill="#1e40af" font-size="13" font-weight="700">microCompact</text>
+  <text x="260" y="320" fill="#1e40af" font-size="11">古い tool_result → プレースホルダー（最新 3 件保持）</text>
+  <text x="650" y="320" fill="#1e40af" font-size="10" text-anchor="end">旧結果を圧縮</text>
+  <text x="135" y="338" fill="#2563eb" font-size="9">トリガー：毎ターン自動実行、チュートリアル版はテキストプレースホルダーで模擬</text>
+
+  <!-- ===== 自動圧縮タイトル ===== -->
+  <rect x="20" y="358" width="720" height="24" rx="4" fill="#f1f5f9"/>
+  <text x="70" y="375" fill="#64748b" font-size="11" font-weight="600">自動圧縮判定（前処理で不足時にトリガー、1 API 呼び出し）</text>
+
+  <!-- L4: autoCompact -->
+  <rect x="80" y="390" width="600" height="58" rx="7" fill="url(#auto)" stroke="#dc2626" stroke-width="2"/>
+  <text x="100" y="412" fill="#991b1b" font-size="12" font-weight="600">L4</text>
+  <text x="135" y="412" fill="#991b1b" font-size="13" font-weight="700">autoCompact</text>
+  <text x="260" y="412" fill="#991b1b" font-size="11">トークンが閾値超過 → LLM 全量要約</text>
+  <text x="590" y="412" fill="#991b1b" font-size="10" text-anchor="end">1 API 呼び出し</text>
+  <text x="135" y="428" fill="#dc2626" font-size="9">閾値: contextWindow - maxOutputTokens - 13,000 · sessionMemoryCompact を先に試行、不足時のみ LLM 呼び出し</text>
+  <text x="135" y="442" fill="#dc2626" font-size="9">サーキットブレーカー：連続 3 回失敗後にリトライ停止</text>
+
+  <!-- ===== 緊急フォールバックタイトル ===== -->
+  <rect x="20" y="460" width="720" height="24" rx="4" fill="#f1f5f9"/>
+  <text x="55" y="477" fill="#64748b" font-size="11" font-weight="600">緊急フォールバック（API が引き続き prompt_too_long を返す場合にトリガー）</text>
+
+  <!-- 緊急: reactiveCompact -->
+  <rect x="80" y="492" width="600" height="62" rx="7" fill="url(#emergency)" stroke="#c2410c" stroke-width="1.5"/>
+  <text x="100" y="512" fill="#9a3412" font-size="12" font-weight="600">緊急</text>
+  <text x="135" y="512" fill="#9a3412" font-size="13" font-weight="700">reactiveCompact</text>
+  <text x="135" y="528" fill="#9a3412" font-size="10">API が 413 / prompt_too_long を返す → バイト単位でトリム</text>
+  <text x="135" y="544" fill="#c2410c" font-size="9">最後の 5 件 + 要約を保持、autoCompact より積極的</text>
+
+</svg>
--- a/s08_context_compact/images/compaction-layers.svg
+++ b/s08_context_compact/images/compaction-layers.svg
@@ -0,0 +1,98 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 760 590" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/><stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+    <linearGradient id="pre" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#dbeafe"/><stop offset="100%" stop-color="#bfdbfe"/>
+    </linearGradient>
+    <linearGradient id="auto" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#fecaca"/><stop offset="100%" stop-color="#fca5a5"/>
+    </linearGradient>
+    <linearGradient id="emergency" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#fed7aa"/><stop offset="100%" stop-color="#fdba74"/>
+    </linearGradient>
+    <marker id="arrow-d" viewBox="0 0 10 10" refX="5" refY="10" markerWidth="6" markerHeight="6" orient="auto">
+      <path d="M 0 0 L 5 10 L 10 0 z" fill="#94a3b8"/>
+    </marker>
+  </defs>
+
+  <rect width="760" height="590" fill="#fafbfc" rx="8"/>
+
+  <!-- 标题栏 -->
+  <rect x="0" y="0" width="760" height="44" fill="url(#header)" rx="8"/>
+  <rect x="0" y="36" width="760" height="8" fill="url(#header)"/>
+  <text x="380" y="28" fill="#fff" font-size="15" font-weight="700" text-anchor="middle">上下文压缩 — 预处理管线 + 自动压缩 + 应急兜底</text>
+
+  <!-- 左侧说明 -->
+  <rect x="20" y="62" width="220" height="80" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="130" y="82" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">设计原则</text>
+  <text x="130" y="100" fill="#475569" font-size="10" text-anchor="middle">便宜的先跑，贵的后跑</text>
+  <text x="130" y="116" fill="#475569" font-size="10" text-anchor="middle">能改文本 → 不删整条</text>
+  <text x="130" y="132" fill="#475569" font-size="10" text-anchor="middle">能删整条 → 不调 LLM</text>
+
+  <!-- 右侧代价箭头 -->
+  <rect x="530" y="62" width="210" height="80" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="635" y="82" fill="#1e3a5f" font-size="12" font-weight="700" text-anchor="middle">代价递增</text>
+  <text x="635" y="104" fill="#475569" font-size="10" text-anchor="middle">文本操作 → LLM 摘要 → 应急裁剪</text>
+  <text x="635" y="124" fill="#94a3b8" font-size="9" text-anchor="middle">0 API · 0 API · 0 API · 1 API · 1 API</text>
+
+  <!-- ===== 预处理管线标题 ===== -->
+  <rect x="20" y="146" width="720" height="24" rx="4" fill="#f1f5f9"/>
+  <text x="55" y="163" fill="#64748b" font-size="11" font-weight="600">预处理管线（执行顺序：L3 → L1 → L2，每轮 LLM 调用前自动执行，0 API）</text>
+
+  <!-- L3: toolResultBudget -->
+  <rect x="80" y="180" width="600" height="46" rx="7" fill="url(#pre)" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="100" y="200" fill="#1e40af" font-size="12" font-weight="600">L3</text>
+  <text x="135" y="200" fill="#1e40af" font-size="13" font-weight="700">toolResultBudget</text>
+  <text x="260" y="200" fill="#1e40af" font-size="11">tool_result 总和 &gt; 200KB → 最大项落盘</text>
+  <text x="650" y="200" fill="#1e40af" font-size="10" text-anchor="end">保留完整内容</text>
+  <text x="135" y="218" fill="#2563eb" font-size="9">触发：每轮自动，必须在 microCompact 之前保留完整内容</text>
+
+  <!-- 箭头 L3→L1 -->
+  <line x1="380" y1="226" x2="380" y2="238" stroke="#94a3b8" stroke-width="1" marker-end="url(#arrow-d)"/>
+
+  <!-- L1: snipCompact -->
+  <rect x="80" y="240" width="600" height="46" rx="7" fill="url(#pre)" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="100" y="260" fill="#1e40af" font-size="12" font-weight="600">L1</text>
+  <text x="135" y="260" fill="#1e40af" font-size="13" font-weight="700">snipCompact</text>
+  <text x="260" y="260" fill="#1e40af" font-size="11">消息 &gt; 50 条 → 裁掉中间</text>
+  <text x="650" y="260" fill="#1e40af" font-size="10" text-anchor="end">保留头尾</text>
+  <text x="135" y="278" fill="#2563eb" font-size="9">触发：消息数超过阈值</text>
+
+  <!-- 箭头 L1→L2 -->
+  <line x1="380" y1="286" x2="380" y2="298" stroke="#94a3b8" stroke-width="1" marker-end="url(#arrow-d)"/>
+
+  <!-- L2: microCompact -->
+  <rect x="80" y="300" width="600" height="46" rx="7" fill="url(#pre)" stroke="#2563eb" stroke-width="1.5"/>
+  <text x="100" y="320" fill="#1e40af" font-size="12" font-weight="600">L2</text>
+  <text x="135" y="320" fill="#1e40af" font-size="13" font-weight="700">microCompact</text>
+  <text x="260" y="320" fill="#1e40af" font-size="11">旧 tool_result → 占位符（保留最近 3 条）</text>
+  <text x="650" y="320" fill="#1e40af" font-size="10" text-anchor="end">压旧结果</text>
+  <text x="135" y="338" fill="#2563eb" font-size="9">触发：每轮自动，教学版用文本占位符模拟</text>
+
+  <!-- ===== 自动压缩标题 ===== -->
+  <rect x="20" y="358" width="720" height="24" rx="4" fill="#f1f5f9"/>
+  <text x="70" y="375" fill="#64748b" font-size="11" font-weight="600">自动压缩决策（预处理不够时触发，1 API 调用）</text>
+
+  <!-- L4: autoCompact -->
+  <rect x="80" y="390" width="600" height="58" rx="7" fill="url(#auto)" stroke="#dc2626" stroke-width="2"/>
+  <text x="100" y="412" fill="#991b1b" font-size="12" font-weight="600">L4</text>
+  <text x="135" y="412" fill="#991b1b" font-size="13" font-weight="700">autoCompact</text>
+  <text x="260" y="412" fill="#991b1b" font-size="11">token 超阈值 → LLM 全量摘要</text>
+  <text x="590" y="412" fill="#991b1b" font-size="10" text-anchor="end">1 API 调用</text>
+  <text x="135" y="428" fill="#dc2626" font-size="9">阈值: contextWindow - maxOutputTokens - 13,000 · 先尝试 sessionMemoryCompact，不够才调 LLM</text>
+  <text x="135" y="442" fill="#dc2626" font-size="9">熔断：连续失败 3 次后停止重试</text>
+
+  <!-- ===== 应急兜底标题 ===== -->
+  <rect x="20" y="460" width="720" height="24" rx="4" fill="#f1f5f9"/>
+  <text x="55" y="477" fill="#64748b" font-size="11" font-weight="600">应急兜底（API 仍然返回 prompt_too_long 时触发）</text>
+
+  <!-- 应急: reactiveCompact -->
+  <rect x="80" y="492" width="600" height="62" rx="7" fill="url(#emergency)" stroke="#c2410c" stroke-width="1.5"/>
+  <text x="100" y="512" fill="#9a3412" font-size="12" font-weight="600">应急</text>
+  <text x="135" y="512" fill="#9a3412" font-size="13" font-weight="700">reactiveCompact</text>
+  <text x="135" y="528" fill="#9a3412" font-size="10">API 返回 413 / prompt_too_long → 字节级裁剪</text>
+  <text x="135" y="544" fill="#c2410c" font-size="9">保留最后 5 条 + 摘要，比 autoCompact 更激进</text>
+
+</svg>
--- a/s08_context_compact/images/layer1-budget.en.svg
+++ b/s08_context_compact/images/layer1-budget.en.svg
@@ -0,0 +1,50 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 356" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/><stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#16a34a"/>
+    </marker>
+  </defs>
+
+  <rect width="720" height="356" fill="#fafbfc" rx="8"/>
+  <rect x="0" y="0" width="720" height="38" fill="url(#header)" rx="8"/>
+  <rect x="0" y="30" width="720" height="8" fill="url(#header)"/>
+  <text x="360" y="25" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">L3: toolResultBudget — Large Result Persistence</text>
+
+  <!-- Pain Point -->
+  <rect x="20" y="54" width="680" height="42" rx="6" fill="#fef2f2" stroke="#fca5a5" stroke-width="1"/>
+  <text x="35" y="72" fill="#991b1b" font-size="11" font-weight="600">Pain Point</text>
+  <text x="105" y="72" fill="#991b1b" font-size="11">The model reads 30 files in one turn; tool_result totals ~500KB and fills the context window</text>
+
+  <!-- Before -->
+  <text x="155" y="118" fill="#64748b" font-size="12" font-weight="600" text-anchor="middle">Before</text>
+  <rect x="20" y="128" width="270" height="82" rx="6" fill="#fff" stroke="#94a3b8" stroke-width="1"/>
+  <text x="35" y="148" fill="#475569" font-size="10" font-family="monospace">tool_result: (78KB)  ...</text>
+  <text x="35" y="164" fill="#475569" font-size="10" font-family="monospace">tool_result: (142KB) ...</text>
+  <text x="35" y="180" fill="#475569" font-size="10" font-family="monospace">tool_result: (290KB) ...</text>
+  <text x="155" y="202" fill="#ef4444" font-size="9" font-weight="600" text-anchor="middle">Total 510KB → over budget</text>
+
+  <!-- Arrow -->
+  <line x1="295" y1="163" x2="360" y2="163" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <!-- After -->
+  <text x="485" y="118" fill="#16a34a" font-size="12" font-weight="600" text-anchor="middle">After</text>
+  <rect x="365" y="128" width="335" height="82" rx="6" fill="#f0fdf4" stroke="#16a34a" stroke-width="1"/>
+  <text x="380" y="148" fill="#166534" font-size="10" font-family="monospace">tool_result: &lt;persisted-output&gt;</text>
+  <text x="395" y="164" fill="#166534" font-size="9">Full output: .task_outputs/t1.txt</text>
+  <text x="395" y="178" fill="#166534" font-size="9">Preview: (first 2000 chars) ...</text>
+  <text x="532" y="202" fill="#16a34a" font-size="9" font-weight="600" text-anchor="middle">Total 18KB → normal</text>
+
+  <!-- How it works -->
+  <rect x="20" y="214" width="680" height="64" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="35" y="234" fill="#1e3a5f" font-size="11" font-weight="600">How</text>
+  <text x="70" y="234" fill="#475569" font-size="10">1. Sum the sizes of all tool_result blocks in the latest turn</text>
+  <text x="70" y="250" fill="#475569" font-size="10">2. Over 200KB → sort by size, persist the largest to .task_outputs/tool-results/</text>
+  <text x="70" y="266" fill="#475569" font-size="10">3. Keep only the &lt;persisted-output&gt; marker + a preview of the first 2000 chars in context</text>
+
+  <!-- Result summary -->
+  <rect x="20" y="290" width="680" height="36" rx="6" fill="#f0fdf4" stroke="#16a34a" stroke-width="1"/>
+  <text x="35" y="312" fill="#166534" font-size="11">Result: No data lost (full data on disk), context drops from 510KB to ~18KB, 0 API calls</text>
+</svg>
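Reviewer sketch (not part of the patch): the three "How" steps in layer1-budget.en.svg could look roughly like the Python below. The 200KB budget, the 2000-char preview and the `<persisted-output>` marker come from the diagram. The function name `apply_tool_result_budget`, the result dict shape (`tool_use_id`, `content`), the closing tag, and the choice to stop persisting once the turn is back under budget are assumptions made for illustration, not the chapter's actual code.py.

```python
from pathlib import Path

BUDGET_BYTES = 200 * 1024   # per-turn budget shown in the diagram
PREVIEW_CHARS = 2000        # preview length shown in the diagram


def apply_tool_result_budget(results, out_dir, budget=BUDGET_BYTES,
                             preview_chars=PREVIEW_CHARS):
    """Return a copy of `results` with oversized outputs persisted to disk.

    `results` is a list of dicts with hypothetical keys 'tool_use_id' and
    'content' (a string). The input list is not mutated.
    """
    out_dir = Path(out_dir)
    # Step 1: sum the size of every tool_result in the turn.
    sizes = {r["tool_use_id"]: len(r["content"].encode("utf-8")) for r in results}
    total = sum(sizes.values())
    if total <= budget:
        return list(results)

    # Step 2: walk results largest-first and persist until under budget
    # (stopping early is an assumption; the diagram does not say when to stop).
    stubs = {}
    for r in sorted(results, key=lambda r: sizes[r["tool_use_id"]], reverse=True):
        if total <= budget:
            break
        out_dir.mkdir(parents=True, exist_ok=True)
        path = out_dir / f"{r['tool_use_id']}.txt"
        path.write_text(r["content"], encoding="utf-8")
        # Step 3: keep only a marker, the file path, and a short preview.
        stub = (
            "<persisted-output>\n"
            f"Full output: {path}\n"
            f"Preview: {r['content'][:preview_chars]}\n"
            "</persisted-output>"  # closing tag is an assumption
        )
        total -= sizes[r["tool_use_id"]] - len(stub.encode("utf-8"))
        stubs[r["tool_use_id"]] = stub

    return [{**r, "content": stubs.get(r["tool_use_id"], r["content"])}
            for r in results]
```

With the diagram's 78KB / 142KB / 290KB example, this sketch persists the two largest results and leaves the 78KB one inline. The full text stays readable from disk and no API call is involved.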
--- a/s08_context_compact/images/layer1-budget.ja.svg
+++ b/s08_context_compact/images/layer1-budget.ja.svg
@@ -0,0 +1,50 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 356" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/><stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#16a34a"/>
+    </marker>
+  </defs>
+
+  <rect width="720" height="356" fill="#fafbfc" rx="8"/>
+  <rect x="0" y="0" width="720" height="38" fill="url(#header)" rx="8"/>
+  <rect x="0" y="30" width="720" height="8" fill="url(#header)"/>
+  <text x="360" y="25" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">L3: toolResultBudget — 大結果の永続化</text>
+
+  <!-- ペインポイント -->
+  <rect x="20" y="54" width="680" height="42" rx="6" fill="#fef2f2" stroke="#fca5a5" stroke-width="1"/>
+  <text x="35" y="72" fill="#991b1b" font-size="11" font-weight="600">ペインポイント</text>
+  <text x="100" y="72" fill="#991b1b" font-size="11">モデルが一度に 30 ファイルを読み込み、単一ターンの tool_result が合計 500KB に達し、コンテキストウィンドウを圧迫</text>
+
+  <!-- 圧縮前 -->
+  <text x="155" y="118" fill="#64748b" font-size="12" font-weight="600" text-anchor="middle">圧縮前</text>
+  <rect x="20" y="128" width="270" height="82" rx="6" fill="#fff" stroke="#94a3b8" stroke-width="1"/>
+  <text x="35" y="148" fill="#475569" font-size="10" font-family="monospace">tool_result: (78KB)  ...</text>
+  <text x="35" y="164" fill="#475569" font-size="10" font-family="monospace">tool_result: (142KB) ...</text>
+  <text x="35" y="180" fill="#475569" font-size="10" font-family="monospace">tool_result: (290KB) ...</text>
+  <text x="155" y="202" fill="#ef4444" font-size="9" font-weight="600" text-anchor="middle">合計 510KB → 予算超過</text>
+
+  <!-- 矢印 -->
+  <line x1="295" y1="163" x2="360" y2="163" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <!-- 圧縮後 -->
+  <text x="485" y="118" fill="#16a34a" font-size="12" font-weight="600" text-anchor="middle">圧縮後</text>
+  <rect x="365" y="128" width="335" height="82" rx="6" fill="#f0fdf4" stroke="#16a34a" stroke-width="1"/>
+  <text x="380" y="148" fill="#166534" font-size="10" font-family="monospace">tool_result: &lt;persisted-output&gt;</text>
+  <text x="395" y="164" fill="#166534" font-size="9">Full output: .task_outputs/tool-results/t1.txt</text>
+  <text x="395" y="178" fill="#166534" font-size="9">Preview: (先頭 2000 文字) ...</text>
+  <text x="532" y="202" fill="#16a34a" font-size="9" font-weight="600" text-anchor="middle">合計 18KB → 正常</text>
+
+  <!-- 原理説明 -->
+  <rect x="20" y="214" width="680" height="64" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="35" y="234" fill="#1e3a5f" font-size="11" font-weight="600">方法</text>
+  <text x="70" y="234" fill="#475569" font-size="10">1. 最終ターンの全 tool_result の合計サイズを集計</text>
+  <text x="70" y="250" fill="#475569" font-size="10">2. 200KB 超過 → サイズ順にソートし、最大のものから .task_outputs/tool-results/ に永続化</text>
+  <text x="70" y="266" fill="#475569" font-size="10">3. コンテキストには &lt;persisted-output&gt; マーカー + 先頭 2000 文字のプレビューのみ残す</text>
+
+  <!-- 変更サマリー -->
+  <rect x="20" y="290" width="680" height="36" rx="6" fill="#f0fdf4" stroke="#16a34a" stroke-width="1"/>
+  <text x="35" y="312" fill="#166534" font-size="11">結果：情報は失われていない（ディスクに完全なデータあり）、コンテキストは 510KB → ~18KB に削減、0 回 API 呼び出し</text>
+</svg>
--- a/s08_context_compact/images/layer1-budget.svg
+++ b/s08_context_compact/images/layer1-budget.svg
@@ -0,0 +1,50 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 356" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/><stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#16a34a"/>
+    </marker>
+  </defs>
+
+  <rect width="720" height="356" fill="#fafbfc" rx="8"/>
+  <rect x="0" y="0" width="720" height="38" fill="url(#header)" rx="8"/>
+  <rect x="0" y="30" width="720" height="8" fill="url(#header)"/>
+  <text x="360" y="25" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">L3: toolResultBudget — 大结果落盘</text>
+
+  <!-- 痛点 -->
+  <rect x="20" y="54" width="680" height="42" rx="6" fill="#fef2f2" stroke="#fca5a5" stroke-width="1"/>
+  <text x="35" y="72" fill="#991b1b" font-size="11" font-weight="600">痛点</text>
+  <text x="75" y="72" fill="#991b1b" font-size="11">模型一次读了 30 个文件，单轮 tool_result 加起来 500KB，直接把上下文窗口打满</text>
+
+  <!-- Before -->
+  <text x="155" y="118" fill="#64748b" font-size="12" font-weight="600" text-anchor="middle">压缩前</text>
+  <rect x="20" y="128" width="270" height="82" rx="6" fill="#fff" stroke="#94a3b8" stroke-width="1"/>
+  <text x="35" y="148" fill="#475569" font-size="10" font-family="monospace">tool_result: (78KB)  ...</text>
+  <text x="35" y="164" fill="#475569" font-size="10" font-family="monospace">tool_result: (142KB) ...</text>
+  <text x="35" y="180" fill="#475569" font-size="10" font-family="monospace">tool_result: (290KB) ...</text>
+  <text x="155" y="202" fill="#ef4444" font-size="9" font-weight="600" text-anchor="middle">合计 510KB → 超预算</text>
+
+  <!-- Arrow -->
+  <line x1="295" y1="163" x2="360" y2="163" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <!-- After -->
+  <text x="485" y="118" fill="#16a34a" font-size="12" font-weight="600" text-anchor="middle">压缩后</text>
+  <rect x="365" y="128" width="335" height="82" rx="6" fill="#f0fdf4" stroke="#16a34a" stroke-width="1"/>
+  <text x="380" y="148" fill="#166534" font-size="10" font-family="monospace">tool_result: &lt;persisted-output&gt;</text>
+  <text x="395" y="164" fill="#166534" font-size="9">Full output: .task_outputs/tool-results/t1.txt</text>
+  <text x="395" y="178" fill="#166534" font-size="9">Preview: (前 2000 字符) ...</text>
+  <text x="532" y="202" fill="#16a34a" font-size="9" font-weight="600" text-anchor="middle">合计 18KB → 正常</text>
+
+  <!-- 原理说明 -->
+  <rect x="20" y="214" width="680" height="64" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="35" y="234" fill="#1e3a5f" font-size="11" font-weight="600">怎么做</text>
+  <text x="85" y="234" fill="#475569" font-size="10">1. 统计最后一轮所有 tool_result 的总大小</text>
+  <text x="85" y="250" fill="#475569" font-size="10">2. 超过 200KB → 按大小排序，从最大的开始落盘到 .task_outputs/tool-results/</text>
+  <text x="85" y="266" fill="#475569" font-size="10">3. 上下文里只留 &lt;persisted-output&gt; 标记 + 前 2000 字符预览</text>
+
+  <!-- 变化摘要 -->
+  <rect x="20" y="290" width="680" height="36" rx="6" fill="#f0fdf4" stroke="#16a34a" stroke-width="1"/>
+  <text x="35" y="312" fill="#166534" font-size="11">结果：信息没丢（磁盘有完整数据），上下文从 510KB 降到 ~18KB，0 次 API 调用</text>
+</svg>
--- a/s08_context_compact/images/micro-compact.en.svg
+++ b/s08_context_compact/images/micro-compact.en.svg
@@ -0,0 +1,57 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 300" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/><stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#ca8a04"/>
+    </marker>
+  </defs>
+
+  <rect width="720" height="300" fill="#fafbfc" rx="8"/>
+  <rect x="0" y="0" width="720" height="38" fill="url(#header)" rx="8"/>
+  <rect x="0" y="30" width="720" height="8" fill="url(#header)"/>
+  <text x="360" y="25" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">L2: microCompact — Old Result Placeholder Replacement</text>
+
+  <!-- Pain Point -->
+  <rect x="20" y="54" width="680" height="36" rx="6" fill="#fef2f2" stroke="#fca5a5" stroke-width="1"/>
+  <text x="35" y="70" fill="#991b1b" font-size="11" font-weight="600">Pain Point</text>
+  <text x="110" y="70" fill="#991b1b" font-size="11">Agent read 10 files in a row; reads 1-7 still sit in context in full but are no longer useful</text>
+
+  <!-- Before -->
+  <text x="155" y="114" fill="#64748b" font-size="12" font-weight="600" text-anchor="middle">Before (all 10 tool_result blocks in full)</text>
+  <rect x="20" y="122" width="310" height="95" rx="6" fill="#fff" stroke="#94a3b8" stroke-width="1"/>
+  <rect x="30" y="130" width="290" height="10" rx="2" fill="#e2e8f0"/>
+  <text x="38" y="138" fill="#94a3b8" font-size="8" font-family="monospace">Read file A: (full content, 3200 chars)...</text>
+  <rect x="30" y="145" width="290" height="10" rx="2" fill="#e2e8f0"/>
+  <text x="38" y="153" fill="#94a3b8" font-size="8" font-family="monospace">Read file B: (full content, 1800 chars)...</text>
+  <rect x="30" y="160" width="290" height="10" rx="2" fill="#e2e8f0"/>
+  <text x="38" y="168" fill="#94a3b8" font-size="8" font-family="monospace">Read file C: (full content, 4500 chars)...</text>
+  <rect x="30" y="175" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="38" y="183" fill="#92400e" font-size="8" font-family="monospace">Read file J: (full content, 2800 chars)</text>
+  <text x="175" y="212" fill="#ef4444" font-size="9" font-weight="600">7 old results waste ~25K chars</text>
+
+  <!-- Arrow -->
+  <line x1="335" y1="170" x2="375" y2="170" stroke="#ca8a04" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <!-- After -->
+  <text x="535" y="114" fill="#ca8a04" font-size="12" font-weight="600" text-anchor="middle">After (only the latest 3 kept in full)</text>
+  <rect x="390" y="122" width="310" height="95" rx="6" fill="#fefce8" stroke="#ca8a04" stroke-width="1"/>
+  <rect x="400" y="130" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="138" fill="#92400e" font-size="8" font-family="monospace">[Earlier result compacted. Re-run if needed.]</text>
+  <rect x="400" y="145" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="153" fill="#92400e" font-size="8" font-family="monospace">[Earlier result compacted. Re-run if needed.]</text>
+  <rect x="400" y="160" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="168" fill="#92400e" font-size="8" font-family="monospace">[Earlier result compacted. Re-run if needed.]</text>
+  <rect x="400" y="175" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="183" fill="#92400e" font-size="8" font-family="monospace">Read file J: (full content, 2800 chars)</text>
+  <text x="545" y="212" fill="#ca8a04" font-size="9" font-weight="600" text-anchor="middle">Keep only latest 3; first 7 become placeholders</text>
+
+  <!-- How -->
+  <rect x="20" y="228" width="680" height="62" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="35" y="248" fill="#1e3a5f" font-size="11" font-weight="600">How (teaching version)</text>
+  <text x="180" y="248" fill="#475569" font-size="10">Walk the tool_result blocks, keep the latest 3 in full, replace older ones with placeholders.</text>
+  <text x="35" y="264" fill="#1e3a5f" font-size="11" font-weight="600">Real CC</text>
+  <text x="95" y="264" fill="#475569" font-size="10">Clears old results via API cache_edits (without breaking prompt cache prefix), only for COMPACTABLE_TOOLS:</text>
+  <text x="95" y="280" fill="#94a3b8" font-size="9">Read, Bash, Grep, Glob, WebSearch, WebFetch, Edit, Write. Teaching version uses text placeholders to simulate the same effect.</text>
+</svg>
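Reviewer sketch (not part of the patch): the "How (teaching version)" line in micro-compact.en.svg describes keeping the latest 3 tool_result blocks and replacing older ones with a placeholder. A minimal Python version of that idea follows. The placeholder string and the keep-3 rule come from the diagram. The function name `micro_compact` and the message shape (a list of dicts whose `content` is a list of blocks with `type == "tool_result"`) are assumptions. It does not model the real `cache_edits` path or the COMPACTABLE_TOOLS filter mentioned in the diagram.

```python
PLACEHOLDER = "[Earlier result compacted. Re-run if needed.]"
KEEP_RECENT = 3  # number of newest tool results left intact, per the diagram


def micro_compact(messages, keep_recent=KEEP_RECENT):
    """Return a copy of `messages` where all but the newest `keep_recent`
    tool_result blocks have their content replaced by PLACEHOLDER."""
    # Collect (message index, block index) for every tool_result, oldest first.
    positions = [
        (mi, bi)
        for mi, msg in enumerate(messages)
        if isinstance(msg.get("content"), list)
        for bi, block in enumerate(msg["content"])
        if isinstance(block, dict) and block.get("type") == "tool_result"
    ]
    # Everything except the last `keep_recent` positions is stale.
    # (A plain positions[:-0] would be empty, hence the explicit branch.)
    stale = set(positions[:-keep_recent]) if keep_recent > 0 else set(positions)

    compacted = []
    for mi, msg in enumerate(messages):
        content = msg.get("content")
        if not isinstance(content, list):
            compacted.append(msg)  # plain-text message, nothing to compact
            continue
        new_blocks = [
            {**block, "content": PLACEHOLDER} if (mi, bi) in stale else block
            for bi, block in enumerate(content)
        ]
        compacted.append({**msg, "content": new_blocks})
    return compacted
```

For the diagram's scenario of 10 consecutive file reads, the first 7 results become placeholders and the last 3 keep their full content. The input list is left unmodified, so a caller can still snapshot the original messages before compacting.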
--- a/s08_context_compact/images/micro-compact.ja.svg
+++ b/s08_context_compact/images/micro-compact.ja.svg
@@ -0,0 +1,57 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 300" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/><stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#ca8a04"/>
+    </marker>
+  </defs>
+
+  <rect width="720" height="300" fill="#fafbfc" rx="8"/>
+  <rect x="0" y="0" width="720" height="38" fill="url(#header)" rx="8"/>
+  <rect x="0" y="30" width="720" height="8" fill="url(#header)"/>
+  <text x="360" y="25" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">L2: microCompact — 旧結果のプレースホルダー置換</text>
+
+  <!-- ペインポイント -->
+  <rect x="20" y="54" width="680" height="36" rx="6" fill="#fef2f2" stroke="#fca5a5" stroke-width="1"/>
+  <text x="35" y="70" fill="#991b1b" font-size="11" font-weight="600">ペインポイント</text>
+  <text x="115" y="70" fill="#991b1b" font-size="11">Agent が連続で 10 ファイルを読み込み、1〜7 回目の完全なファイル内容がコンテキストに残ったまま、場所を占有しつつ既に不要</text>
+
+  <!-- 圧縮前 -->
+  <text x="155" y="114" fill="#64748b" font-size="12" font-weight="600" text-anchor="middle">圧縮前（10 件の tool_result がすべて完全）</text>
+  <rect x="20" y="122" width="310" height="95" rx="6" fill="#fff" stroke="#94a3b8" stroke-width="1"/>
+  <rect x="30" y="130" width="290" height="10" rx="2" fill="#e2e8f0"/>
+  <text x="38" y="138" fill="#94a3b8" font-size="8" font-family="monospace">Read file A: (完全な内容, 3200 文字)...</text>
+  <rect x="30" y="145" width="290" height="10" rx="2" fill="#e2e8f0"/>
+  <text x="38" y="153" fill="#94a3b8" font-size="8" font-family="monospace">Read file B: (完全な内容, 1800 文字)...</text>
+  <rect x="30" y="160" width="290" height="10" rx="2" fill="#e2e8f0"/>
+  <text x="38" y="168" fill="#94a3b8" font-size="8" font-family="monospace">Read file C: (完全な内容, 4500 文字)...</text>
+  <rect x="30" y="175" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="38" y="183" fill="#92400e" font-size="8" font-family="monospace">Read file J: (完全な内容, 2800 文字)</text>
+  <text x="175" y="212" fill="#ef4444" font-size="9" font-weight="600">7 件の旧結果が ~25K 文字を無駄に占有</text>
+
+  <!-- 矢印 -->
+  <line x1="335" y1="170" x2="375" y2="170" stroke="#ca8a04" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <!-- 圧縮後 -->
+  <text x="535" y="114" fill="#ca8a04" font-size="12" font-weight="600" text-anchor="middle">圧縮後（最新 3 件のみ完全保持）</text>
+  <rect x="390" y="122" width="310" height="95" rx="6" fill="#fefce8" stroke="#ca8a04" stroke-width="1"/>
+  <rect x="400" y="130" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="138" fill="#92400e" font-size="8" font-family="monospace">[Earlier result compacted. Re-run if needed.]</text>
+  <rect x="400" y="145" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="153" fill="#92400e" font-size="8" font-family="monospace">[Earlier result compacted. Re-run if needed.]</text>
+  <rect x="400" y="160" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="168" fill="#92400e" font-size="8" font-family="monospace">[Earlier result compacted. Re-run if needed.]</text>
+  <rect x="400" y="175" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="183" fill="#92400e" font-size="8" font-family="monospace">Read file J: (完全な内容, 2800 文字)</text>
+  <text x="545" y="212" fill="#ca8a04" font-size="9" font-weight="600">最新 3 件のみ保持、前 7 件はプレースホルダー化</text>
+
+  <!-- 原理 -->
+  <rect x="20" y="228" width="680" height="62" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="35" y="248" fill="#1e3a5f" font-size="11" font-weight="600">方法（教学版）</text>
+  <text x="130" y="248" fill="#475569" font-size="10">tool_result を走査し、最新 3 件のみ完全保持、古いものはプレースホルダーに置換。</text>
+  <text x="35" y="264" fill="#1e3a5f" font-size="11" font-weight="600">実際の CC</text>
+  <text x="110" y="264" fill="#475569" font-size="10">API cache_edits で旧結果をクリア（prompt cache プレフィックスを破壊しない）、COMPACTABLE_TOOLS のみ対象：</text>
+  <text x="110" y="280" fill="#94a3b8" font-size="9">Read, Bash, Grep, Glob, WebSearch, WebFetch, Edit, Write。教学版はテキストプレースホルダーで同様の効果を模擬。</text>
+</svg>
--- a/s08_context_compact/images/micro-compact.svg
+++ b/s08_context_compact/images/micro-compact.svg
@@ -0,0 +1,57 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 300" font-family="system-ui, -apple-system, sans-serif">
+  <defs>
+    <linearGradient id="header" x1="0" y1="0" x2="1" y2="0">
+      <stop offset="0%" stop-color="#1e3a5f"/><stop offset="100%" stop-color="#2563eb"/>
+    </linearGradient>
+    <marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse">
+      <path d="M 0 0 L 10 5 L 0 10 z" fill="#ca8a04"/>
+    </marker>
+  </defs>
+
+  <rect width="720" height="300" fill="#fafbfc" rx="8"/>
+  <rect x="0" y="0" width="720" height="38" fill="url(#header)" rx="8"/>
+  <rect x="0" y="30" width="720" height="8" fill="url(#header)"/>
+  <text x="360" y="25" fill="#fff" font-size="14" font-weight="700" text-anchor="middle">L2: microCompact — 旧结果占位替换</text>
+
+  <!-- 痛点 -->
+  <rect x="20" y="54" width="680" height="36" rx="6" fill="#fef2f2" stroke="#fca5a5" stroke-width="1"/>
+  <text x="35" y="70" fill="#991b1b" font-size="11" font-weight="600">痛点</text>
+  <text x="75" y="70" fill="#991b1b" font-size="11">Agent 连续读了 10 个文件，第 1-7 次的完整文件内容还躺在上下文里，占着位置但早就没用了</text>
+
+  <!-- Before -->
+  <text x="155" y="114" fill="#64748b" font-size="12" font-weight="600" text-anchor="middle">压缩前（10 条 tool_result 全部完整）</text>
+  <rect x="20" y="122" width="310" height="95" rx="6" fill="#fff" stroke="#94a3b8" stroke-width="1"/>
+  <rect x="30" y="130" width="290" height="10" rx="2" fill="#e2e8f0"/>
+  <text x="38" y="138" fill="#94a3b8" font-size="8" font-family="monospace">Read file A: (完整内容, 3200 字符)...</text>
+  <rect x="30" y="145" width="290" height="10" rx="2" fill="#e2e8f0"/>
+  <text x="38" y="153" fill="#94a3b8" font-size="8" font-family="monospace">Read file B: (完整内容, 1800 字符)...</text>
+  <rect x="30" y="160" width="290" height="10" rx="2" fill="#e2e8f0"/>
+  <text x="38" y="168" fill="#94a3b8" font-size="8" font-family="monospace">Read file C: (完整内容, 4500 字符)...</text>
+  <rect x="30" y="175" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="38" y="183" fill="#92400e" font-size="8" font-family="monospace">Read file J: (完整内容, 2800 字符)</text>
+  <text x="175" y="212" fill="#ef4444" font-size="9" font-weight="600">7 条旧结果白占 ~25K 字符</text>
+
+  <!-- Arrow -->
+  <line x1="335" y1="170" x2="375" y2="170" stroke="#ca8a04" stroke-width="2" marker-end="url(#arrow)"/>
+
+  <!-- After -->
+  <text x="535" y="114" fill="#ca8a04" font-size="12" font-weight="600" text-anchor="middle">压缩后（只保留最近 3 条完整）</text>
+  <rect x="390" y="122" width="310" height="95" rx="6" fill="#fefce8" stroke="#ca8a04" stroke-width="1"/>
+  <rect x="400" y="130" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="138" fill="#92400e" font-size="8" font-family="monospace">[Earlier result compacted. Re-run if needed.]</text>
+  <rect x="400" y="145" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="153" fill="#92400e" font-size="8" font-family="monospace">[Earlier result compacted. Re-run if needed.]</text>
+  <rect x="400" y="160" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="168" fill="#92400e" font-size="8" font-family="monospace">[Earlier result compacted. Re-run if needed.]</text>
+  <rect x="400" y="175" width="290" height="10" rx="2" fill="#fef3c7"/>
+  <text x="408" y="183" fill="#92400e" font-size="8" font-family="monospace">Read file J: (完整内容, 2800 字符)</text>
+  <text x="545" y="212" fill="#ca8a04" font-size="9" font-weight="600">只保留最近 3 条，前 7 条变占位</text>
+
+  <!-- 原理 -->
+  <rect x="20" y="228" width="680" height="62" rx="6" fill="#f8fafc" stroke="#cbd5e1" stroke-width="1"/>
+  <text x="35" y="248" fill="#1e3a5f" font-size="11" font-weight="600">怎么做（教学版）</text>
+  <text x="115" y="248" fill="#475569" font-size="10">遍历 tool_result，只保留最近 3 条完整，更旧的替换为占位符。</text>
+  <text x="35" y="264" fill="#1e3a5f" font-size="11" font-weight="600">真实 CC</text>
+  <text x="95" y="264" fill="#475569" font-size="10">通过 API cache_edits 清除旧结果（不破坏 prompt cache 前缀），仅对 COMPACTABLE_TOOLS 生效：</text>
+  <text x="95" y="280" fill="#94a3b8" font-size="9">Read, Bash, Grep, Glob, WebSearch, WebFetch, Edit, Write。教学版用文本占位模拟同样效果。</text>
+</svg>