Files
analysis_claude_code/s07_skill_loading
gui-yue 1baf1aca5a Follow up PR #265: refine chapters, diagrams, and add S20 (#283)
* feat: s01-s14 docs quality overhaul — tool pipeline, single-agent, knowledge & resilience

Rewrite code.py and README (zh/en/ja) for s01-s14, each chapter building
incrementally on the previous. Key fixes across chapters:

- s01-s04: agent loop, tool dispatch, permission pipeline, hooks
- s05-s08: todo write, subagent, skill loading, context compact
- s09-s11: memory system, system prompt assembly, error recovery
- s12-s14: task graph, background tasks, cron scheduler

All chapters CC source-verified. Code inherits fixes forward (PROMPT_SECTIONS,
json.dumps cache, real-state context, can_start dep protection, etc.).

* feat: s15-s19 docs quality overhaul — multi-agent platform: teams, protocols, autonomy, worktree, MCP tools

Rewrite code.py and README (zh/en/ja) for s15-s19, the multi-agent platform
chapters. Each chapter inherits all previous fixes and adds one mechanism:

- s15: agent teams (TeamCreate, teammate threads, shared task list)
- s16: team protocols (plan approval, shutdown handshake, consume_inbox)
- s17: autonomous agents (idle polling, auto-claim, consume_lead_inbox)
- s18: worktree isolation (git worktree, bind_task, cwd switching, safety)
- s19: MCP tools (MCPClient, normalize_mcp_name, assemble_tool_pool, no cache)

All appendix source code references verified against CC source. Config priority
corrected: claude.ai < plugin < user < project < local.

* fix: 5 regressions across s05-s19 — glob safety, todo validation, memory extraction, protocol types, dep crash

- s05-s09: glob results now filter with is_relative_to(WORKDIR) (inherited from s02)
- s06-s08: todo_write validates content/status required fields (inherited from s05)
- s09: extract_memories uses pre-compression snapshot instead of compacted messages
- s16: submit_plan docstring clarifies protocol-only (not code-level gate)
- s17-s19: match_response restores type mismatch validation (from s16)
- s17-s19: claim_task deps list handles missing dep files without crashing

* fix: s12 Todo V2 logic reversal, s14/s15 cron range validation, s18/s19 worktree name validation

- s12 README (zh/en/ja): fix Todo V2 direction — interactive defaults to Task,
  non-interactive/SDK defaults to TodoWrite. Fix env var name to
  CLAUDE_CODE_ENABLE_TASKS (not TODO_V2).
- s14/s15: add _validate_cron_field with per-field range checks (minute 0-59,
  hour 0-23, dom 1-31, month 1-12, dow 0-6), step > 0, range lo <= hi.
  Replace old try/except validation that only caught exceptions.
- s18/s19: add validate_worktree_name() to remove_worktree and keep_worktree,
  not just create_worktree.

* fix: align s16-s19 teaching tool consistency

* fix pr265 chapter diagrams

* Add comprehensive s20 harness chapter

* Fix chapter smoke test regressions

* Clarify README tutorial track transition

---------

Co-authored-by: Haoran <bill-billion@outlook.com>
2026-05-20 21:45:38 +08:00
..

s07: Skill Loading — Load Only When Needed

中文 · English · 日本語

s01 → s02 → s03 → s04 → s05 → s06 → s07s08 → s09 → ... → s20

"Load when needed, don't stuff the prompt" — Inject via tool_result, not system prompt.

Harness Layer: Knowledge — load on demand, don't fill the context.


The Problem

Your project has a React component spec, a SQL style guide, and an API design doc. You want the Agent to follow these specs automatically. The most straightforward idea — stuff them all into the system prompt:

SYSTEM = (
    f"You are a coding agent. "
    + open("docs/react-style.md").read()       # 2000 lines
    + open("docs/sql-style.md").read()         # 1500 lines
    + open("docs/api-design.md").read()        # 3000 lines
)

6500 lines of system prompt. The Agent carries these docs on every LLM call — whether it's changing a CSS color or fixing a SQL query. 99% of the content is irrelevant to the current task, burning tokens for nothing.


The Solution

Skill Overview

The minimal hook structure, todo_write, and sub-Agent from the previous chapter are preserved. This chapter focuses on the new load_skill tool. At startup, inject the skill catalog into the SYSTEM prompt; at runtime, register one more tool to load full content, spending tokens only when used.

Two-level design:

Level Location Timing Cost
1. Catalog system prompt Injected at startup (harness scans skills/) ~100 tokens/skill, carried every turn
2. Content tool_result When Agent calls load_skill ~2000 tokens/skill, on demand

The dispatch mechanism is unchanged, load_skill auto-dispatches via TOOL_HANDLERS[block.name].


How It Works

skills/ directory, one subdirectory per skill, each containing a SKILL.md file:

skills/
  agent-builder/SKILL.md
  code-review/SKILL.md
  mcp-builder/SKILL.md
  pdf/SKILL.md

Level 1: Inject catalog at startup: the harness calls _scan_skills() at startup to scan the skills/ directory, parsing each SKILL.md's YAML frontmatter (name, description) into a SKILL_REGISTRY dictionary. list_skills() generates the catalog from the registry, injected into the SYSTEM prompt. The Agent sees "which skills I have available" every turn, with no extra API calls:

SKILL_REGISTRY: dict[str, dict] = {}

def _scan_skills():
    if not SKILLS_DIR.exists():
        return
    for d in sorted(SKILLS_DIR.iterdir()):
        if not d.is_dir():
            continue
        manifest = d / "SKILL.md"
        if manifest.exists():
            raw = manifest.read_text()
            meta, body = _parse_frontmatter(raw)
            name = meta.get("name", d.name)
            desc = meta.get("description", raw.split("\n")[0].lstrip("#").strip())
            SKILL_REGISTRY[name] = {"name": name, "description": desc, "content": raw}

_scan_skills()  # runs once at startup

def list_skills() -> str:
    return "\n".join(f"- **{s['name']}**: {s['description']}" for s in SKILL_REGISTRY.values())

def build_system() -> str:
    catalog = list_skills()
    return (
        f"You are a coding agent at {WORKDIR}. "
        f"Skills available:\n{catalog}\n"
        "Use load_skill to get full details when needed."
    )

SYSTEM = build_system()

Level 2: load_skill: the Agent decides "I need the SQL style guide" and calls load_skill("sql-style"). Lookup goes through the registry, not file paths, eliminating path traversal risk. The content is injected via tool_result:

def load_skill(name: str) -> str:
    skill = SKILL_REGISTRY.get(name)
    if not skill:
        return f"Skill not found: {name}"
    return skill["content"]

The key distinction: skill content is not part of the system prompt. It enters the current messages as a tool result. Subsequent calls carry it along with the history until context compaction, truncation, or session end. This naturally connects to s08's compact: on-demand loading solves "don't carry what you shouldn't", compact solves "how to drop what you should."


Changes from s06

Component Before (s06) After (s07)
Tool count 7 (bash, read, write, edit, glob, todo_write, task) 8 (+load_skill)
Knowledge loading None Two-level: startup catalog in SYSTEM + runtime load_skill
SYSTEM prompt Static string Startup scan of skills/ injects catalog
Skill registry None SKILL_REGISTRY (populated at startup, prevents path traversal)
Loop Unchanged Unchanged (skill tool auto-dispatches)

Try It

cd learn-claude-code
python s07_skill_loading/code.py

Try these prompts:

  1. What skills are available?
  2. Load the code-review skill and follow its instructions
  3. I need to do a code review -- load the relevant skill first

What to watch for: Does the Agent know available skills from the SYSTEM catalog? Does [HOOK] load_skill appear when full instructions are needed? Does the answer use the loaded skill's instructions?


What's Next

On-demand loading solved "don't carry what you shouldn't." But another problem looms: after the Agent works for 30 minutes, the messages list fills up with intermediate process. Old tool_results, stale file contents, occupying context but adding no value.

→ s08 Context Compact: A four-layer compaction strategy. Cheap layers run first, expensive layers run last.

Dive into CC Source Code

The following is based on analysis of CC source code loadSkillsDir.ts, SkillTool.ts, bundledSkills.ts, commands.ts.

1. Skill Sources: Not Just One skills/ Directory

The teaching version assumes all skills live in a skills/ directory. CC loads from multiple sources spread across multiple files: loadSkillsDir.ts handles user/project/--add-dir directories and legacy commands (.claude/commands/); bundledSkills.ts handles built-in skills; SkillTool.ts handles MCP remote skills; commands.ts handles command aggregation. Types include managed/policy skills, user skills (~/.claude/skills/), project skills (.claude/skills/), --add-dir skills, legacy commands, dynamic skills, conditional skills (with paths frontmatter, activated by file path), bundled skills, plugin skills, MCP skills.

2. SKILL.md Frontmatter — Common Fields

CC's SKILL.md YAML frontmatter is parsed by parseSkillFrontmatterFields() in loadSkillsDir.ts. Common fields include:

Field Purpose
name / description Display name and description
when_to_use Guides the model on when to invoke
allowed-tools Auto-allow list of tools available to the skill
context inline (default) or fork (run as sub-Agent)
model Model override (haiku/sonnet/opus/inherit)
hooks Skill-level hook configuration
paths Glob patterns for conditional activation
user-invocable Users can invoke via /name

The complete field list changes across versions; above are the core fields relevant to the teaching version.

3. Precise Implementation of Two-Level Loading

  1. Catalog (at startup): getSkillDirCommands() scans directory → registers as Command objects containing only metadata. getSkillListingAttachments() formats the skill list as attachments, budgeted at ~1% of the context window (cap 8000 characters).
  2. Load (on invocation): Model calls Skill tool (input fields are skill + optional args; teaching version uses name) → getPromptForCommand() expands full SKILL.md content → SkillTool returns a tool_result with display text "Launching skill: {name}", while the actual skill content is injected via newMessages. The teaching version merges both into "injected via tool_result" as a simplification.

The Teaching Version's Simplification Is Intentional

  • Multiple files and sources → 1 skills/ directory: sufficient to demonstrate the core concept of two-level loading
  • Multiple frontmatter fields → only parse name/description: reduces parsing complexity
  • Forked skills (context: 'fork') → omitted: the teaching version only expands inline skill loading
  • Skill tool input skill+args → teaching version uses name: avoids extra argument parsing complexity