# LLM-SKILL

Runnable Implementation: https://github.com/hanyuancheung/llm-skill

> This is a first-person record of how I took Karpathy's [LLM-Wiki](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) intuition and turned it, step by step, into the `llm-skill` project. I'll walk through where I started, why I chose this particular layering, and what each layer solves.

---

## 1. Starting point: what I actually took away from LLM-Wiki

Karpathy's LLM-Wiki throws out one core intuition:

> **Humans don't get smarter by memorizing "every conversation" — they distill conversations into entries, and next time a similar problem shows up, they look up the entry first.**

Translated into the LLM-Agent context, three signals jumped out at me:

1. **Context is not free.** Every chunk you stuff into a prompt burns tokens and burns attention. You should only pack in "the small slice this task actually needs".
2. **Experience needs a middle layer.** Raw conversations are, well, raw; model weights are frozen. What's missing in between is **an incrementally editable, structured experience layer** — i.e. the Wiki.
3. **The Wiki isn't a static doc; it's a living organism.** Somebody writes it (distilling), somebody reads it (executing), somebody curates it (routing and health). These three roles **must** be separated.

So I gave myself a design brief:

> Build a **minimal, runnable, self-evolving Skill system** that lets an Agent, on every task: **read as little as possible → do the right thing → sink whatever it newly learned back into the Wiki**.

---

## 2. My three-layer mental model: Raw / Wiki / Schema

Before writing any code I forced the whole system into three layers, and laid down a rule: **each layer only ever talks about one thing**. This is the deepest invariant of the project.

| Layer | What it is | Who writes it | Who reads it |
|---|---|---|---|
| **Raw layer** | Chat, tool output, errors, pitfalls as they happen | User + Agent, ephemeral | Only `distill` consumes it |
| **Wiki layer** | `skills/<name>/SKILL.md` | **Only `distill`** can write | `execute` lazy-loads on demand |
| **Schema layer** | `AGENT.md` (routing + contract) | **Only `guide`** can write | `execute` must read at the start of every task |

**The decision that matters most here**: I cleanly separated "domain knowledge" from "routing rules". **Not a single line of domain knowledge is allowed inside `AGENT.md`.** It answers only one question: "in what situation do I use which skill, and how do they cooperate?" Only then does the Schema stay stable, short, and trustworthy.

This model lands directly in the repo layout:

```
llm-skill/
├── AGENT.md            ← Schema (single source of truth)
├── skills/
│   ├── execute/        ← meta-skill: run tasks
│   ├── distill/        ← meta-skill: write the Wiki
│   ├── guide/          ← meta-skill: maintain the Schema
│   └── _template/      ← scaffold for domain skills
└── scripts/validate.py ← contract guard
```

---

## 3. The closed loop: Execute → Distill → Guide

With only layering, the system is static. I wanted **evolution**, so I wrapped the three layers in a loop:

```
          ┌────────────────────────────────────────────────┐
          ▼                                                │
[Execute] ──traces──► [Distill] ──skill CRUD──► [Guide]
  run tasks           refine into skills        maintain routing
  reads Schema                                  writes Schema
     ▲                                                │
     │                                                │
     └──── next task routed by the new rules ◄────────┘
```

I gave the three meta-skills **mutually exclusive write rights** — this is the single decision that keeps the whole thing from rotting:

| Meta-skill | Reads | Writes | Absolutely must not |
|---|---|---|---|
| `execute` | `AGENT.md` + up to 3 `SKILL.md` files | **Nothing** (read-only on skills) | Batch-read `skills/`; edit any skill |
| `distill` | Raw traces + existing skills | Only `skills/<name>/SKILL.md` | Touch `AGENT.md` |
| `guide` | Every skill's front-matter | Only the routing table + changelog in `AGENT.md` | Stuff domain knowledge into `AGENT.md` |

**Why make writes exclusive?**
Because an Agent that mutates its own rules while executing is a debugging nightmare. Once writes are exclusive, every behavioral change can be traced to "which meta-skill touched which file in which turn".
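
The write-rights table above can be sketched as a small guard. This is only an illustration under my own assumptions, not code from the repo: `may_write` is a hypothetical helper, and the path rules are a literal reading of the "Writes" column.

```python
from pathlib import PurePosixPath


def may_write(meta_skill: str, path: str) -> bool:
    """Return True if `meta_skill` may write the repo-relative `path`.

    Hypothetical helper: the rules below mirror the "Writes" column of the
    table, nothing more.
    """
    parts = PurePosixPath(path).parts
    if meta_skill == "distill":
        # Only skills/<name>/SKILL.md, with no ".." smuggled in as <name>.
        return (
            len(parts) == 3
            and parts[0] == "skills"
            and parts[1] != ".."
            and parts[2] == "SKILL.md"
        )
    if meta_skill == "guide":
        # Only the Schema file itself.
        return parts == ("AGENT.md",)
    # `execute` (and anything unknown) is read-only.
    return False
```

With a check like this, a stray write such as `may_write("execute", "skills/review-go-pr/SKILL.md")` is rejected before it happens, which is what makes each change attributable to one meta-skill.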

---

## 4. Why a skill must be "contract + body"

This is the **SKILL contract** I locked down as early as v0.1.0 (see `AGENT.md` §3):

- **front-matter (the contract)**: `name / version / status / triggers / dependencies / owner / updated` — all seven required, no exceptions.
- **body (six fixed sections)**: What / When (at least 2 positive + 2 negative examples) / How (numbered SOP) / Examples / Pitfalls / Changelog.

Three things I particularly care about:

1. **Two vital signs: `version` + `status`**
   - `version` supports semantic upgrades: a patch bump for fixes, a minor bump for an added SOP step, a major bump for a contract change.
   - `status` powers the **lifecycle state machine**: `experimental → active → deprecated → removed`. A skill's whole life is trackable, from birth to retirement.

2. **Every `Pitfall` must come from a real trace**
   This is a hard rule I enforce on myself: pitfalls that grow out of raw usage are the valuable ones; fabricated pitfalls are pollution. The `distill` SOP explicitly uses "can every Pitfall be stated as *if A then B*?" as a quality gate.

3. **Hard cap on lazy loading: at most 3 `SKILL.md` per task**
   This maps directly to the opening "context is not free". Step 1 of the `execute` SOP reads: "**Read only §2 of `AGENT.md`; do NOT `ls skills/` or batch-read.**"
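
As a rough sketch of the contract described above, the seven required front-matter fields and the lifecycle state machine fit in a few lines. The names `REQUIRED_FIELDS`, `LIFECYCLE`, `missing_fields`, and `can_transition` are hypothetical, and I am assuming the lifecycle is strictly linear (no skipping states), which the arrow chain suggests but does not guarantee.

```python
# The seven required front-matter fields, as listed in the contract.
REQUIRED_FIELDS = (
    "name", "version", "status", "triggers", "dependencies", "owner", "updated",
)

# Assumption: only the forward steps shown in the arrow chain are legal.
LIFECYCLE = {
    "experimental": {"active"},
    "active": {"deprecated"},
    "deprecated": {"removed"},
    "removed": set(),
}


def missing_fields(front_matter: dict) -> list[str]:
    """Return the required fields absent from a parsed front-matter mapping."""
    return [field for field in REQUIRED_FIELDS if field not in front_matter]


def can_transition(old_status: str, new_status: str) -> bool:
    """Return True if `old_status -> new_status` is one forward lifecycle step."""
    return new_status in LIFECYCLE.get(old_status, set())
```

For example, `missing_fields({"name": "x", "version": "0.1.0", "status": "experimental"})` reports the four fields still to be filled in, and `can_transition("active", "experimental")` is `False` because the machine never moves backwards in this sketch.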

---

## 5. How I hardened it across three iterations

This didn't land in one shot. I filled the holes version by version — this section maps directly onto the repo's `CHANGELOG.md`.

### v0.1.0 — stand up the skeleton

I only shipped the minimal loop:
- `AGENT.md`: routing table + SKILL contract + lifecycle state machine + anti-patterns.
- Three meta-skill directories: `execute / distill / guide`.
- The `_template` scaffold.

**What this version solves**: the idea runs end-to-end — the Agent reads the Schema, lazy-loads skills, and reminds itself to distill after finishing.

**What this version lacks**:
- English only; not friendly to non-English users.
- No entrypoint adapters for different Agent runtimes (Codex / Claude Code / HERMES).
- No mechanical guard whatsoever — the whole contract relied on "trust the Agent to obey".

### v0.2.0 — make it genuinely installable and verifiable

I added four things:

1. **Bilingual mirrors**: every skill ships both `SKILL.md` (English, default) and `SKILL_zh.md` (Chinese mirror). The front-matter fields `name/version/status` **must match**; the body may be translated.
2. **Thin entrypoint shells for multiple runtimes**: `AGENTS.md` (Codex), `CLAUDE.md` (Claude Code), `HERMES.md` (HERMES) — all **purely shells** that defer to `AGENT.md`. Same Schema, any runtime.
3. **`install.sh`**: one-shot layout check + validator invocation.
4. **`scripts/validate.py`**: verifies three things — front-matter shape, directory name equals `name`, and bi-directional consistency between `AGENT.md` §2 and the real `skills/` tree.
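
The bi-directional consistency check is the least obvious of the three, so here is a minimal sketch of the idea. It is not the actual `validate.py`: `skills_on_disk` and `registry_drift` are hypothetical names, parsing `AGENT.md` §2 into a set of names is left out, and skipping underscore-prefixed directories such as `_template` is my assumption.

```python
from pathlib import Path


def skills_on_disk(repo_root: Path) -> set[str]:
    """Collect directory names under skills/ that contain a SKILL.md.

    Assumption: underscore-prefixed directories (e.g. _template) are
    scaffolding and are not expected in the routing table.
    """
    return {
        skill_file.parent.name
        for skill_file in repo_root.glob("skills/*/SKILL.md")
        if not skill_file.parent.name.startswith("_")
    }


def registry_drift(registered: set[str], on_disk: set[str]) -> dict[str, list[str]]:
    """Compare the routing table against the tree in both directions."""
    return {
        "registered_but_missing": sorted(registered - on_disk),
        "on_disk_but_unregistered": sorted(on_disk - registered),
    }
```

Both lists being empty means the routing table and the `skills/` tree agree; anything else is a reason to fail the check.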

**The breakthrough of this version**: upgrade from "trust the Agent to behave" to "a mechanical guard that can reject bad commits".

### v0.3.0 — the Hook protocol: mechanize "is there anything I still need to do before I finish?"

This is the iteration I'm proudest of.

**The problem**:
After `execute` finishes a task, it should evaluate "did this run produce anything worth distilling?" But "should" is not enough: Agents will cut corners, skip the evaluation, and rationalize the skip with "nothing worth distilling this time".

**My solution**: introduce the **Hook protocol** (`AGENT.md` §10) — three lifecycle anchor points:

| Phase | When it fires | Typical purpose |
|---|---|---|
| `pre` | After routing, before running `How` | Guard conditions; `skip` / `warn` / `proceed` |
| `post` | Before `execute` declares "done" | Distill judgement, validators, side-effect checks |
| `on_error` | On any raise or user abort | Cleanup, rollback, log extraction |

**I whitelisted the set of actions**, so the Agent cannot invent new ones:
`proceed / skip / warn / propose-distill / propose-guide / require-validator / run-script:<relpath>`

**The single most important piece: the default `should-distill` post-hook**

```yaml
post:
  - name: should-distill
    when: "distill-candidates is non-empty"
    action: propose-distill
```

This hook **auto-applies to every skill, cannot be removed, cannot be weakened**. In plain English:

> Before any task ends, the Agent **must** explicitly evaluate "do I need to distill?" Silent skipping is forbidden.
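
One way to picture "cannot be removed, cannot be weakened" is a merge step that always puts the default hook in front of whatever a skill declares. The sketch below is hypothetical (`DEFAULT_POST_HOOK` and `effective_post_hooks` are my names), and silently dropping a same-name redefinition is an assumption; a real implementation might reject it outright instead.

```python
# The default hook from the YAML above, as a plain mapping.
DEFAULT_POST_HOOK = {
    "name": "should-distill",
    "when": "distill-candidates is non-empty",
    "action": "propose-distill",
}


def effective_post_hooks(declared: list[dict]) -> list[dict]:
    """Return the post hooks that actually run for a skill.

    The default hook always comes first. Any declared hook reusing its name
    is dropped, so a skill cannot override or weaken it (assumed behaviour).
    """
    extras = [hook for hook in declared if hook.get("name") != DEFAULT_POST_HOOK["name"]]
    return [dict(DEFAULT_POST_HOOK), *extras]
```

A skill that declares no post hooks still gets `should-distill`, and a skill that tries to redefine it with `when: "never"` gets the original back.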

I then mechanized its semantics in `validate.py`:
- `skip` is only legal in `pre` (no sneaking out right before the finish line).
- `propose-distill / propose-guide / require-validator` are illegal in `pre` (they are closing actions).
- The target of `run-script:<relpath>` must exist, must be executable, and must live inside the skill directory (path-escape guard).
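
The three rules above could be checked along these lines. This is a sketch under my own assumptions rather than the real `validate.py`: `hook_errors` and the error strings are hypothetical, and the path-escape guard is implemented here by resolving the script path and requiring it to stay under the skill directory.

```python
import os
from pathlib import Path

# The whitelisted actions from the Hook protocol (run-script handled separately).
ACTIONS = {"proceed", "skip", "warn", "propose-distill", "propose-guide", "require-validator"}
CLOSING_ACTIONS = {"propose-distill", "propose-guide", "require-validator"}
SCRIPT_PREFIX = "run-script:"


def hook_errors(phase: str, action: str, skill_dir: Path) -> list[str]:
    """Return rule violations for one hook; an empty list means it is legal."""
    if action.startswith(SCRIPT_PREFIX):
        base = skill_dir.resolve()
        # resolve() collapses ".." and symlinks, so escapes become visible.
        target = (base / action[len(SCRIPT_PREFIX):]).resolve()
        if not target.is_relative_to(base):  # Python 3.9+
            return ["script escapes the skill directory"]
        if not target.is_file():
            return ["script does not exist"]
        if not os.access(target, os.X_OK):
            return ["script is not executable"]
        return []
    errors = []
    if action not in ACTIONS:
        errors.append(f"unknown action: {action}")
    if action == "skip" and phase != "pre":
        errors.append("skip is only legal in pre")
    if action in CLOSING_ACTIONS and phase == "pre":
        errors.append(f"{action} is illegal in pre")
    return errors
```

Resolving before comparing is the important design choice: a purely textual check on the relative path would miss `../` sequences and symlinks that point outside the skill directory.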

And each meta-skill now declares its own hooks:
- `distill.pre.have-candidates`: if the candidate list is empty, `skip` (so we never distill nothing).
- `distill.post.propose-guide` + `distill.post.validate-skills`: after writing skills, prompt `guide` to register them, and force the validator to run.
- `guide.post.validate-after-guide`: any `AGENT.md` edit must pass the validator.

**The breakthrough of this version**: the loop no longer depends on "the Agent remembering to complete it" — **the contract enforces completion**.

---

## 6. What a single task actually looks like inside this system

Let's walk through a concrete flow: "help me review a Go PR". The `review-go-pr` skill used here is illustrative. This is the narrative behind the pipeline diagram in the README.

1. **Execute starts**
   - Reads only §2 of `AGENT.md`; matches keywords `review / Go / PR` to `skills/review-go-pr/` (assume it already exists).
   - Lazy-loads its `SKILL.md`; expands `references/` only if needed.
2. **Run `pre` hooks**
   - `needs-go-toolchain` detects this is indeed a Go project → `proceed`.
3. **Run `How`**
   - Walk the SOP step by step. Along the way, spot a pitfall the skill didn't mention: "Go 1.22's `range over int` is being abused in this PR."
   - Push it into `distill-candidates`.
4. **Run `post` hooks**
   - Default `should-distill` fires → list is non-empty → propose `distill`.
   - User agrees.
5. **Distill**
   - Classify: the existing skill is incomplete → **Update** `review-go-pr/SKILL.md`, add a new Pitfall "range over int abuse", bump version `0.3.0 → 0.4.0`.
   - `post.validate-skills` forces `python3 scripts/validate.py` → passes.
   - `post.propose-guide` suggests invoking `guide`.
6. **Guide**
   - Scans front-matter; only the version changed, no registry row needs editing — just append one line to `AGENT.md` §7 changelog.
   - `post.validate-after-guide` runs the validator again → passes.
7. **Next time**
   - Anyone else who brings in a Go PR is routed to the updated skill and gets the freshly added pitfall up front. The loop is back to the start.
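
Step 1 of this walkthrough, keyword routing plus the three-skill cap, can be sketched as follows. Everything here is an assumption for illustration: `route` is a hypothetical function, the routing table is modelled as a plain dict of skill name to trigger keywords, and ranking by hit count is my choice, not something `AGENT.md` is known to specify.

```python
MAX_SKILLS_PER_TASK = 3  # the lazy-loading cap from the execute contract


def route(query: str, routing_table: dict[str, list[str]]) -> list[str]:
    """Pick at most MAX_SKILLS_PER_TASK skills whose triggers appear in the query.

    Naive whole-word matching; skills are ranked by number of matching
    triggers, with ties broken alphabetically for determinism.
    """
    words = set(query.lower().split())
    scored = []
    for skill, triggers in routing_table.items():
        hits = sum(1 for trigger in triggers if trigger.lower() in words)
        if hits:
            scored.append((hits, skill))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [skill for _, skill in scored[:MAX_SKILLS_PER_TASK]]
```

Given a table containing `"review-go-pr": ["review", "go", "pr"]`, the query "help me review a Go PR" routes to that skill alone, and no query can ever pull in more than three `SKILL.md` files.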

---

## 7. Patterns I deliberately avoided (anti-patterns)

All of these are explicitly banned in `AGENT.md` §6. I list them here so they don't silently creep back in:

- ❌ **Stuffing domain knowledge into `AGENT.md`** → sink it into a skill, or the Schema will bloat into an unmanageable index.
- ❌ **Bypassing the routing table and grepping `skills/`** → that's asking the Agent to guess, and it kills lazy-loading.
- ❌ **Skills importing each other's internals** → declare explicitly via `dependencies`.
- ❌ **A single `SKILL.md` over 500 lines** → spill into `references/` so the main file stays readable in one pass.
- ❌ **Duplicated knowledge across skills** → Merge, or extract a shared skill.
- ❌ **Using `pre.skip` as a substitute for "narrower triggers"** → fix the triggers instead; don't patch with hooks.
- ❌ **Silencing the default `should-distill` just because it's "nagging"** → that's exactly the lifeline of the loop.

---

## 8. Where this can evolve next (my roadmap)

I intentionally parked the project at v0.3.0, because going further requires **usage data**, not more design. I left room for a few future moves:

1. **Real domain-skill distillation**: today `skills/` contains only three meta-skills + the template; there are no real domain skills (`review-go-pr`, `debug-cuda-oom`, etc.). That needs the system to actually run for a while and let `distill` produce something real.
2. **A/B evaluation of `triggers`**: current triggers are just naive keyword + intent-string matching. The next step is to replay historical queries and use real hit-rates to prune dead triggers (the `guide` SOP already reserves this slot).
3. **Cross-repo skill sharing**: today a skill set belongs to a single repo. Could we stand up a skill registry so multiple projects share them? That requires `dependencies` to support cross-repo addressing — a big topic on its own.
4. **A script ecosystem around hooks**: `run-script:<relpath>` already supports local scripts, but there's no shared standard library yet. I could build a `hooks/` commons (e.g., "a `go-vet` post-hook shared by all Go skills").

---

## 9. Summary: what I'm actually shipping

In one sentence:

> I've built a minimal, runnable, self-evolving LLM Skill system: **a three-layer mental model (Raw / Wiki / Schema) + three meta-skills with mutually exclusive write rights (Execute / Distill / Guide) + a set of mechanical guards (contract + validator + Hook protocol)** — so the Agent can "read as little as possible" while "learning as much as possible".

The value isn't "yet another skill format". It's:

- **Grounding the LLM-Wiki intuition in a verifiable contract.**
- **Minimizing the parts that depend on "the Agent behaving well"** — everything else is mechanically guaranteed by front-matter, validator, and the Hook protocol.
- **Letting any Agent runtime (Codex / Claude Code / HERMES / custom) plug in via a thin shell file**, without being locked to any implementation.

---