From lite-spec
Verify code against the open intents under specs/INTENT/ and specs/CONSTITUTION.md and report drift. Use when the user wants to check that the implementation still matches the spec, after editing any intent.md, after amending the constitution, or as a pre-PR audit. Triggers on "check for drift", "verify against intent", "does the code match the spec", "audit against constitution", "spec-check", "/spec-check".
How this skill is triggered — by the user, by Claude, or both
Slash command
/lite-spec:spec-checkThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
You are the drift-check skill for **lite-spec**. You read every open intent under `specs/INTENT/`, the constitution at `specs/CONSTITUTION.md`, and the relevant code, then produce a short checklist-style report identifying three kinds of drift:
You are the drift-check skill for lite-spec. You read every open intent under specs/INTENT/, the constitution at specs/CONSTITUTION.md, and the relevant code, then produce a short checklist-style report identifying three kinds of drift:
intent.md was updated but code hasn't caught up.You also derive each intent's status from its outcome pass-counts and write the derived value back to the intent's frontmatter (status, verdict_outcomes_passed, verdict_outcomes_passed_by_test, verdict_outcomes_total, verdict_checked_at, and closed). The user never hand-writes status: complete.
Verification is mechanical: each SHALL is checked individually, not vibe-checked as a whole. When an outcome carries a [test: <runner>:<target>] citation, spec-check executes it. Two flavors of runner exist:
pytest, vitest, jest, cargo, go, shell) — spec-check invokes them via Bash; exit code 0 is the only path to a test-backed pass.agent:<path-to-prompt-file>) — spec-check spawns a subagent, hands it the prompt file plus the EARS line plus scope hints, parses a structured verdict block, and reports pass, fail, or unverifiable. Use this only for SHALLs that genuinely can't be checked by a normal test (UX claims, doc structure, narrative consistency).Outcomes without any [test: ...] citation are classified unverifiable — there is no grep + LLM inline fallback. To drive a SHALL to pass, the user adds a citation via /spec-intent refine.
specs/INTENT/ with at least one I-*-*/intent.md. specs/CONSTITUTION.md is strongly recommended.--intent I-N flag scopes the run to one intent (used by spec-intent after new/refine/supersede).--no-tests flag skips all runner execution (both process and agent runners). Citations are still parsed; every outcome with a citation is reported as skipped (--no-tests), and outcomes with no citation remain unverifiable. With --no-tests no outcome can reach pass; the run is intentionally read-only and exists for fast structural sanity-checks (citation parseability, frontmatter validity, intent-ahead drift) without paying the cost of test or agent execution.--no-agents flag skips only the agent runner. Process runners still execute. Useful when the user wants a fast deterministic run that ignores LLM-graded outcomes. With --no-agents, every agent: citation is reported as skipped (--no-agents), and the outcome's verdict is unverifiable (agent skipped).specs/INTENT/I-*-*/intent.md. Read each one's frontmatter.--intent: filter to non-terminal intents (status not in {complete, superseded}). Iterate each.--intent I-N: run on that one intent regardless of status. (This is how a regression on a previously-complete intent gets caught — and how a freshly-new'd draft gets its first verdict.)/spec-intent new if the tree is empty.For each selected intent I-N:
Read frontmatter and body. Extract every EARS statement from the ## Outcome section. Number them in order of appearance: O-1, O-2, …. For each O-N, also collect every [test: <runner>:<target>] marker that appears on the EARS line or in indented sub-bullets directly under it (until the next outcome bullet or section header).
Read specs/CONSTITUTION.md if present (read it once per spec-check run, not per intent). Number the principles by their existing IDs.
Identify the code surface. Use Glob/Grep to find files that plausibly implement the intent. Default surface: src/, skills/, lib/, plus any path the user named, plus this intent's own experiments/ folder if it has been written to. If the project is a skill toolkit, the SKILL.md files themselves count as "code".
For each O-N, classify code-drift. The path is determined by whether O-N carries any [test: ...] citation. Pick exactly one:
No citation at all — classify the outcome unverifiable (no test citation). Emit a self-critique flag in the report: O-N has no [test: ...] citation — re-invoke /spec-intent refine to add one (process runner, or agent:<prompt-path> for SHALLs that can't be checked programmatically). This is the only nudge spec-check sends back toward spec-intent for citation coverage; the goal is the by-test + by-agent ratio climbing over time.
Citation present — validate every citation's runner against the allowed set: pytest, vitest, jest, cargo, go, shell, agent. An unknown runner → classify the whole outcome unverifiable (unknown runner: <name>) and skip execution; do NOT silently fall back — silently downgrading would hide a typo in the citation. Then split each citation by runner family and apply the rules below.
A1. Process runners (pytest / vitest / jest / cargo / go / shell):
pytest:tests/ or pytest:. with no :: or -k filter, vitest: with no args, cargo: with no test-name, go: with no -run pattern). Classify the outcome unverifiable (whole-suite citation — one SHALL → one test). The same rule lives in spec-intent; this enforces it at check time.tests/x.py from pytest:tests/x.py::test_y). If the file does not exist on disk, classify the outcome fail (test not found at <path>) and skip execution. This is the pre-implementation signal — a missing cited test is evidence the work hasn't been done, distinct from a runner that crashed.spec-intent's "Test citations" section). Use a 60-second per-test timeout by default; if the citation needs longer, the user must move it to a shell: citation that handles its own timeout. Capture exit code, elapsed time, and the last ~20 lines of stdout+stderr.A2. Agent runner (agent:<path-to-prompt-file>):
;, |, &, backticks, $(). A citation that contains shell punctuation is classified unverifiable (malformed agent citation: <target>) — the agent runner takes a path, not a shell command.../ escapes, no absolute paths outside cwd, no symlinks pointing outside). A path outside the repo is classified unverifiable (agent prompt path outside repo: <path>). Recommended convention: specs/INTENT/I-N-<slug>/checks/<name>.md — co-located with the intent so the prompt travels with the spec.fail (agent prompt not found at <path>). Treated as fail, not unverifiable — parallel to the process-runner "test not found" greenfield signal.unverifiable (agent prompt empty at <path>). An empty prompt would force the subagent to guess what to check.specs/CONSTITUTION.md declares an allowed-runner whitelist and agent is not in it, classify unverifiable (constitution forbids runner: agent). Same mechanism as any other runner.--no-agents and --no-tests. If either flag is set, classify the citation as skipped (--no-agents) or skipped (--no-tests) and proceed to combine results.verdict: pass with 1–3 file:line citations and a reason → contributes a pass from this citation.verdict: fail with 1–3 file:line citations and a reason → contributes a fail.verdict: unverifiable with a reason → contributes unverifiable (agent: <reason>).unverifiable (agent reply malformed: <details>). Do NOT retry on malformed; one shot per check, per the "no flakiness retry" rule.unverifiable (agent timed out after 5 min).unverifiable (agent invocation error: <message>).Combining results across an outcome's citations (works for any mix of process + agent + skipped):
pass → outcome is pass. The strength-source label records the strongest signal that backed the verdict: if at least one contributing citation was a process runner, mark pass (test); otherwise (all-agent), mark pass (agent). Process runners always upgrade the label because a process runner is the strongest signal.fail → outcome is fail. Cite the failing runner in the report. If the failing runner is agent, include the subagent's reason and its file:line citations in the report excerpt. If both a process runner and an agent disagreed (one passed, the other failed), the failing source wins the label — disagreement is itself the signal.unverifiable or skipped (at least one was skipped, none was pass or fail) → outcome is unverifiable (skipped) or unverifiable (<reason>) — preserve the most specific reason.For each O-N, check intent-drift. Compare the most recent commit touching this intent's file (git log -1 --format=%ct -- specs/INTENT/I-N-<slug>/intent.md) against the most recent commit touching the code file you cited in step 4 (git log -1 --format=%ct -- <file>). Both sides are Unix timestamps (%ct), so the comparison is a single integer test — no date-string normalization needed. If the intent commit is strictly newer, flag intent ahead — the intent moved but the code didn't.
Compute the verdict counts (unverifiable and skipped are excluded from the total, per design):
verdict_outcomes_total = count(O-N classified as pass or fail). Unverifiable, skipped, and intent-ahead are excluded — the total reflects only what was actually graded.verdict_outcomes_passed = count(O-N classified as pass) (any source).verdict_outcomes_passed_by_test = count(O-N classified as pass where the strength-source label is test). Strictest signal.0 ≤ verdict_outcomes_passed_by_test ≤ verdict_outcomes_passed ≤ verdict_outcomes_total. If the count math ever violates this invariant, abort the writeback for this intent and surface a BUG: line in the report — never persist a broken ladder.verdict_checked_at = <now, ISO 8601 with Z>.Derive status:
complete iff verdict_outcomes_total > 0 AND verdict_outcomes_passed == verdict_outcomes_totalin_progress iff verdict_outcomes_passed > 0 AND verdict_outcomes_passed < verdict_outcomes_totaldraft otherwise (covers _total == 0 and _passed == 0)status is superseded, leave it untouched and skip the closed/verdict writeback for this intent (you only got here because --intent I-N named it explicitly).unverifiable (so _total == 0 while the body's ## Outcome section is non-empty), status rests at draft. This is a valid resting state, not an error — a context-only intent, one you and Claude read for context rather than mechanically grade. Frame it as an informational note, not a warning, so the path forward is discoverable without implying the user did something wrong. Emit in the per-intent report block: NOTE: I-N has outcomes but no [test: ...] citations, so it rests at draft — a context-only intent, a perfectly fine choice. To make it gradeable and let status climb toward complete, add a citation via /spec-intent refine; for a SHALL that can't be a process-runner test, cite an agent prompt: [test: agent:specs/INTENT/I-N-<slug>/checks/<name>.md]. Do not nag — surface the note once and move on.status flips to complete this run AND verdict_outcomes_passed_by_test < verdict_outcomes_total, the verdict ladder is incomplete. With Path B removed, every pass is either test-backed or agent-backed — nothing else exists. Emit a warning in the per-intent report block: WARNING: I-N reached complete with <K>/<T> outcomes test-backed; the remaining <T-K> are agent-backed. Consider promoting agent-backed outcomes to a real test where feasible via /spec-intent refine. Status still flips (the contract is _passed == _total), but the ladder gap is surfaced rather than hidden.Update closed:
complete (from anything else this run): set closed to today's ISO date (date, not full timestamp — humans read this).complete (regression): set closed: null.closed as-is.Write the updated frontmatter back to intent.md. Frontmatter only — never touch the body (everything after the closing --- is read-only here). Preserve key order, YAML formatting, and any unrecognized keys — other skills or future versions may add fields not enumerated in step 7's contract; leave them untouched rather than silently dropping them. Skip the writeback entirely if no field would change. Specifically: if the freshly-derived status, closed, verdict_outcomes_passed, verdict_outcomes_passed_by_test, and verdict_outcomes_total all match the existing values, do NOT update verdict_checked_at and do NOT rewrite the file. This prevents constant git churn from /spec-check runs that found no semantic change.
Run once per spec-check invocation (not per intent). For each principle, ask: does any part of the current code or any current EARS outcome (across all checked intents) violate this principle? Grep for the principle's keywords (e.g., a "static typing" principle ⇒ look for untyped surfaces). Classify each principle as pass, fail, or not applicable to this scope.
Print one combined report to stdout. Do NOT write the report to a file — drift reports are ephemeral and tied to a specific moment in time.
# spec-check report — YYYY-MM-DD
## I-1: <title> [status: in_progress, 3/5 outcomes passing, 1/5 by test]
### Code drift
- [x] O-1: <EARS text> — pass (test). `pytest tests/test_foo.py::test_bar` exit 0 in 0.42s.
- [ ] O-2: <EARS text> — fail (test). `pytest tests/test_foo.py::test_baz` exit 1 in 0.18s. Tail:
```
E assert latency_ms < 200
E AssertionError: 247 not < 200
```
- [x] O-3: <EARS text> — pass (agent). prompt: specs/INTENT/I-1-toggle/checks/error_copy_tone.md.
Reason: copy in src/ui/error.tsx:42 is concise and actionable.
Cited: src/ui/error.tsx:42, src/ui/error.tsx:51.
- [ ] O-4: <EARS text> — fail (agent). prompt: specs/INTENT/I-1-toggle/checks/help_text_wording.md.
Reason: help text omits the keyboard shortcut required by the SHALL.
Cited: src/ui/help.tsx:12.
- [?] O-5: <EARS text> — unverifiable (no test citation).
Flag: add a [test: ...] citation via /spec-intent refine.
### Intent drift
- O-2 — intent ahead. intent.md updated 2026-05-22; relevant code last touched 2026-04-30.
## I-2: <title> [status: complete, 5/5 outcomes passing, 5/5 by test]
### Code drift
- [x] O-1: ... — pass (test). ...
... (one block per checked intent)
## Constitution drift
- [x] P-9 (EARS notation) — pass.
- [ ] P-14 (static typing) — fail. `src/foo.ts` uses `any` at line 42.
## Summary
Intents checked: 2. Status changes this run: I-2 in_progress → complete (closed 2026-05-23).
Across all intents: <N>/<T> pass (<X by test>), <F fail>, <U unverifiable>, <D intent-ahead>.
Test-citation coverage: <P>/<T> outcomes have a [test: ...] marker (<pct>%).
Agent-runner usage: <A>/<T> outcomes cite agent: (<pct>%).
Next: /spec-intent refine I-1
The Next: line is conditional and follows the Handoff convention documented in spec-init. Emit it only when at least one outcome in the run was classified unverifiable for a reason that /spec-intent refine can fix (missing citation, vague EARS, unknown runner, whole-suite citation, constitution-forbidden runner, agent prompt empty, agent prompt path outside repo, malformed agent citation, agent reply malformed, agent invocation error). Pick the lowest-numbered affected I-N from the intents in this run's scope — if invoked with --intent I-K, the affected intent is always I-K (it is the only intent in scope), regardless of unverifiable outcomes that may exist in other intents from prior runs. Do NOT emit Next: for unverifiable (agent skipped), unverifiable (--no-tests), fail outcomes, intent-ahead drift, or constitution-principle failures — the first two are user-opted run-mode artifacts (the user passed the flag), and the rest resolve in code or via /spec-constitution amend — spec-check can't tell which; the user's judgment owns the next move. When the run is clean (zero fail, zero unverifiable), omit Next: entirely — silence is the terminal signal.
The bracketed status header per intent shows the newly derived status, the overall verdict ratio (_passed/_total), and the test-backed ratio (_passed_by_test/_total). The test-backed ratio is the strongest signal and the one to push toward 100%.
In the Summary, the <N>/<T> pass ratio aggregates across every intent in the run: <N> is the sum of verdict_outcomes_passed and <T> is the sum of verdict_outcomes_total (graded outcomes only — unverifiable, skipped, and intent-ahead are excluded from <T>, consistent with the per-intent total).
Each outcome line MUST mark its verdict source explicitly: pass (test), pass (agent), fail (test), fail (agent), unverifiable (no test citation), unverifiable (agent skipped), unverifiable (--no-tests), or unverifiable (<other reason>). A reader scanning the report should never wonder whether a verdict came from a deterministic test, from an LLM-graded check, or from no check at all.
Agent-backed verdicts MUST surface the subagent's reason: line and its cited: paths in the report, indented under the verdict line. The subagent reply itself is not persisted — no log file is written. The report is the only place this evidence is presented; users who need the full transcript should re-run spec-check.
--intent I-N rather than parallelizing — parallel test execution introduces ordering bugs that hide the very drift this skill is supposed to surface.spec-check was invoked in). If the project needs a different test cwd (monorepo subpackages, etc.), the citation MUST be a shell: runner that handles the cd itself. Do NOT auto-resolve cwd from the citation path — that's a guess, and guessing wrong silently passes the wrong test./spec-check. The constitution is the right place to document required env (Testing bucket).spec-intent won't enforce that, but the user will feel the pain quickly if they violate it.fail, not unverifiable. Exit codes are authoritative. If a test is flaky, the user must fix it or pin the seed — silently retrying would hide real regressions.unverifiable (agent timed out after 5 min), never fail.(process count × 60s) + (agent count × 300s); users who hit this should narrow with --intent I-N.specs/CONSTITUTION.md declares e.g. Testing: only pytest, agent citations are classified unverifiable (constitution forbids runner: agent) — same code path as forbidding any other runner. No special case.--no-agents and --no-tests. Both flags behave as documented in "Inputs"; --no-agents does not affect process runners.spec-check invokes the subagent via the Claude Code Agent tool, with subagent_type: general-purpose. The Agent tool is what unlocks the agent runner — without it in allowed-tools, the runner can't be invoked from inside the skill. The general-purpose subagent type is the right shape because the verifier needs read access to arbitrary files plus a working judgment loop; narrower subagent types like Explore are optimized for navigation reporting, which would bias the verdict toward "found stuff" rather than judging satisfaction of a SHALL.
The subagent is restricted to a read-only tool set via the invocation prompt: Read, Grep, Glob, WebFetch only. No Bash, no Edit, no nested Agent calls. The WebFetch grant exists so SHALLs that reference external resources ("the README MUST link to current API docs") can be verified — but the agent is instructed never to fetch unless the prompt file explicitly requires it.
For each agent citation, spec-check builds the subagent's prompt by concatenating these segments in order:
Read AGENT_PROMPT.template.md (sibling of this SKILL.md) for the prompt body. Substitutions: <N>, <intent title>, <K>, <EARS line>, <cwd>, <path-1>, <free-text hint>, <prompt-file-path>, <prompt file contents> are filled in by spec-check at invocation time. Keep the structure verbatim so the verdict parser below can find the spec-check-verdict fenced block.
After invocation, spec-check extracts the last spec-check-verdict fenced block from the reply, parses it as JSON, and validates against this schema:
unverifiable (agent reply malformed: no verdict block).unverifiable (agent reply malformed: invalid JSON).{verdict, reason, cited}. Extra or missing → unverifiable (agent reply malformed: key mismatch).verdict is one of "pass", "fail", "unverifiable". Otherwise → unverifiable (agent reply malformed: bad verdict value).reason is a non-empty string with no newlines (\n or \r). Otherwise → unverifiable (agent reply malformed: bad reason).cited is a JSON array of strings, each matching ^[^:]+:\d+$, each resolving inside the repo (no .. escape, no absolute outside cwd, no symlinks pointing outside). Otherwise → unverifiable (agent reply malformed: cited entry not file:line) or unverifiable (agent reply malformed: cited path outside repo).verdict = pass or fail, cited MUST have 1–3 entries. Otherwise → unverifiable (agent reply malformed: cited length must be 1..3 on pass/fail).verdict = unverifiable, cited MAY have 0–3 entries.A malformed reply is never retried and is never treated as fail — a runner that couldn't return a clean verdict is not evidence the SHALL is broken. The exact malformed-reason string is surfaced in the report so the user can fix the prompt or re-author it.
unverifiable (agent timed out after 5 min) to the outcome's combined result. Never classify timeout as fail.fail MUST cite a file:line or an explicit "searched X, found nothing". Generic "this doesn't seem right" findings violate the EARS contract — the whole point of EARS is that drift maps to a precise SHALL./spec-intent refine to refine it.fail (agent) outcome MUST surface the subagent's reason (one line) and 1–3 file:line citations in the report. A fail with no agent evidence is a malformed reply — classify the outcome unverifiable (agent reply malformed: missing evidence on fail) instead. The whole point of citing a real test or a real prompt is that the verdict is grounded in artifacts.specs/INTENT/ or empty directory — refuse to run and tell the user to invoke /spec-intent new first.specs/CONSTITUTION.md — proceed without the constitution-drift section, and note the omission in the report.## Outcome section or no EARS statements — skip it in the multi-intent loop with NO frontmatter writeback (the intent's frontmatter, including verdict_checked_at, is left untouched). Continue with the other intents and name the skipped one in the report with a self-critique flag suggesting /spec-intent refine.--intent I-N does not exist — refuse, list existing IDs and statuses.--- delimiters — skip that intent in the multi-intent loop with NO writeback, note the parse error in the report, and suggest /spec-intent refine (or manual repair). Never partial-write a file whose frontmatter couldn't be parsed.status: field (e.g., a hand-created or pre-refactor file) — treat as if status: draft, run the full check, and produce a normal writeback (which will add the missing field). Do NOT crash. Other unrecognized legacy keys are preserved verbatim per step 9.O-N becomes fail. For process-runner outcomes the reason is test not found at <path>; for agent outcomes the reason is agent prompt not found at <path>. Outcomes with no citation are unverifiable (no test citation), not fail — Path B is gone, so the absence of any citation can't produce a verdict at all. Status stays draft. That's a valid pre-implementation snapshot, not an error — the test paths and prompt paths in the citations double as a to-do list for the implementation.fail (agent prompt not found at <path>). Greenfield signal, parallel to "test not found".unverifiable (agent prompt empty at <path>). An empty prompt would force the subagent to invent the check, which would silently degrade to vibe-grading. Refuse it.../ escape, or symlink target outside repo) — classify unverifiable (agent prompt path outside repo: <path>). The convention is specs/INTENT/I-N-<slug>/checks/<name>.md; the prompt MUST live in the spec tree so it travels with the spec under version control.unverifiable (malformed agent citation: <target>). The agent runner takes a path, not a shell command; this prevents confusion with the shell: runner.unverifiable (agent reply malformed: <details>). Never silently retry; never silently treat as fail.unverifiable (agent timed out after 5 min). Same family as "test execution error".unverifiable (agent invocation error: <message>).intent.md — frontmatter writeback only.superseded intent's status — that field is set by /spec-intent supersede and is terminal.unverifiable with the precise reason — do NOT pretend a different runner was the plan.intent.md, or anywhere outside the prompt file itself. Agent prompts live in their own files; citations carry only the path.reason and cited lines are surfaced in the stdout report; nothing is written to a log file or a per-intent ledger.npx claudepluginhub jasonlo/lite-spec --plugin lite-specGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.