From ngmeyer-skills
All-in-one skill for CLAUDE.md files. Two modes: `audit` finds drift (claimed facts no longer matching code), leaked secrets, duplicates, instruction-budget bloat, and prescriptive-vs-descriptive imbalance across all CLAUDE.md files in your projects. `improve` measures one CLAUDE.md against Anthropic's official best practices (200-line budget, removability test, emphasis tuning, 3-tier hierarchy) plus community-validated guidance, then proposes concrete rewrite diffs applied only after user approval. Default behavior auto-detects: in a project with a CLAUDE.md → improve mode; otherwise → audit all. Use when: 'audit claude.md', 'check claude md drift', 'lint CLAUDE.md', 'claude md audit', 'refresh instructions', 'improve CLAUDE.md', 'restructure CLAUDE.md', 'tune CLAUDE.md', 'apply CLAUDE.md best practices', 'is my CLAUDE.md good', 'CLAUDE.md is too long', 'rebalance CLAUDE.md', or quarterly as a hygiene check.
Slash command: `/ngmeyer-skills:claude-md [improve|audit] [project name | 'all' | path to CLAUDE.md]` (defaults to auto-detect)
> **Naming exception (documented):** skillforge's checklist forbids `claude` or `anthropic` in a skill name. This skill is a deliberate exception: it operates on the literal `CLAUDE.md` artifact, so the precision of the name is the value. Treat this as the only such exception in the library.
Two modes, one skill. Audit finds problems across many CLAUDE.md files (hygiene). Improve rewrites one file against best-practice rubric (structure). Both ground in the same core principle.
The removability test: "For every line in CLAUDE.md, ask — if I removed this, would Claude make a mistake? If not, remove it." — Anthropic, Claude Code Best Practices
This is the unifying rule. Audit mode flags lines that fail the test as drift / waste / duplicates. Improve mode proposes their removal. Same diagnostic, different scale.
A second invariant cuts across both modes: CLAUDE.md is tracked in git and visible to every agent that opens the repo. Pasted secrets are durable leaks. Both modes treat secret-leak detection as a P0 finding.
| Invocation | Mode | What runs |
|---|---|---|
| `/claude-md` (no args, in a project with `./CLAUDE.md`) | auto → improve | Improve the local CLAUDE.md |
| `/claude-md` (no args, no local file) | auto → audit | Scan all CLAUDE.md files under your projects roots (e.g. `~/Projects`) |
| `/claude-md improve [path]` | improve | Single-file structural rewrite (10-rule rubric) |
| `/claude-md audit [project\|all\|path]` | audit | Hygiene scan: secrets P0, drift, duplicates, budget |
If the requested mode does not fit the context (e.g., `/claude-md improve` in a folder with no CLAUDE.md), the skill asks for clarification.
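The dispatch in the table above can be sketched as a small shell function. This is only an illustration under stated assumptions: `resolve_mode` and its `clarify` return value are hypothetical names, and the optional path argument to `improve` is ignored for brevity.

```shell
# Hypothetical helper (not part of the skill): mirrors the invocation table.
# Usage: resolve_mode <requested-mode-or-empty> <directory>
# Prints one of: improve, audit, clarify
resolve_mode() {
  local requested="${1:-}" dir="${2:-.}"
  case "$requested" in
    "")
      # No args: improve when a local CLAUDE.md exists, otherwise audit all.
      if [ -f "$dir/CLAUDE.md" ]; then echo "improve"; else echo "audit"; fi
      ;;
    improve)
      # Improve needs a target file; without one, ask the user what they meant.
      if [ -f "$dir/CLAUDE.md" ]; then echo "improve"; else echo "clarify"; fi
      ;;
    audit)
      echo "audit"
      ;;
    *)
      # Unknown mode word: ask rather than guess.
      echo "clarify"
      ;;
  esac
}
```

Calling `resolve_mode "" "$PWD"` in a project root would print the mode the auto-detect rule selects for that directory.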
AGENTS.md (V2)

AGENTS.md is the cross-agent standard (Linux Foundation / Agentic AI Foundation; read by Codex, Cursor, Gemini, Copilot, and 30+ tools). Everything here (the rubric, the secret scan, drift detection, the budget) applies to AGENTS.md identically. When both exist, audit both and flag duplication: the right pattern is one source of truth (AGENTS.md) with CLAUDE.md as a thin alias/@import, not two drifting files. Treat a repo's nearest-scoped AGENTS.md (monorepos nest them) the same way.
Measure one CLAUDE.md against Anthropic's official best practices + community-validated guidance, then propose concrete rewrite diffs. Apply only after user approval.
One empirical caveat (Claude Code specifically). When the target file is short (<100 lines) and already aligned with Karpathy-style rules, adding more rules can regress Claude Code quality. Augment Code (April 2026) tested Karpathy rules across Auggie, Claude Code, and Codex on 40 OpenClaw PRs: speed and cost improved on all three (−3% to −8% on duration and tool calls), but Claude Code quality dropped 0.07 overall (correctness −0.07, completeness −0.06, code reuse −0.05). Auggie and Codex were stable. Their hypothesis: Claude Code's system prompt already encodes similar constraints; further layering reduces exploration. When the rubric scores 9/10 or 10/10 on a Claude Code target, the default recommendation is trim, not add.
Ten checks, each backed by a primary source. Every recommendation cites the rule.
| # | Rule | Source |
|---|---|---|
| R1 | Length under 200 lines. Longer reduces adherence. | Anthropic — How Claude remembers your project |
| R2 | Removability test on every line. | Anthropic — Claude Code Best Practices |
| R3 | Specificity: instructions concrete enough to verify. No "be a senior engineer." | Anthropic — Memory docs |
| R4 | Emphasis on load-bearing rules. IMPORTANT: / YOU MUST / NEVER. Use sparingly — if every rule is IMPORTANT, none are. | Anthropic — Best Practices |
| R5 | Markdown structure: headers + bullets. Not dense paragraphs. | Anthropic — Memory docs |
| R6 | 5 canonical sections present (Commands, Architecture, Rules, Workflow, Out-of-scope). | Community consensus + Anthropic /init template |
| R7 | Hard Rules section ≤15 items. Beyond that, rules drop. | Community (zodchiii thread, 1.3M views) |
| R8 | 3-tier hierarchy used correctly. Across files: global rules in ~/.claude/CLAUDE.md, project in ./CLAUDE.md (git), personal in ./CLAUDE.local.md (gitignored). No duplication across tiers. Within a file: order by priority — hard non-negotiables at top, context-dependent rules middle, references/conveniences bottom. The instruction budget compresses lower-priority items first. | Anthropic — Best Practices; Fraser (Medium, May 2026) |
| R9 | Path-scoped rules in .claude/rules/*.md when instructions only apply to certain files. | Anthropic — Advanced Patterns PDF |
| R10 | No content auto memory will capture, and no standard-tool documentation. Don't waste lines on stack details Claude figures out from package.json, nor on standard tools Claude already knows (git, gh, npm, pnpm, bun, cargo, python, node, tsc, eslint, prettier, make). Document only custom wrappers or non-obvious project-specific invocations. | Community + Anthropic auto-memory docs; Fraser (Medium, May 2026) |
Also check:

- `~/.claude/CLAUDE.md` (if improving a project file)
- `./CLAUDE.local.md` (if present)
- `./.claude/rules/*.md` (path-scoping in use?)

Run the 10-rule rubric programmatically where possible:
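LINES=$(wc -l < "$TARGET")                        # R1: total line count (budget: 200)
grep -c -E 'IMPORTANT|YOU MUST|NEVER' "$TARGET"   # R4: lines carrying an uppercase emphasis marker
grep -c '^#' "$TARGET"                            # R5: number of markdown header lines
# R10: cross-reference against package.json / pyproject.toml / Cargo.toml / etc.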
Produce a per-rule scorecard with line-level evidence.
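As a rough sketch of how two of these checks could feed the scorecard, the functions below cover R1 (length) and R6 (canonical sections). The function names and the `PASS`/`FAIL` output format are assumptions made for illustration, not the skill's actual output.

```shell
# Hypothetical helpers: check_r1 and check_r6 are illustrative names.

# R1: file length against the 200-line budget.
check_r1() {
  local target="$1" lines
  lines=$(wc -l < "$target" | tr -d ' ')   # strip padding some wc builds add
  if [ "$lines" -le 200 ]; then
    echo "R1 PASS ($lines lines)"
  else
    echo "R1 FAIL ($lines lines, budget 200)"
  fi
}

# R6: presence of the five canonical sections as markdown headers.
check_r6() {
  local target="$1" section missing=""
  for section in Commands Architecture Rules Workflow; do
    grep -qiE "^#+ +$section" "$target" || missing="$missing $section"
  done
  # "Out-of-scope" may be written with spaces or hyphens.
  grep -qiE '^#+ +Out[ -]of[ -]scope' "$target" || missing="$missing Out-of-scope"
  if [ -z "$missing" ]; then
    echo "R6 PASS"
  else
    echo "R6 FAIL (missing:$missing)"
  fi
}
```

Each function prints one scorecard line, so the results can be concatenated into the per-rule report.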
Three categories of change:
Deletions (lines failing R2 or R10):
- DELETE line N: "[content]"
  Reason: [R2: would Claude actually make a mistake without this?]
  OR: [R10: auto memory captures this from package.json]
Additions (missing canonical sections per R6):
+ ADD section "## [Section Name]" with:
  [proposed content based on actual project: read package.json, README, .git/config]
Rewrites (specificity R3 + emphasis R4):
~ REPLACE line N: "[vague content]"
  WITH: "[concrete, verifiable rewrite]"
  Reason: [R3: was vague | R4: high-impact rule needs IMPORTANT prefix]
## CLAUDE.md Improvement Proposal
**File:** [path]
**Current size:** N lines (R1 budget: 200)
**Rubric pass rate:** M of 10
### Summary
- DELETE: K lines
- ADD: L sections
- REWRITE: J lines
- Net change: ±N lines (final: M lines)
### Proposed Diff
[full diff with reasons]
Apply these changes? (yes / partial / no)
Use the Edit tool for surgical changes — never overwrite the whole file with Write. Apply one change at a time. After all changes:
- Suggest: "Run `/claude-md audit` afterward to verify no drift introduced."
- Suggest moving path-scoped rules to `.claude/rules/<topic>.md` (R9).

A CLAUDE.md scoring well on R6 contains these sections. Suggest creating any that are missing:
- `## Project` (1–2 lines: what this is, who uses it)
- `## Stack` (1–3 lines: framework, language, deployment target)
- `## Commands` (Build / Dev / Test single / Test all / Lint / Type check; short, exact)
- `## Architecture` (folder → purpose mapping; not full directory listing)
- `## Rules` (under 15 items; negative rules count; emphasis on the load-bearing one)
- `## Workflow` (how the user wants Claude to approach tasks: minimal changes, ask vs act, commit conventions)
- `## Out of scope` (files/integrations Claude should not touch)

A menu to draw from when a file is genuinely missing a scope-control or safety rule. Subject to the trim-not-add caveat above: on an already-aligned Claude Code target scoring 9–10/10, do not bulk-add these, because Claude Code's system prompt already encodes most of them and layering regresses quality (Augment Code, April 2026). Add the one or two that close a real, observed gap; skip the rest.
Scope & safety (highest leverage: prevent expensive, hard-to-revert mistakes):

- Only modify files, functions, and lines directly related to the current task. Do not refactor, rename, or reformat anything I did not ask you to change. Note other issues at the end; don't touch them.
- Before any change that significantly alters existing content (rewriting sections, restructuring, changing tone): stop, describe what you're about to change and why, wait for confirmation.
- Before deleting a file, overwriting code, dropping records, or removing dependencies: stop, list what will be affected, ask for explicit confirmation in the current message. "You mentioned this earlier" is not confirmation.
- Production hard-stops requiring in-session confirmation: deploys/pushes, migrations or schema changes, outbound API calls, any command with irreversible side effects. (If a stop must hold 100% of the time, a PreToolUse hook is the real enforcement; CLAUDE.md compliance ceilings around 80%.)
- After any coding task, end with: files changed, one-line summary per file, files intentionally not touched, follow-up needed.
- NEVER commit .env files or secrets
- NEVER run git push --force without explicit confirmation

Workflow:

- IMPORTANT: run type check after every code change
- Make minimal changes, don't refactor unrelated code
- Create separate commits per logical change, not one giant commit
- When unsure between two approaches, explain both and let me choose
- For architecture decisions or non-trivial features: work through the problem step by step before writing code. Show reasoning and where you're uncertain, then implement.

Communication (lowest marginal value on Claude Code; its system prompt already does most of this, so add only if you observe the specific failure):

- Match response length to task complexity. Don't pad with restatements of the question or closing summaries.
- If uncertain about a fact, statistic, date, or technical detail, say so explicitly rather than filling the gap with plausible-sounding content.

Wording sourced from the Karpathy-derived rule compilations (Fraser, May 2026; "Dep" thread, May 2026). Their headline stats ("65% → 94% accuracy", per-developer dollar figures, star counts) are illustrative marketing, not measured results; cite the rule wording, not the numbers.
Avoid:

- `@`-imports of full README / huge docs (they enter the context window at launch)
- `/init`-and-forget, or pasting an LLM-generated context file. Controlled study: a curated context file gives ~+4pp task quality at ~20% token overhead, but an auto-generated one reduces task success ~0.5–2% while raising cost 20–23%. Curate ruthlessly; more context is not better. (arXiv 2026, AGENTS.md efficiency study.)

Lint and audit CLAUDE.md files across all projects. Flags drift (claimed facts no longer matching code), leaked secrets, duplicate blocks, descriptive-vs-prescriptive line balance, and files approaching the 150–200 instruction budget.
CLAUDE.md files rot. In practice, a portfolio audit will routinely surface several projects with drifted CLAUDE.md — and occasionally one with a leaked secret. This mode catches both shapes in one pass.
Find every CLAUDE.md under your project roots (adjust to where you keep code, e.g. ~/Projects, ~/work):
find ~/Projects ~/work -name 'CLAUDE.md' \
-not -path '*/node_modules/*' -not -path '*/.venv/*' -not -path '*/vendor/*' 2>/dev/null
For each CLAUDE.md, grep for high-confidence secret patterns. This phase runs before any other audit because a leak is a P0 finding that interrupts the rest of the flow.
Patterns to flag:
- `(?i)(secret|token|api[_-]?key|password)\s*[:=]\s*['"]\S{8,}['"]`
- `(?i)auth[_-]?secret\s*[:=]\s*\S{8,}`
- `eyJ[A-Za-z0-9_-]{10,}`
- `https?://[^:/\s]+:[^@\s]+@`
- `sk_live_`, `sk_test_`, `pk_live_`, `sk-ant-`, `sk-proj-`
- `postgres(ql)?://[^:]+:[^@]+@`

If any match: mark file 🔴 P0 LEAK. Report the line, pattern matched, required remediation: rotate the secret in source system, edit CLAUDE.md to reference env var names only, scrub git history (`git filter-repo --path CLAUDE.md --invert-paths` or BFG).
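The patterns above are written in PCRE style, which plain `grep -E` does not fully accept. A minimal sketch of the same scan with POSIX extended regexes follows: `(?i)` becomes `grep -i`, and `\s` / `\S` become `[[:space:]]` / `[^[:space:]]`. The function names and the pattern labels (`keyed-secret`, `jwt`, and so on) are assumptions introduced here for illustration.

```shell
# Hypothetical helpers (not the skill's actual implementation).

# report_hits <file> <pattern-name> <ERE> [extra grep flag]
# Prints "line-number:pattern-name" per hit; never prints the matched value.
report_hits() {
  local file="$1" name="$2" regex="$3" flag="${4:-}" n rc=1
  for n in $(grep -n -E ${flag:+"$flag"} -- "$regex" "$file" | cut -d: -f1); do
    echo "$n:$name"
    rc=0
  done
  return "$rc"
}

# scan_secrets <file>: returns 0 if any pattern matched (P0 LEAK), 1 if clean.
scan_secrets() {
  local f="$1" leak=1
  local sp='[[:space:]]' ns='[^[:space:]]' q="['\"]"
  report_hits "$f" keyed-secret "(secret|token|api[_-]?key|password)${sp}*[:=]${sp}*${q}${ns}{8,}${q}" -i && leak=0
  report_hits "$f" auth-secret  "auth[_-]?secret${sp}*[:=]${sp}*${ns}{8,}" -i && leak=0
  report_hits "$f" jwt          'eyJ[A-Za-z0-9_-]{10,}' && leak=0
  report_hits "$f" url-creds    'https?://[^:/[:space:]]+:[^@[:space:]]+@' && leak=0
  report_hits "$f" vendor-key   'sk_live_|sk_test_|pk_live_|sk-ant-|sk-proj-' && leak=0
  report_hits "$f" postgres-url 'postgres(ql)?://[^:]+:[^@]+@' && leak=0
  return "$leak"
}
```

Reporting only the line number and pattern label keeps the secret itself out of the audit report, in line with the "value redacted in report" convention used in the report template.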
For each claim CLAUDE.md makes, verify against real code:
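- MCP servers: check `settings.json` / `.mcp.json` / `mcp.json`
- API routes: check `src/app/api/` or the routes file
- Cron jobs: check `*/cron*` files / `vercel.json` crons / `wrangler.toml` triggers
- Env vars: each must appear in `.env.example` or be read by code (`process.env.<NAME>` / `os.environ["<NAME>"]`)
- Libraries and integrations: check the actual imports (e.g. a claimed SMTP integration where the code uses the `@getbrevo/brevo` REST client)
- Directory layout: check against `ls` output

For each drift finding: report the CLAUDE.md line, the actual code fact, and the one-line edit that fixes it.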
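One of these checks, env vars against `.env.example`, can be sketched in a few lines. This rests on a loud assumption: that CLAUDE.md names env vars as backticked ALL_CAPS tokens containing an underscore. The function name `env_drift` is hypothetical, and the sketch does not cover the second acceptance path (the variable being read by code).

```shell
# Hypothetical helper: env_drift <CLAUDE.md> <.env.example>
# Prints each backticked ALL_CAPS name found in CLAUDE.md that
# .env.example does not define. Empty output means no drift found.
env_drift() {
  local claude_md="$1" env_example="$2" name
  # Assumption: env vars look like `AUTH_SECRET` (caps, at least one underscore).
  grep -o -E '`[A-Z][A-Z0-9]*(_[A-Z0-9]+)+`' "$claude_md" | tr -d '`' | sort -u |
  while IFS= read -r name; do
    grep -q -E "^(export +)?${name}=" "$env_example" || echo "$name"
  done
}
```

A name printed here is a candidate finding only; before reporting drift, it would still need the code search for `process.env.<NAME>` or `os.environ["<NAME>"]` described in the list.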
- Duplicate headings (`^## `, `^### ` appearing ≥2 times)
- More than 200 instructions: 🟠 over budget; suggest splits or cuts
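The duplicate-heading check can be approximated with standard text tools. The sketch below is illustrative only; `duplicate_headings` is a hypothetical name, and it compares heading lines exactly, so headings that differ only in trailing whitespace are treated as distinct.

```shell
# Hypothetical helper: duplicate_headings <file>
# Prints each "## " or "### " heading that occurs at least twice,
# followed by the line numbers where it appears.
duplicate_headings() {
  local target="$1" heading
  grep -E '^###? ' "$target" | sort | uniq -d |
  while IFS= read -r heading; do
    printf '%s: lines %s\n' "$heading" \
      "$(grep -n -x -F -- "$heading" "$target" | cut -d: -f1 | paste -sd, -)"
  done
}
```

The output shape matches the "heading appears at lines N and M" style of finding used in the audit report.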
CLAUDE.md instructions are advisory — community evidence puts compliance around 70–80%. Rules phrased as absolute commands fail this ceiling silently. Scan for deterministic-sounding rules and flag them for hook conversion.
Patterns to flag:
- `NEVER`, `Never`, `Always`, `MUST`, `Do not` followed by a verb
- `rm -rf`, `git push --force`, `git reset --hard`, `DROP TABLE`, `truncate`, `vercel --prod`, `gcloud ... delete`

For each match: report the line, classify as hook-candidate, and identify the right hook type:

- `PreToolUse` hook returning exit code 2 to block
- `PreToolUse` on `git commit` or `PostToolUse` on file Edit/Write
- `PostToolUse` hook

Report wording: "This rule reads as deterministic but lives in advisory territory. Convert to a hook in .claude/settings.json for 100% enforcement; keep the CLAUDE.md line as documentation of the hook's intent, or remove it." Cite Anthropic hooks docs and the ~80% advisory ceiling.
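A first-pass scan for hook candidates might look like the sketch below. It is deliberately crude: `grep` cannot tell whether an absolute word is "followed by a verb", so it flags any line containing one of the listed words followed by a space, or any of the listed destructive commands, and leaves classification to the reviewer. The function name `hook_candidates` is hypothetical.

```shell
# Hypothetical helper: hook_candidates <file>
# Prints "line-number:text" for every line that sounds deterministic.
hook_candidates() {
  local target="$1"
  grep -n -E \
    -e '(NEVER|Never|Always|MUST|Do not) ' \
    -e 'rm -rf|git push --force|git reset --hard|DROP TABLE|truncate|vercel --prod|gcloud .* delete' \
    -- "$target"
}
```

Each flagged line then gets the hook-type classification and the report wording described above.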
Output grouped by severity, per project:
# CLAUDE.md Audit Report — <timestamp>
## Summary
- 6 CLAUDE.md files scanned
- 1 P0 LEAK (acme-web)
- 2 projects with drift (data-pipeline, saas-app)
- 0 over budget
- 3 projects clean
---
## acme-web 🔴 P0 LEAK
**File:** ~/Projects/acme-web/CLAUDE.md (142 instructions)
### Leaks (P0 — rotate before any other work)
- Line 38: `AUTH_SECRET=...` (value redacted in report)
- Fix: replace with `The AUTH_SECRET env var is required; see .env.example`
- Rotate the secret in production, then `git filter-repo --path CLAUDE.md --invert-paths`
### Drift
- Line 52: claims "Brevo SMTP" — code uses REST (`@getbrevo/brevo` at src/lib/email.ts:4)
- Fix: change to "Brevo REST API via @getbrevo/brevo"
### Duplicates
- "## Email" heading appears at lines 50 and 89
After the report, ask: "Apply the suggested edits for <project>? [y/N]". Apply only after confirmation. Do NOT apply P0 LEAK fixes automatically: they require secret rotation and a git-history scrub that the user must drive.
A run passes if every check is true. Otherwise rewrite the offending recommendation.
- Proposed additions are grounded in real project files (`package.json`, `pyproject.toml`, etc.), not generic placeholders
- Read `package.json` / `pyproject.toml` / `Cargo.toml` / `Gemfile` / `README.md` to ground suggestions in actual project data.
- Never move rules to `.claude/rules/` automatically. Suggest it; the user decides.
- Use `$TARGET`, `$HOME`, `$1`. Never `/Users/<name>/...`.

Optimized via `skillforge optimize` (outcome research on AI-native repo / AGENTS.md best practices).
| Skill | When |
|---|---|
| `/claude-md improve` | One file, structural rewrite for best-practice alignment |
| `/claude-md audit` | Many files, hygiene / drift / secret scan |
| `/init` (Anthropic built-in) | Generate a starter CLAUDE.md from current project state |
| `compound-engineering:ce-compound-refresh` | Same spirit, different target: refreshes `docs/solutions/` |
Run the structural eval:
bash tests/eval.sh
To verify behavior end-to-end:
- `/claude-md improve` in any project: verify rubric scorecard cites every R1–R10, every change cites its rule, "no" reply writes nothing, "yes" applies surgically
- `/claude-md audit all`: verify P0 LEAK section appears first when leaks present, drift findings cite specific code locations, instruction budget reported per file
- `/claude-md` (no args) in a project with CLAUDE.md → improve runs; in ~/ with no local file → audit runs
- `/claude-md audit ./CLAUDE.md` → single-file audit (subset of audit, single target)