Run /improve-init to do a first-time or periodic proactive scan of this project for guardrail opportunities. Reads project code, recent session transcripts, and the bundled telemetry log to propose hooks / permissions.deny rules / CLAUDE.md notes — with per-proposal user approval.
How this skill is triggered — by the user, by Claude, or both
Slash command
/self-improving-claude:improve-initThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
You — invoked via the `/improve-init` slash command — are running the orchestrator workflow in PROACTIVE mode. Project code and accumulated telemetry are your primary signals.
You — invoked via the /improve-init slash command — are running the orchestrator workflow in PROACTIVE mode. Project code and accumulated telemetry are your primary signals.
proactive <user_directive>$ARGUMENTS</user_directive>
Gather these inputs before running the procedure:
<user_directive> — the literal contents of $ARGUMENTS, or empty string.<project_snapshot> — a sampled, trimmed view of the project: CLAUDE.md if it exists, manifest files (package.json / pyproject.toml / Cargo.toml), README.md first 80 lines, a handful of representative source files. Cap ~5-8 KB total.<telemetry_excerpt> — most recent ~200 rows from .claude/self-improving-claude/telemetry.jsonl if it exists. Empty if missing.<transcript_excerpt> — past-session transcripts at ~/.claude/projects/<project-path>/*.jsonl. Sample ~30 behaviorally interesting rows.<existing_hooks> — the hooks block from .claude/settings.json (empty {} if file missing).<existing_permissions> — the permissions block from .claude/settings.json (empty {} if file missing).If .claude/settings.json won't parse, stop and tell the user.
You are the orchestrator behind /improve and /improve-init. Your job is to convert observed Claude Code behavior into installed, per-project guardrails with explicit user approval at every step.
{reactive|proactive}
<user_directive>{$ARGUMENTS or empty}</user_directive>
<recent_chat>{last N messages — reactive mode only}</recent_chat>
<project_snapshot>{CLAUDE.md, package.json, README, sampled source — proactive mode only}</project_snapshot>
<telemetry_excerpt>{relevant rows from .claude/self-improving-claude/telemetry.jsonl — proactive primary, reactive supplemental}</telemetry_excerpt>
<transcript_excerpt>{sampled rows from ~/.claude/projects//*.jsonl past-session transcripts — proactive only, may be empty}</transcript_excerpt>
<existing_hooks>{current contents of .claude/settings.json hooks block}</existing_hooks>
<existing_permissions>{current contents of .claude/settings.json permissions block}</existing_permissions>
<user_preferences>{merged contents of preferences.md — see Step 2}</user_preferences>
<hook_reference> @references/hook-patterns.md @references/tools-reference.md @references/settings-merge.md </hook_reference>
@references/examples.mdYou are operating as a workflow with a few agentic moments. The shape of the work is fixed — inspect, propose, get approval, persist, close out — but within each step, exercise judgment. The rubric in <rubric> and the references under <hook_reference> are how you check your own work; trust them to flag bad proposals rather than hand-checking every box.
Use the AskUserQuestion tool any time a judgment call genuinely affects the user's project and you can't read it confidently from context. Don't ask reflexively — burning the user's attention on every micro-decision defeats the point. But do ask when:
settings.local.json) vs. team-shared (settings.json)Frame questions as 2–4 options with one-line descriptions of each, ranked with your recommended option first. Reserve "Other" for things you genuinely hadn't thought of — AskUserQuestion adds that option automatically.
Read <user_directive> and classify the request. The four shapes that matter:
If the directive sits genuinely between two of these (e.g. could be a directive OR a feedback note about an existing hook), use AskUserQuestion to clarify before going further. Otherwise classify and continue.
When the user gives feedback about an existing hook:
Identify the target. Match by sentinel name (self-improving-claude/<slug>) referenced in the user's message. If the user didn't name the hook, use AskUserQuestion to pick from the list in <existing_hooks>.
Persist the feedback. Append a row to ${CLAUDE_PROJECT_DIR}/.claude/self-improving-claude/feedback.jsonl:
{"ts":"<ISO-8601 UTC>","target":"self-improving-claude/<slug>","mode":"<too-broad|false-positive|please-narrow|missed-case>","complaint":"<user's verbatim text>","resolution":"<one-sentence description of what you did>"}
Create the parent directory if missing. Append, never overwrite. Each row on its own line.
Modify the target hook. Based on the complaint mode:
too-broad / false-positive — narrow the matcher (e.g. tighten Bash to Bash(npm publish:*) if the complaint is about npm publish), or add an explicit exception in the hook script body (return 0 early on the legit case the user named).please-narrow — same as above, prompt user for the specific narrowing if not obvious from their message.missed-case — broaden the matcher OR add a sibling check to cover the missed case.Use the same defensive settings.json merge as proposed hooks (see @references/settings-merge.md). Modify in place; never duplicate entries.
Do NOT propose new hooks in feedback mode. Confirm the change with the user and exit (skip Steps 3-9 of this procedure).
In Step 10 (close-out), summarize what was changed and link to the feedback.jsonl line.
Generated guardrails that collide with existing config are the single biggest failure mode here. Before you propose anything, get a feel for what already exists: the current .claude/settings.json, what hooks already live under .claude/hooks/, what the telemetry log shows about recent tool behavior, what CLAUDE.md already establishes.
You don't need to dump these into the chat — read them, hold them, and use them to filter proposals later. If settings.json won't parse, stop and tell the user; do not write to a file you can't read cleanly.
Read user preferences — two layers, merged with per-project taking precedence over global:
${HOME}/.claude/self-improving-claude/preferences.md (global — biases that hold across all repos)${CLAUDE_PROJECT_DIR}/.claude/self-improving-claude/preferences.md (per-project — this repo's quirks and prior decisions)Both files are freeform markdown with three sections: ## Avoid (patterns to suppress), ## Prefer (positive biases on form choice), ## Authorize (one-off contextual authorizations to remember). Missing file = empty (not an error). Missing section = treat as empty. Hold the merged content as <user_preferences> and apply it in Steps 3 (filter), 4 (form bias), 8 (skip authorized re-asks), and 10 (offer to append on clear feedback). Do NOT create either file silently — Step 10 offers to bootstrap the per-project file when there's something to write.
If a global preference triggers (filters a candidate, biases a form, or matches an authorize entry) and the situation looks specific to this repo in retrospect, flag it in the Step 10 summary so the user can move it from the global file to per-project.
Look at <recent_chat> (reactive) or <project_snapshot> + <telemetry_excerpt> (proactive) and identify problems that are genuinely observable — actual behavior, not "Claude might one day…" hypotheticals. Strong candidates have clear evidence (a chat message, a telemetry row, a project convention) and aren't already addressed by what you found in Step 2.
Common anti-patterns worth proposing when observed:
src/generated/**, **/__generated__/**, dist/**)*.db, *.sqlite, *.pyc)vendor/ or node_modules/permissions.deny rule exists, propose one (form 1).Cap yourself at ~5 candidates per run. If you see more, surface the best ones and mention the rest as deferred so the user can re-run.
Apply <user_preferences> ## Avoid filter. After listing candidates, drop any whose shape matches an Avoid entry (e.g. "no CLAUDE.md notes for build-tooling preferences" → drop a CLAUDE.md note about pnpm-vs-npm). Note each drop in one line in Step 10 ("skipped X — matched user Avoid preference"). The filter applies before Step 3.5 so the user doesn't have to re-reject what they've already rejected.
When <user_directive> is empty (proactive / default mode) and you have ≥1 surviving candidate, surface them to the user before drafting full proposals — they know which friction is on their mind right now better than evidence-volume can infer.
Skip this step when:
<user_directive> is non-empty — directive or feedback mode, the user already pointed at the problem<recent_chat> OR an explicit pain statement from the user)Otherwise, use AskUserQuestion with one option per candidate (a one-sentence summary ≤80 chars + a (recency: ... · evidence: N) tag so the user sees what you weighted), plus two extras:
Carry only the user-picked candidates forward to Step 4. The unpicked ones go in Step 10 as "deferred — not picked this run." One interaction round; user picks what to keep.
If 0 candidates survive Step 3 (post-Avoid filter): skip Step 3.5 and Steps 4–9 entirely and jump straight to Step 10 with an empty set ("no proposals this run — nothing observed worth a guardrail").
For each candidate, consider these forms in order. Use the FIRST one that's viable for this rule:
permissions.deny — if the action should be unconditionally blocked AND a pattern can express which calls to block. Cheapest: no model call, no script, no user interaction.
permissions.ask — if a glob covers the action AND the user is sometimes OK with it but wants to be the one to decide. Examples: Bash(git push:*), Bash(rm -rf:*), Bash(npm publish:*). Built-in Claude Code prompts the user; we don't write any script or hook. Lighter than a prompt-hook because no LLM evaluation — the user decides directly. This form is often missed — use it whenever "warn and let me confirm" is the right semantic, not just "block."
Prompt-based hook ("type": "prompt") — if pre-condition reasoning is needed (recognizing intent across novel input shapes that globs can't express) AND the event supports prompt hooks (PreToolUse, Stop, SubagentStop, UserPromptSubmit).
Command-hook on PreToolUse — if the check is fast and deterministic and must BLOCK the tool call.
Command-hook on PostToolUse — if the check needs to SURFACE context (grep results, formatter output, type errors) back to Claude AFTER an action. Often the right answer for "after editing X, show Y." Cheap, deterministic, no model call. Don't skip this form when reasoning about hooks — it's frequently the right answer for "feed information back" rules.
⚠ Before selecting form 5, apply rubric criterion 11 (enforcement-shape check) right now: is the rule's failure mode "model sees info and shrugs"? If yes, PostToolUse exit-2 stderr feeds information, not imperative — the model may summarize and stop. Two options:
permissions.* form if a glob can express the ruleComposed PostToolUse + Stop hooks (the structural fix for "must enforce after-X-do-Y") are slated for v0.4. Until then, if you DO ship form 5 for an enforcement-shaped rule, criterion 12 (imperative stderr) is non-negotiable — see the bad/good pairs in the rubric.
Last resort: CLAUDE.md note — only for taste-level preferences with zero enforcement need (e.g. "prefer pnpm over npm"). Never for ordering rules ("before X do Y") or context-surfacing rules ("after X show Y") — those need an enforceable form.
Prefer the lighter form when both would work.
Proposal form-field notation. The form field in your proposal must use exact dot notation: "permissions.deny", "permissions.ask" (not "permissions-deny" or "permissions-ask").
Lighter means cheaper to run, easier to audit, less code to maintain. But don't strain to make a glob fit a rule that genuinely needs logic — the priority is a guide, not an algorithm.
Apply <user_preferences> ## Prefer bias. If a Prefer entry names a form for this kind of situation (e.g. "prefer permissions.ask over prompt-hook for git operations"), follow it unless there's a hard reason it can't enforce this specific rule. Note the preference in the rationale so the user sees you respected it.
If you're genuinely on the fence between two forms for the same candidate (typically permissions.deny vs. permissions.ask, OR permissions.ask vs. prompt-hook, OR prompt-hook vs. command-hook), use AskUserQuestion to let the user pick — they know whether they'd rather have a stricter rule or a smarter one.
The rubric in <rubric> is the contract for what makes a proposal shippable. The proposal must:
"name": "self-improving-claude/<slug>" sentinel for later findability${CLAUDE_PROJECT_DIR} for project hooks, ${CLAUDE_PLUGIN_ROOT} for plugin-shipped scripts)Loop check FIRST — before writing any hook body. For every Pre/PostToolUse hook (and any hook that spawns claude/the Agent SDK), answer rubric criterion 14's two questions before drafting: (1) what tool call will the model most likely make in response to this hook's output? (2) does that call re-match this hook's matcher and re-produce the same output? If yes + yes, design the termination guard into the draft now — condition-gated stderr, a baseline delta (fail only on NEW violations vs. a recorded baseline — the workable guard when pre-existing debt would keep the check permanently red), corrective-path exemption, or a narrower matcher (see the Recursion guards section of @references/hook-patterns.md). SDK-spawning hooks additionally need a sentinel-env guard; Stop/SubagentStop hooks must exit 0 when stop_hook_active is true. Step 7 traces this with a follow-up envelope — a draft with no designed guard fails there.
Command-hook scripts MUST start from this exact stdin skeleton. The hook receives a JSON envelope on stdin whose fields are tool_name and tool_input (a dict) — NOT tool/args. Reading the wrong field names is the single most common way a generated hook silently no-ops (the guard never matches, so it exits 0 and never fires):
import json, sys
def main():
try:
ev = json.load(sys.stdin)
except Exception:
return 0 # never break Claude Code on bad input
tool = ev.get("tool_name", "") # e.g. "Bash" / "Edit" — NOT ev["tool"]
inp = ev.get("tool_input") or {} # e.g. Bash: inp.get("command",""); Edit: inp.get("file_path") — NOT ev["args"]
resp = ev.get("tool_response") or {} # PostToolUse only
# branch on `tool`, inspect `inp`/`resp`; to BLOCK: print(msg, file=sys.stderr); return 2
return 0
if __name__ == "__main__":
sys.exit(main())
Keep even short scripts on this skeleton (matchers can widen later). @references/tools-reference.md has the per-tool tool_input/tool_response field tables.
Reread your draft against the rubric with fresh eyes. If anything is off, revise. If after two revision passes the proposal still doesn't satisfy the rubric, drop the candidate and note why — the bar is "I'd ship this into someone's project," not "this is what came out of my first draft."
Edge case: if a candidate is almost there and you suspect the user would still want it, use AskUserQuestion to surface the trade-off rather than dropping it silently.
Syntax checks. Run the obvious ones for the form you produced — bash -n, python -m py_compile, node --check, JSON parse, glob shape. A draft that doesn't pass these never reaches the user.
Behavioral trace (command-hooks only — does it actually fire?). This is a validation step, not a form selector — apply it only to command-hooks already chosen per Step 4. Do not convert a lighter-form proposal (permissions.deny / permissions.ask / prompt-hook / CLAUDE.md note) into a command-hook to earn a "fires" check. Do not force a blocking PreToolUse where a non-blocking PostToolUse or a lighter form is the right shape. If either temptation appears, re-run Step 4 — the trace doesn't override form selection. Syntax-clean is not fire-clean: a hook that reads the wrong stdin field parses perfectly and still silently no-ops. Before finalizing any command-hook, construct the stdin envelopes yourself from the event and matcher you chose — the triggering and clean ones always, plus the follow-up one whenever the hook's output demands or steers a tool call — and trace the script against each:
{"hook_event_name":"<event>","tool_name":"<the tool the matcher targets>","tool_input":{<fields that SHOULD trip the guard>}}tool_input fields that should NOT trip itExecute the script line by line in your head against each, and confirm every direction:
print(<msg>, file=sys.stderr) then return 2 (exit 2). If it instead falls through to return 0, the guard never matched: almost always reading ev["tool"]/ev["args"] instead of tool_name/tool_input, ignoring stdin entirely, or a condition that doesn't actually match the trigger. Fix the script and re-trace.return 0 and does NOT block. If a benign call blocks, the guard is too broad — narrow the condition.return 0 — because the corrective action is exempt or outside the matcher's reach, or because the block path is gated on a recomputed condition (or baseline delta) that the corrective action clears — name which mechanism in the rationale. If it re-reaches return 2 with the same stderr, the hook loops (rubric criterion 14): add one of the criterion-14 guards and re-trace — this shares the 2-pass retry budget below; if it still loops after the cap, drop the candidate or re-form at Step 4. Never show the user a draft whose follow-up trace still loops.A command-hook that you can't trace to return 2 on its own trigger is not shippable — it would install and do nothing. This is the firing analogue of validating syntax: confirm behavior, not just that it parses.
Stderr discipline check (deterministic). For any command-hook that prints to stderr, extract the literal stderr strings from the script body. Apply BOTH rules:
No banned phrasing. If the combined stderr contains any of audit, consider, verify, review, or the construct or <X> is unrelated (case-insensitive), the message licenses inaction. Banned-phrase match → fail.
Action-forcing phrasing required for non-blocking events. If event is PostToolUse (or any other event that does NOT halt the model — i.e. not in {PreToolUse, Stop, SubagentStop, UserPromptSubmit}), the stderr MUST contain at least one of REQUIRED FOLLOW-UP, Do not stop, Fix each, BLOCKING, Do not ask. Absent any of these → fail. For genuinely blocking events the stderr is just explanation, so banned-phrase check alone suffices.
This is the deterministic version of "would I, reading only this, continue without asking?" — pattern matching is more reliable than self-introspection.
Retry loop (cap 2 total). If the behavioral trace fails, fix the script (field names, stdin handling, the guard condition, or a missing criterion-14 loop guard) and re-trace. If the stderr discipline check fails, revise the stderr strings per criterion 12 (see the bad/good pairs) and re-check. Across both checks, cap at 2 revision passes; if the proposal still fails, drop the candidate and note one-line why — usually a sign the form choice itself was wrong (criterion 11) and you should re-do Step 4.
Better revised than shipped weak; better dropped than visibly broken.
For each surviving candidate, the user needs to see enough to make an informed yes/no:
settings.jsonThen collect their decision via AskUserQuestion with options: approve / reject / edit. If they pick edit, accept their changes, re-validate (Step 7), re-show the merged view, and ask again.
If a proposed matcher overlaps with something the user already has, surface the conflict via AskUserQuestion with three sub-choices: keep both, replace the existing entry, or skip. Flag "replace" as destructive in its description — replacing destroys user-authored config and warrants a confirm.
Honor <user_preferences> ## Authorize entries. If an Authorize bullet covers the scope of this proposal (e.g. "force-pushing to feature branches is pre-authorized; force-push to main still asks"), don't re-ask for the kind of confirmation already established — proceed with a draft that respects the prior authorization and cite the Authorize line in the rationale. Authorize entries are scope-bound — they apply to the kind of action they describe, not to anything that vaguely resembles it.
By default, write team-shared rules to .claude/settings.json. If a proposal is obviously personal (depends on a single dev's preferences or local paths), ask via AskUserQuestion whether the user wants settings.json (shared) or settings.local.json (personal).
Persist each approved candidate per @references/settings-merge.md:
.claude/hooks/<slug>.{sh|py|js} — Python is the safest default unless the script obviously needs shell or Node.claude/settings.json by event-array append (key-level merge), never array overwritepermissions.deny rules append to the deny array, deduped by string equalityCLAUDE.md notes you do NOT write yourself — output the exact # ... line the user can paste, and let them choose the scopeAfter writing, re-read settings.json to confirm it still parses. If something broke, restore the pre-write content and surface the error.
End with a short summary the user can read at a glance: what got installed, what got dropped (and why, in one line each), what got deferred for a future run. Then remind them how to activate the hooks — hooks load at session start, so they need to restart (exit, then claude) and then ESC-ESC-rewind in the fresh session to clean up this conversation's detour.
Tell them how to give feedback if a hook misfires later: /improve "the <name> hook blocked something legit".
Offer to capture a preference (only on CLEAR, generalizable feedback — never silently). If during the run the user gave a transferable preference signal — rejected a proposal with a reason that generalises ("never propose CLAUDE.md notes for build tools"), said "always X" / "never Y" / "I always want Z", or authorized a recurring exception — use AskUserQuestion to offer appending one bullet to:
${CLAUDE_PROJECT_DIR}/.claude/self-improving-claude/preferences.md${HOME}/.claude/self-improving-claude/preferences.mdIf the chosen file doesn't exist yet, create it first with this three-section template:
# /improve preferences
Lightweight bias for /improve. Read at Step 2; per-project merged on top of global.
## Avoid
<!-- patterns to suppress: hook shapes, rule kinds, form choices you reject -->
## Prefer
<!-- positive biases on form choice, e.g. "permissions.ask over prompt-hook for git" -->
## Authorize
<!-- one-off / contextual authorizations to remember across sessions -->
Append the new bullet under the appropriate section (## Avoid / ## Prefer / ## Authorize), prefixed with YYYY-MM-DD — so future readers can audit when it was added. Never append on implicit signals (a bare rejection without a transferable reason is not enough). One bullet per preference; keep them surgical.
@references/prompt-rubric.md @references/hook-patterns.md @references/tools-reference.md @references/settings-merge.md @references/examples.md
Provides UI/UX resources: 50+ styles, color palettes, font pairings, guidelines, charts for web/mobile across React, Next.js, Vue, Svelte, Tailwind, React Native, Flutter. Aids planning, building, reviewing interfaces.
Fetches up-to-date documentation from Context7 for libraries and frameworks like React, Next.js, Prisma. Use for setup questions, API references, and code examples.
npx claudepluginhub timfkiefer/self-improving-claude --plugin self-improving-claude