Skill

improve-skill

Use when revising an existing skill from eval feedback, trigger misses, output regressions, bloat, overfitting, or repeated manual work across skill runs.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/ai-assistant-ops:improve-skill

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Improve an existing skill through measured iterations: test, compare, collect

Supporting Files

agents/openai.yaml
evals/evals.json

SKILL.md

74 lines · ~850 tokens

Stats

Language: Python

Parent stars: 1

Maintenance: Excellent

Last Commit: Jun 11, 2026

Improve Skill

Improve an existing skill through measured iterations: test, compare, collect feedback, revise, rerun, repeat, then tune triggering.

Start Point

Start from the user's target skill and current stage: test prompts, baseline runs, feedback review, rewriting, rerun comparison, or trigger checks. If the target skill is unclear, resolve it before editing.

Require test prompts before editing. If no evals exist, help create a small set of realistic prompts first. Use objective assertions only for observable behavior; keep subjective quality checks qualitative.
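
As a minimal sketch of how a small eval set could separate objective assertions from qualitative review, assuming a hypothetical in-memory schema (the EvalCase class, its field names, and the sample prompt are illustrative and are not the format of evals/evals.json):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class EvalCase:
    """One realistic test prompt plus optional objective checks (hypothetical schema)."""
    prompt: str
    # Objective assertions: name -> predicate over the raw output text.
    assertions: Dict[str, Callable[[str], bool]] = field(default_factory=dict)
    # Subjective quality stays a note for a human reviewer, not a pass/fail check.
    review_note: str = ""


def check_output(case: EvalCase, output: str) -> Dict[str, bool]:
    """Run only the objective assertions and return name -> passed."""
    return {name: predicate(output) for name, predicate in case.assertions.items()}


case = EvalCase(
    prompt="Tighten this skill's description so it stops firing on unrelated requests.",
    assertions={
        "mentions_description": lambda out: "description" in out.lower(),
        "under_200_words": lambda out: len(out.split()) < 200,
    },
    review_note="Does the revised description read naturally?",
)

results = check_output(case, "Revised description: Use when a skill misfires.")
```

Only observable properties of the output become predicates here; whether the revision reads well is left in review_note for a person to judge.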

Iteration Loop

Snapshot the existing target skill before changes as the old-skill baseline.
Run the old-skill baseline and the candidate improved skill against the same prompts; keep inputs, target files, and expected outputs identical.
Capture outputs, transcripts, qualitative feedback, objective assertions when useful, timing and token data when available, and iteration history.
Compare the improved run against the old-skill baseline or previous iteration, whichever best answers the user's current question.
Revise from feedback by generalizing the lesson. Do not overfit wording to one prompt, hide behavior behind rigid commands, or add instructions that do not earn their tokens.
Keep the prompt lean. Remove unused guidance, repeated wording, and unproductive transcript-driven detours.
Explain why important instructions matter so the next assistant can adapt rather than follow brittle rules by rote.
Look for repeated manual work across test cases. If multiple runs recreate a helper, checklist, schema, or conversion step, add a reusable scripts/ tool or resources file and point the skill to it.
After rewriting, invoke ai-assistant-ops:md-bloat-hunter on the changed skill context before evaluation or rerun. Preserve tested behavior and trigger coverage while removing redundancy and filler.
Rerun all test cases into the next iteration and compare against the previous iteration.
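
The snapshot and same-prompt comparison steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's own tooling: snapshot_skill, run_pair, the directory layout, and the stand-in runner functions are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot_skill(skill_dir: Path, snapshots_root: Path, label: str) -> Path:
    """Copy the skill directory so the old-skill baseline survives later edits."""
    dest = snapshots_root / label
    shutil.copytree(skill_dir, dest)  # creates missing parent directories
    return dest


def run_pair(prompts, run_old, run_new):
    """Run both versions on the same prompts, in the same order."""
    return [{"prompt": p, "old": run_old(p), "new": run_new(p)} for p in prompts]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    skill = root / "improve-skill"
    skill.mkdir()
    (skill / "SKILL.md").write_text("old body")

    # Snapshot first, then edit the live skill.
    baseline = snapshot_skill(skill, root / "snapshots", "iteration-0")
    (skill / "SKILL.md").write_text("new body")
    baseline_text = (baseline / "SKILL.md").read_text()

# Stand-in runners; a real run would invoke the assistant with each skill version.
pairs = run_pair(
    ["prompt one", "prompt two"],
    run_old=lambda p: f"old:{p}",
    run_new=lambda p: f"new:{p}",
)
```

Taking the snapshot before any edit is what keeps the baseline honest: the copy still holds the old body after the live file changes.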

Repeat until the user is satisfied, feedback is empty, or progress stalls. When progress stalls, report the pattern and ask whether to change the eval set, accept the current tradeoff, or stop.
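
One possible way to make "progress stalls" concrete, assuming each iteration is summarized by a count of passed objective assertions. The has_stalled helper and its window parameter are hypothetical; qualitative feedback may call for a different signal.

```python
from typing import List


def has_stalled(pass_counts: List[int], window: int = 2) -> bool:
    """True when the last `window` iterations did not beat the best earlier score."""
    if len(pass_counts) <= window:
        return False  # not enough history to judge
    best_before = max(pass_counts[:-window])
    return max(pass_counts[-window:]) <= best_before


improving = has_stalled([3, 4, 5, 6])
flat = has_stalled([3, 5, 5, 5])
too_short = has_stalled([3, 5])
```

A stall is a cue to report the pattern and ask the user how to proceed, not a reason to stop silently.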

Evidence To Keep

Keep enough evidence to explain each iteration: target and snapshot paths, prompt set, old-skill and improved outputs, transcript notes, feedback, assertions, timing or token data, and what changed with why.
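
A per-iteration evidence record might look like the sketch below. The IterationRecord class, its field names, and the example paths are assumptions made for illustration; the skill does not prescribe a storage format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class IterationRecord:
    """Evidence for one iteration (hypothetical layout)."""
    iteration: int
    target_path: str
    snapshot_path: str
    prompts: List[str]
    old_outputs: List[str]
    new_outputs: List[str]
    transcript_notes: str = ""
    feedback: List[str] = field(default_factory=list)
    assertions: Dict[str, bool] = field(default_factory=dict)
    timing_seconds: Optional[float] = None  # only when available
    tokens: Optional[int] = None            # only when available
    what_changed: str = ""
    why: str = ""


record = IterationRecord(
    iteration=1,
    target_path="skills/example/SKILL.md",        # illustrative path
    snapshot_path="snapshots/iteration-0",        # illustrative path
    prompts=["prompt one"],
    old_outputs=["old output"],
    new_outputs=["new output"],
    feedback=["Too verbose in the summary."],
    what_changed="Removed repeated guidance.",
    why="Reviewer feedback flagged redundancy.",
)

# Serialize so the history can be stored next to the snapshots.
serialized = json.dumps(asdict(record), indent=2)
restored = IterationRecord(**json.loads(serialized))
```

Keeping what_changed and why together with the outputs lets a later reader reconstruct the reasoning behind each revision.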

Trigger Optimization

After the body stabilizes, test the description with should-trigger and should-not-trigger prompts, including near misses. Revise it only for missed triggers or false positives.

Keep the description trigger-only: symptoms and situations, not workflow. The body owns the process.
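
The trigger check can be sketched as a small tally of missed triggers and false positives. The trigger_report function, the tuple layout, and the sample prompts are hypothetical; how "did trigger" is observed depends on the assistant runtime.

```python
from typing import Dict, List, Tuple


def trigger_report(results: List[Tuple[str, bool, bool]]) -> Dict[str, List[str]]:
    """results holds (prompt, should_trigger, did_trigger) for each trigger test."""
    missed = [p for p, should, did in results if should and not did]
    false_positives = [p for p, should, did in results if not should and did]
    return {"missed": missed, "false_positives": false_positives}


report = trigger_report([
    ("My skill keeps failing its evals, help me revise it", True, True),
    ("This skill never loads when I ask for it", True, False),    # missed trigger
    ("Write a brand-new skill from scratch", False, True),        # near miss that fired
    ("Summarize this article", False, False),
])

# Revise the description only if either list is non-empty.
needs_revision = bool(report["missed"] or report["false_positives"])
```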

Common Mistakes

Mistake	Correction
Editing before evals exist	Create prompts first.
Comparing different prompt sets	Use the same prompts.
Patching narrowly for a single complaint	Generalize the feedback.
Adding more prose for every failure	Prefer lean wording, examples, scripts, or resources.
Stopping after body edits	Run trigger checks after stabilization.
