From qe-framework
Generates virtual user prompts, simulates intent classification, and verifies correct skill/agent routing. Use when auditing skill descriptions or validating routing accuracy.
How this skill is triggered — by the user, by Claude, or both
Slash command
/qe-framework:Mtest-skillThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Verifies that all skill/agent triggers route correctly, identifies misroutes, and suggests fixes.
Verifies that all skill/agent triggers route correctly, identifies misroutes, and suggests fixes.
1. Collect skills → 2. Generate test cases → 3. Simulate routing → 4. Verdict → 5. Suggest fixes → 6. Re-verify
find skills/ -name "SKILL.md" | sort # Skill list
find agents/ -name "*.md" | sort # Agent list
cat hooks/scripts/lib/intent-routes.json # Route config
Extract per skill: name, description, triggers, keywords.
Generate 3 types of virtual prompts per skill:
| Type | Purpose | Example |
|---|---|---|
| A. Normal | Should trigger this skill | "Create a React component" -> Qreact-expert |
| B. Boundary | Distinguish similar skills | "Find root cause of bug" -> Qsystematic-debugging (not Ecode-debugger) |
| C. Unregistered | No route defined | "Create a jira issue" -> Qjira-cli |
Use the same matching algorithm as prompt-check.mjs:
function simulateRouting(userMessage, routes) {
const msgLower = userMessage.toLowerCase();
const msgWords = msgLower.split(/\s+/);
let bestMatch = null, bestScore = 0;
for (const [keywords, target] of Object.entries(routes)) {
const parts = keywords.split('/');
let matchedParts = 0, totalWeight = 0;
for (const part of parts) {
const term = part.toLowerCase().replace(/-/g, ' ');
const termWords = term.split(/\s+/);
const hasExactWord = termWords.some(tw => msgWords.includes(tw));
const hasSubstring = msgLower.includes(term);
if (hasExactWord) { matchedParts++; totalWeight += term.length * 2; }
else if (hasSubstring) { matchedParts++; totalWeight += term.length; }
}
const score = matchedParts > 0 ? matchedParts * 3 + totalWeight : 0;
if (score > bestScore) {
bestScore = score;
bestMatch = { intent: keywords, routed_to: target, score };
}
}
return bestMatch;
}
| Verdict | Meaning |
|---|---|
| PASS | Correctly routed to expected skill |
| MISROUTE | Routed to wrong skill |
| UNREACHABLE | Not in intent-routes.json |
| CONFLICT | Two+ skills tie with same score |
| WEAK | Routes but low score — unstable |
Criteria: score >= threshold*2 = PASS (strong); score >= threshold = PASS + WEAK warning; score < threshold = UNREACHABLE; expected != actual = MISROUTE.
# Skill Routing Test Report
**Run Date:** [date] | **Total:** N | **PASS:** N | **MISROUTE:** N | **UNREACHABLE:** N | **CONFLICT:** N
## MISROUTE (fix required)
| Prompt | Expected | Actual | Score | Root Cause |
## UNREACHABLE (route needed)
| Skill | Suggested Route Keywords |
## CONFLICT (priority adjustment)
| Prompt | Competing Skills | Score |
## WEAK (unstable)
| Prompt | Skill | Score | Risk |
## Coverage
| Category | Registered | Unregistered | Coverage % |
MISROUTE fix: Modify skill description to add differentiating keywords, or update intent-routes.json with more specific routes.
UNREACHABLE fix: Add missing routes to intent-routes.json.
Re-verify: After applying fixes, re-run the same tests to confirm PASS.
| Mode | Command | Scope |
|---|---|---|
| Quick (default) | /Mtest-skill | Registered skills only, 2 prompts each |
| Full | /Mtest-skill --full | All skills/agents, 5 prompts each, full report + fixes |
| Specific | /Mtest-skill Qfoo | Single skill by name |
| Batch | /Mtest-skill --batch 'skills/Q*' | All SKILL.md files matching a glob, cache-aware |
--batch <glob>--batch feeds a filesystem glob (relative to repo root) into scripts/run_mtest_skill.mjs.
The runner collects every SKILL.md below each matched directory, replays the
workflow above (virtual prompts → routing sim → verdict), and prints a markdown
results table to stdout. Optional --out <file> mirrors the table to disk.
# test every Q-prefixed skill
/Mtest-skill --batch 'skills/Q*'
# test M-prefixed skills and save to report.md
/Mtest-skill --batch 'skills/M*' --out .qe/mtest-cache/last-batch.md
Output columns: Skill | Hash | Verdict | Accuracy | Cache | Timestamp.
One row per SKILL.md, sorted alphabetically.
Batch runs are memoised through hooks/scripts/lib/mtest-cache.mjs:
.qe/mtest-cache/{hash}.json (gitignored). One file per distinct
SKILL.md revision ever tested.{verdict, accuracy, timestamp}
and logs Cache: HIT — no virtual-prompt regeneration, no routing sim.Cache: MISS..qe/mtest-cache/.The single-skill mode (/Mtest-skill Qfoo) bypasses the cache so interactive
audits always re-evaluate. Only --batch consults the cache.
MUST DO: Cross-verify intent-routes.json vs actual skill list; test boundary cases between similar skills; confirm no regressions before fixes; get user confirmation before applying.
MUST NOT DO: Modify descriptions without testing; change intent-routes.json without confirmation; change coding-experts descriptions (different trigger mechanism); use real user data.
npx claudepluginhub inho-team/qe-framework --plugin qe-frameworkCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.