By Evol-ai
Evaluate SKILL.md files across six quality dimensions, automatically improve them through iterative loops with local validators and LLM guidance, manage versions with rollback and three-way merge, scan for security issues, and track portfolio health — all running locally without external services.
> **Locale**: All templates in this spec are written in English. Detect the user's language from the session and translate user-facing text at display time per SKILL.md's Global UX Rules. Dimension labels: see the canonical table in SKILL.md.
> **Locale**: All templates in this spec are written in English. Detect the user's language from the session and translate user-facing text at display time per SKILL.md's Global UX Rules. Dimension labels: see the canonical table in SKILL.md.
> **Locale**: All templates in this spec are written in English. Detect the user's language from the session and translate user-facing text at display time per SKILL.md's Global UX Rules. Dimension labels: see the canonical table in SKILL.md.
- **Recommended model: Claude Opus 4.6** (`claude-opus-4-6`). Directed improvement requires understanding complex rubric feedback and generating precise, targeted edits. Weaker models may produce unfocused rewrites that fail to address the weakest dimension or introduce regressions in other dimensions.
> **Locale**: All templates in this spec are written in English. Detect the user's language from the session and translate user-facing text at display time per SKILL.md's Global UX Rules. Dimension labels: see the canonical table in SKILL.md.
Modifies files
Hook triggers on file write and edit operations
Runs pre-commands
Contains inline bash commands via ! syntax
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Bash prerequisite issue
Uses bash pre-commands but Bash not in allowed tools
Bash prerequisite issue
Uses bash pre-commands but Bash not in allowed tools
Evaluate quality. Find the weakest link. Fix it. Prove it worked. Repeat.
GitHub · SKILL.md · Schemas · Changelog
| What it is | A local-first skill quality evaluator and management tool for Claude Code / OpenClaw. Six-dimension scoring, usage-driven suggestions, guided improvement, version tracking. |
| Pain it solves | Turns "tweak and hope" into diagnose → targeted fix → verified improvement. Turns "install and forget" into ongoing visibility over what's working, what's stale, and what's risky. |
| Use in 30 seconds | /skillcompass — see your skill health at a glance. /eval-skill {path} — instant quality report showing exactly what's weakest and what to improve next. |
Evaluate → find weakest link → fix it → prove it worked → next weakness → repeat. Meanwhile, Skill Inbox watches your usage and tells you what needs attention.
|
For
|
Not For
|
Prerequisites: Claude Opus 4.6 / 4.7 (complex reasoning + consistent scoring) · Node.js v18+ (local validators)
npx skills add Evol-ai/SkillCompass
Supports 45+ agents including Claude Code, Codex, Cursor, Cline, Gemini CLI, GitHub Copilot, and more. The CLI auto-detects installed agents and sets up the skill in the right location.
git clone https://github.com/Evol-ai/SkillCompass.git
cd SkillCompass && npm install
# User-level (all projects)
rsync -a --exclude='.git' . ~/.claude/skills/skill-compass/
# Or project-level (current project only)
rsync -a --exclude='.git' . .claude/skills/skill-compass/
First run: SkillCompass auto-triggers a brief onboarding — scans your installed skills (~5 seconds), offers statusLine setup, then hands control back. Claude Code will request permission for
nodecommands; select "Allow always" to avoid repeated prompts.
git clone https://github.com/Evol-ai/SkillCompass.git
cd SkillCompass && npm install
# Follow OpenClaw skill installation docs for your setup
rsync -a --exclude='.git' . <your-openclaw-skills-path>/skill-compass/
If your OpenClaw skills live outside the default scan roots, add them to skills.load.extraDirs in ~/.openclaw/openclaw.json:
{
"skills": {
"load": {
"extraDirs": ["<your-openclaw-skills-path>"]
}
}
}
/skillcompass is the single entry point. Use it with a slash command or just talk naturally — both work:
/skillcompass → see what needs attention
/skillcompass evaluate my-skill → six-dimension quality report
"improve the nano-banana skill" → fix weakest dimension, verify, next
"what skills haven't I used recently?" → usage-based insights
"security scan this skill" → D3 security deep-dive
The score isn't the point — the direction is. You instantly see which dimension is the bottleneck and what to do about it.
Each /eval-improve round follows a closed loop: fix the weakest → re-evaluate → verify improvement → next weakest. No fix is saved unless the re-evaluation confirms it actually helped.
npx claudepluginhub evol-ai/skillcompassUse when reviewing a Claude Code skill, auditing skill quality before publish, or asking "is this skill any good?". Produces structured reports with file:line citations, severity-rated findings cards, and an optional second-opinion pass. Lens variants: full (default), safety, discoverability, architecture, parseability, tests, quick.
Self-evolving skill engine for Claude Code. Creates, scores, repairs, and hardens skills autonomously through recursive improvement cycles.
Validates Claude Code SKILL.md files against ~25 contract rules including structure, triggers, anti-patterns, and naming. Works with Claude Code, Claude Desktop/Web, and Cursor.
Representation Synthesis workflow for auditing agent skills in Claude Code.
Slash-command skill that reviews any SKILL.md against best practices and outputs a structured pass/fail report
Create new skills, improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, update or optimize an existing skill, run evals to test a skill, or benchmark skill performance with variance analysis.