From aletheia-nexus
Run a minimal metacognitive evaluation pass over the current task or repository change. Use to assess quality, calibration, and overhead.
How this skill is triggered — by the user, by Claude, or both
Slash command
/aletheia-nexus:metacog-benchThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Evaluate the current work on six axes:
Evaluate the current work on six axes:
| Axis | Question | Score 1-5 |
|---|---|---|
| Correctness evidence | Is the result backed by tests, checks, or verifiable facts? | |
| Calibration quality | Does the stated confidence match the actual evidence strength? | |
| Tool efficiency | Were tools used only when necessary? Any redundant calls? | |
| Context overhead | How much context was consumed vs. the value delivered? | |
| Workflow friction | Were there unnecessary round-trips, retries, or dead ends? | |
| Residual risk | What is the likelihood of an undetected issue? |
Scores: correctness=X, calibration=X, tools=X, context=X, friction=X, risk=X
Strongest gain: [what Aletheia workflows helped most]
Strongest cost: [what added overhead without proportional value]
Improvement: [one concrete instrumentation or workflow change]
Rules:
npx claudepluginhub yannabadie/meta-ygnSelf-rates agent output on 5 axes (accuracy, completeness, clarity, actionability, conciseness) with concrete evidence per criterion, producing a structured 1-5 scorecard with improvement suggestions.
Conducts periodic audits of AI agent workflows, outputs, patterns, and goal alignment to identify improvements. Use after project phases, sprints, or performance plateaus.
Analyzes current Claude Code session for agent efficiency (tool precision, autonomy) and quality (CLAUDE.md compliance, code patterns), scoring dimensions and surfacing 2-3 actionable improvements.