From Setup Eval
Full qualitative review of the agent setup. Reads every file, applies per-component rubrics, runs 21 cross-type optimization checks, and produces KEEP/REVIEW/REMOVE verdicts. Use when the user wants a deep review, redundancy check, or quality assessment of their setup.
How this skill is triggered — by the user, by Claude, or both
Slash command
/setup-eval:setup-eval-review
This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Full qualitative review of the user's agent setup. Claude reads every file and evaluates quality, redundancy, coherence, and optimization opportunities.
Before doing anything else, ask the user:
Where should I present the results?
- Terminal - print the report here in the conversation
- File - write a markdown report to a file (you'll choose the path)
Wait for their answer before proceeding.
Determine the setup path. If the user doesn't specify one, use the current working directory.
uv run python skills/setup-eval-lint/scripts/run_assessment.py <setup-path> recommended
Read the JSON output. This gives you per-component diagnostics, token budget, trigger overlaps, and dependency findings.
Do NOT present the lint report separately. Use it as context for the qualitative review.
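The path and lint steps above could be sketched in Python roughly as follows. The helper names (`resolve_setup_path`, `build_lint_command`, `load_lint_report`) are hypothetical. The script path and the trailing `recommended` argument are copied from the command above, and the sketch assumes the script prints a single JSON object to stdout without assuming anything about its keys.

```python
import json
import subprocess
from pathlib import Path
from typing import Optional

# Script path copied from the command above.
LINT_SCRIPT = "skills/setup-eval-lint/scripts/run_assessment.py"


def resolve_setup_path(user_path: Optional[str] = None) -> Path:
    """Use the user's path if given, otherwise the current working directory."""
    return Path(user_path).resolve() if user_path else Path.cwd()


def build_lint_command(setup_path: Path) -> list[str]:
    """Assemble the lint invocation as an argument list (no shell quoting needed)."""
    return ["uv", "run", "python", LINT_SCRIPT, str(setup_path), "recommended"]


def load_lint_report(setup_path: Path) -> dict:
    """Run the lint script and parse its output.

    Assumption: the script prints a single JSON object to stdout.
    """
    result = subprocess.run(
        build_lint_command(setup_path), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)
```

Calling `load_lint_report(resolve_setup_path())` would then return the parsed diagnostics for the current directory, to be used as context rather than shown to the user.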
Read the actual content of every component: SKILL.md files (including reference files in subdirectories), command files, agent files, CLAUDE.md, and settings.json for hooks.
For each component, provide:
For lint failures, use this format:
Lint: 3 failures
FAIL broken-references — 5 referenced files don't exist in this directory (scripts/foo.sh, etc.)
FAIL token-budget — SKILL.md is 915 lines, 1.8x the 500-line recommendation
FAIL mcp-least-privilege — allowed-tools declares Bash but no script uses shell commands
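As an illustration of the format above, here is a minimal Python sketch that renders a list of failures as a count header followed by one `FAIL` line per check. `LintFailure` and `format_lint_failures` are hypothetical names, and the singular "failure" form for a count of one is an assumption, since the source only shows the plural.

```python
from typing import NamedTuple

# The separator in the format above is an em dash, written here as an escape.
SEPARATOR = " \u2014 "


class LintFailure(NamedTuple):
    check: str   # check name, e.g. "token-budget"
    detail: str  # one-line, human-readable explanation


def format_lint_failures(failures: list[LintFailure]) -> str:
    """Render failures as a count header plus one FAIL line per check."""
    # Assumption: use the singular noun when there is exactly one failure.
    noun = "failure" if len(failures) == 1 else "failures"
    lines = [f"Lint: {len(failures)} {noun}"]
    lines += [f"FAIL {f.check}{SEPARATOR}{f.detail}" for f in failures]
    return "\n".join(lines)


print(format_lint_failures([
    LintFailure("broken-references", "5 referenced files don't exist in this directory"),
    LintFailure("mcp-least-privilege", "allowed-tools declares Bash but no script uses shell commands"),
]))
```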
Use the per-component rubric files for detailed criteria:
- rubric/skills-rubric.md
- rubric/claude-md-rubric.md
- rubric/commands-rubric.md
- rubric/agents-rubric.md
- rubric/hooks-rubric.md
Read rubric/cross-type-checks.md and answer all 21 checks with YES/NO and a one-line explanation. These check whether components should be transformed (skill to hook?), merged, or removed.
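One way to make sure all 21 cross-type checks are answered in the required shape is a small validation pass, sketched here in Python. `CheckAnswer` and `validate_answers` are hypothetical names, and numbering the checks 1 to 21 is an assumption about how rubric/cross-type-checks.md is organized.

```python
from dataclasses import dataclass

EXPECTED_CHECKS = 21  # count stated in the step above


@dataclass(frozen=True)
class CheckAnswer:
    number: int       # 1-based position of the check (assumed numbering)
    verdict: str      # "YES" or "NO"
    explanation: str  # one-line justification


def validate_answers(answers: list[CheckAnswer]) -> list[str]:
    """Return a list of problems; an empty list means the answers are complete."""
    problems = []
    if len(answers) != EXPECTED_CHECKS:
        problems.append(f"expected {EXPECTED_CHECKS} answers, got {len(answers)}")
    for a in answers:
        if a.verdict not in ("YES", "NO"):
            problems.append(f"check {a.number}: verdict must be YES or NO")
        if not a.explanation.strip() or "\n" in a.explanation:
            problems.append(f"check {a.number}: explanation must be a single non-empty line")
    return problems
```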
Read report-format.md for the full report structure. The report sections must appear in this order:
At the very end of the report, include the exact timing:
Evaluated with: setup-eval v{version} (claude-code-plugin)
Duration: [X minutes Y seconds]
Get {version} by running: uv run python -c "import importlib.metadata; print(importlib.metadata.version('setup-eval'))"
Compute the duration from the timestamp of your first tool call in Step 2 to the timestamp when you finish writing the report.
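The footer lines above could be assembled along these lines. This is a sketch only: the function names are hypothetical, the `"unknown"` fallback for a missing package is an assumption, and the two timestamps are assumed to be available as `datetime` values.

```python
from datetime import datetime
from importlib import metadata


def installed_version(package: str = "setup-eval") -> str:
    """Look up the installed package version; "unknown" is an assumed fallback."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "unknown"


def format_duration(start: datetime, end: datetime) -> str:
    """Express the elapsed time as whole minutes and seconds."""
    total_seconds = int((end - start).total_seconds())
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes} minutes {seconds} seconds"


def build_footer(version: str, start: datetime, end: datetime) -> str:
    """Assemble the two closing lines of the report."""
    return (
        f"Evaluated with: setup-eval v{version} (claude-code-plugin)\n"
        f"Duration: {format_duration(start, end)}"
    )
```

For example, `build_footer(installed_version(), first_tool_call, report_done)` would produce the two closing lines, where `first_tool_call` and `report_done` stand for the two timestamps described above.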
If the user chose terminal: print the report in the conversation.
If the user chose file: write the report as markdown to the path they specified (or suggest setup-eval-review-report.md in the current directory). Tell them the file path when done.
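The two delivery options can be sketched as a single Python function. `deliver_report` and the `"terminal"` / `"file"` labels are hypothetical; the default file name is the one suggested above.

```python
from pathlib import Path
from typing import Optional

DEFAULT_REPORT_NAME = "setup-eval-review-report.md"  # suggested name from the step above


def deliver_report(report: str, destination: str, path: Optional[Path] = None) -> Optional[Path]:
    """Print the report or write it to a markdown file.

    `destination` is "terminal" or "file" (hypothetical labels for the two choices).
    Returns the written path for "file", otherwise None.
    """
    if destination == "terminal":
        print(report)
        return None
    if destination == "file":
        # Fall back to the suggested file name in the current directory.
        target = path if path is not None else Path.cwd() / DEFAULT_REPORT_NAME
        target.write_text(report, encoding="utf-8")
        return target
    raise ValueError(f"unknown destination: {destination!r}")
```

Returning the path makes it straightforward to tell the user where the file was written.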
npx claudepluginhub redhat-community-ai-tools/harness-eval-lab --plugin setup-eval