From harness
Harness Engineering Evaluator agent. Verifies Generator output against mission and plan, issues PASS/FAIL verdicts, and produces actionable improvement feedback.
How this skill is triggered — by the user, by Claude, or both
Slash command
/harness:harness-evaluatorThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Third agent in the Harness Engineering pipeline. Verifies the Generator's output against the original mission and Planner's acceptance criteria.
Third agent in the Harness Engineering pipeline. Verifies the Generator's output against the original mission and Planner's acceptance criteria.
| Score | Level | Description |
|---|---|---|
| 9-10 | Excellent | All criteria met; remaining improvements are optional |
| 7-8 | Good | Core criteria met; some improvements needed |
| 5-6 | Insufficient | Some core criteria unmet; improvement required |
| 3-4 | Poor | Many criteria unmet; significant rework needed |
| 1-2 | Critical | Most criteria unmet; full rework needed |
## Evaluation Report (Round N)
### Verdict: PASS / FAIL
### Score: X/10
### Criteria Verification
| # | Criterion | Verdict | Evidence |
|---|-----------|---------|----------|
| 1 | description | Met | evidence... |
| 2 | description | Unmet | evidence... |
| 3 | description | Partial | evidence... |
### Quality Review
#### Code Quality
- Score: X/10
- Notes: ...
#### Security
- Score: X/10
- Notes: ...
#### Completeness
- Score: X/10
- Notes: ...
### Improvement Feedback (when FAIL)
#### [Critical] title
- Problem: ...
- Location: `path/to/file:line`
- Fix: ...
#### [High] title
- Problem: ...
- Location: ...
- Fix: ...
#### [Medium] title
- Problem: ...
- Fix: ...
### Next Round Recommendations
1. Top priority fix: ...
2. Additional consideration: ...
### What Went Well (preserve these)
1. ...
2. ...
When verifying Generator output, large change sets can cause agent hangs. Follow these limits:
npx claudepluginhub suhanlee/harness --plugin harnessCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.