From claude-code-config
Freezes acceptance criteria before building, then uses independent agents to verify each criterion against the plan. Prevents self-verification bias and ensures build-to-spec traceability.
Slash command: /claude-code-config:proof-verify
Plan-based verification: freeze acceptance criteria BEFORE building, verify AFTER with independent agents.
PHASE 1: PLAN (before any code)
Create .proof/PLAN.md with numbered acceptance criteria
Each AC: testable, specific, has a verification command or check
Plan is FROZEN - no changes during build
PHASE 2: BUILD (normal work)
Implement against the plan
Mark progress in .proof/PROGRESS.md
Builder does NOT self-verify
PHASE 3: VERIFY (after build, independent agent)
Fresh agent reads PLAN.md (never saw the build process)
Walks through each AC, runs verification commands
Writes .proof/VERDICT.md with PASS/FAIL per criterion
If any FAIL → .proof/PROBLEMS.md with specific fixes
PHASE 4: FIX (if needed)
Builder reads PROBLEMS.md, makes minimal fixes
Back to PHASE 3 (re-verify)
Loop until all PASS
Create .proof/PLAN.md in the project root:
# Verification Plan
**Created:** YYYY-MM-DD HH:MM
**Task:** [one-line description]
**Builder:** [session ID or "current"]
**Status:** FROZEN
## Acceptance Criteria
### AC1: [short name]
**Description:** [what must be true]
**Verify:** [exact command or check to run]
**Expected:** [what success looks like]
### AC2: [short name]
**Description:** [what must be true]
**Verify:** [exact command or check to run]
**Expected:** [what success looks like]
### AC3: [short name]
...
## Out of Scope
- [explicitly what this plan does NOT cover]
## Constraints
- [time, resource, or technical constraints]
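Before freezing the plan, it can help to confirm mechanically that every AC carries all three fields from the template. The following is a minimal sketch, assuming PLAN.md follows the template above exactly (`### ACn:` headings and bold `**Field:**` labels); `parse_acceptance_criteria` and `missing_fields` are hypothetical helper names, not part of the skill.

```python
import re

# Fields every AC must carry, taken from the PLAN.md template.
REQUIRED_FIELDS = ("Description", "Verify", "Expected")


def parse_acceptance_criteria(plan_text: str) -> dict[str, dict[str, str]]:
    """Map each AC id (e.g. 'AC1') to its short name and bold-labelled fields."""
    criteria: dict[str, dict[str, str]] = {}
    current = None
    for line in plan_text.splitlines():
        stripped = line.strip()
        heading = re.match(r"^###\s+(AC\d+):\s*(.*)$", stripped)
        if heading:
            current = heading.group(1)
            criteria[current] = {"name": heading.group(2).strip()}
            continue
        if stripped.startswith("## "):
            # A new top-level section (Out of Scope, Constraints) ends the AC list.
            current = None
            continue
        field = re.match(r"^\*\*(\w+):\*\*\s*(.*)$", stripped)
        if field and current is not None:
            criteria[current][field.group(1)] = field.group(2).strip()
    return criteria


def missing_fields(criteria: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Return {ac_id: [missing field names]} for ACs that are incomplete."""
    problems = {}
    for ac_id, fields in criteria.items():
        missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
        if missing:
            problems[ac_id] = missing
    return problems
```

An empty result from `missing_fields` only shows that the fields are present; whether a `Verify` command is actually specific and testable still needs a human or agent read.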
Rules for good ACs:
Normal implementation. The only additions:
Mark progress in .proof/PROGRESS.md as you work:
# Build Progress
### AC1: [name]
- [x] Implemented in `src/foo.py:42`
- Files changed: `src/foo.py`, `tests/test_foo.py`
### AC2: [name]
- [x] Implemented in `src/bar.py:18`
- Files changed: `src/bar.py`
- Note: chose approach B because [reason]
Collect evidence in .proof/EVIDENCE.md:
# Evidence
### AC1: [name]
**Command:** `pytest tests/test_foo.py -v`
**Output:**
\```
tests/test_foo.py::test_returns_correct PASSED
tests/test_foo.py::test_handles_edge PASSED
\```
**Result:** PASS
### AC2: [name]
**Command:** `grep -c "TODO" src/bar.py`
**Output:** `0`
**Result:** PASS
Builder collects evidence but does NOT write the verdict. That is the verifier's job.
This is the critical phase. The verifier MUST be:
Given only PLAN.md + access to the codebase
Not given PROGRESS.md, EVIDENCE.md, or any build context
Verifier prompt:
You are an independent verifier. Your job is to check whether
the implementation meets the acceptance criteria in .proof/PLAN.md.
Rules:
1. Read .proof/PLAN.md first. This is your ONLY specification.
2. For each AC, run the verification command yourself.
3. Do NOT read .proof/PROGRESS.md or .proof/EVIDENCE.md
(those are the builder's claims - you verify independently).
4. Write your verdict to .proof/VERDICT.md in this format:
# Verification Verdict
**Verifier:** [your session ID]
**Date:** YYYY-MM-DD HH:MM
**Plan hash:** [first 8 chars of md5 of PLAN.md]
## Results
### AC1: [name]
**Status:** PASS | FAIL
**Evidence:** [what you saw when you ran the check]
**Notes:** [any observations]
### AC2: [name]
...
## Summary
- Total: N criteria
- Passed: X
- Failed: Y
- **Overall:** PASS | FAIL
5. If any AC fails, also create .proof/PROBLEMS.md:
# Problems
### AC2: [name]
**Expected:** [from PLAN.md]
**Actual:** [what you found]
**Suggested fix:** [smallest change that would fix it]
**Affected files:** [list]
6. Do NOT fix anything. You are read-only. Report only.
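The Summary block of the verdict format can be cross-checked against the per-criterion statuses rather than trusted as written. The following is a minimal sketch, assuming VERDICT.md uses the `### ACn:` headings and `**Status:**` lines shown in the prompt; `summarize_verdict` is a hypothetical helper, and counting an AC with no filled-in status as a failure is an assumption of this sketch.

```python
import re


def summarize_verdict(verdict_text: str) -> dict:
    """Recompute total/passed/failed/overall from the per-AC Status lines."""
    statuses: dict[str, str] = {}
    current = None
    for line in verdict_text.splitlines():
        stripped = line.strip()
        heading = re.match(r"^###\s+(AC\d+):", stripped)
        if heading:
            current = heading.group(1)
            statuses[current] = "MISSING"  # until a Status line is seen
            continue
        # An unfilled template line ("PASS | FAIL") deliberately does not match.
        status = re.match(r"^\*\*Status:\*\*\s*(PASS|FAIL)$", stripped)
        if status and current is not None:
            statuses[current] = status.group(1)
    not_passed = [ac for ac, value in statuses.items() if value != "PASS"]
    return {
        "total": len(statuses),
        "passed": len(statuses) - len(not_passed),
        "failed": not_passed,
        # No criteria at all is treated as FAIL, not as a vacuous PASS.
        "overall": "PASS" if statuses and not not_passed else "FAIL",
    }
```

If the recomputed `overall` differs from the Overall line the verifier wrote, that mismatch is itself worth investigating before acting on the verdict.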
Option A: Subagent (same session)
Agent({
description: "Independent verification against plan",
prompt: "[verifier prompt above]",
mode: "plan" // read-only first
})
Option B: Fresh session (stronger isolation) Write handoff with instruction: "Start by reading .proof/PLAN.md and running verification."
Option C: Multiple verifiers (highest confidence) Spawn 2-3 verifiers independently. If they disagree on any AC, that AC needs investigation.
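For Option C, the disagreement check can be sketched as a comparison of per-AC statuses across verifiers. This assumes each verifier's VERDICT.md has already been reduced to a mapping of AC id to `PASS` or `FAIL`; `find_disagreements` is a hypothetical name, and treating an AC that a verifier skipped as `MISSING` is an assumption of this sketch.

```python
def find_disagreements(verdicts: list[dict[str, str]]) -> dict[str, list[str]]:
    """Return {ac_id: [status per verifier]} for ACs the verifiers do not agree on.

    `verdicts` holds one {ac_id: 'PASS' | 'FAIL'} mapping per verifier.
    """
    # Collect AC ids in first-seen order so the report follows the plan's order.
    ac_ids: list[str] = []
    for verdict in verdicts:
        for ac_id in verdict:
            if ac_id not in ac_ids:
                ac_ids.append(ac_id)

    disagreements: dict[str, list[str]] = {}
    for ac_id in ac_ids:
        # A verifier that did not report on an AC counts as its own answer.
        answers = [verdict.get(ac_id, "MISSING") for verdict in verdicts]
        if len(set(answers)) > 1:
            disagreements[ac_id] = answers
    return disagreements
```

Any AC that appears in the result is one to investigate by hand, as described above; unanimous FAILs are not disagreements and go to PROBLEMS.md as usual.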
If VERDICT.md shows any FAIL:
1. Builder reads PROBLEMS.md and makes minimal fixes
2. Builder updates EVIDENCE.md with new evidence for failed ACs
3. Back to Phase 3 (re-verify)
Typical: 1-2 fix rounds. If 3+ rounds on the same AC → the AC itself might be wrong. Revisit PLAN.md.
.proof/
PLAN.md # frozen acceptance criteria (Phase 1)
PROGRESS.md # builder's notes (Phase 2)
EVIDENCE.md # builder's evidence (Phase 2)
VERDICT.md # verifier's verdict (Phase 3)
PROBLEMS.md # verifier's findings (Phase 3, if failures)
| Symptom | Cause | Fix |
|---|---|---|
| Verifier passes everything | ACs too vague | Rewrite with specific commands |
| 3+ fix rounds on same AC | AC is wrong or untestable | Revisit PLAN.md |
| Verifier disagrees with builder's evidence | Different env or stale state | Both run from clean state |
| Builder keeps editing PLAN.md | Not frozen | Hash check catches this |
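The hash check in the last row can be sketched with the standard library. The verdict format records the first 8 characters of the md5 of PLAN.md; this sketch assumes the same value was also noted when the plan was frozen, so the two can be compared. `plan_hash` and `plan_is_unchanged` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def plan_hash(plan_path) -> str:
    """First 8 hex characters of the md5 of the plan file's raw bytes."""
    digest = hashlib.md5(Path(plan_path).read_bytes()).hexdigest()
    return digest[:8]


def plan_is_unchanged(plan_path, recorded_hash: str) -> bool:
    """True if the plan on disk still matches the hash recorded at freeze time."""
    return plan_hash(plan_path) == recorded_hash.strip().lower()
```

Here md5 serves only as a change detector for accidental or casual edits, not as a security measure. Hashing raw bytes means even a whitespace or line-ending change counts as an edit, which suits a plan that is meant to be frozen.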
npx claudepluginhub anastasiyaw/claude-code-config