From review-toolkit
PR-focused final review before merge. Strict PR diff scope via gh, business context analysis, dual-model review (Claude + Codex in parallel), mandatory evidence-based verification of every finding. Outputs MERGE or DO NOT MERGE verdict.
How this skill is triggered — by the user, by Claude, or both
Slash command
/review-toolkit:final-reviewThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
**CRITICAL**: This is a PR review, not a branch review. ALL code changes come from `gh pr diff`. Do NOT use `git diff` against branches.
CRITICAL: This is a PR review, not a branch review. ALL code changes come from gh pr diff. Do NOT use git diff against branches.
Parse $ARGUMENTS for:
gh pr view --json number --jq .number 2>/dev/null--ticket external ticket reference (URL or ID) for cross-referencing requirements beyond the PR descriptionIf no PR number provided and no PR exists for current branch, report error and stop.
Fetch all PR data upfront. Print the PR number and title so the user can verify.
# PR metadata (title, description, labels, linked issues, existing reviews)
gh pr view $PR_NUMBER --json title,body,labels,baseRefName,headRefName,author,url,comments,reviews,closingIssuesReferences
# PR diff — the ONLY source of code changes for this review
gh pr diff $PR_NUMBER
If --ticket is provided, fetch ticket content:
http): use gh issue view for GitHub URLs, WebFetch for external URLs (Jira, Linear, Notion)#123): gh issue view 123Before looking at ANY code, understand WHY this change exists:
--ticket provided, read ticket requirementsFormulate a one-sentence summary of the business problem this PR solves. This frames the entire review — every finding must be evaluated against this context.
Run Codex first (blocking), then Claude's own analysis. Each model analyzes the diff independently. Verification happens AFTER both complete (Step 4).
Check that codex CLI is available:
command -v codex && codex --version
If available, run codex review against the PR's base branch synchronously (NOT in background). Use a 10-minute timeout. The --base flag and [PROMPT] argument are mutually exclusive — do NOT pass both:
codex review --base $BASE_BRANCH --config 'sandbox_mode="danger-full-access"'
Where $BASE_BRANCH is the PR's base ref obtained from gh pr view in Step 1 (e.g., main). The sandbox is fully disabled so Codex can run build tools, tests, and access the network without restrictions.
CRITICAL: Do NOT use run_in_background. The Bash call MUST block until Codex completes. Save the full Codex output before proceeding.
If codex is not installed: warn the user ("codex CLI not found — running Claude-only review. Install it to enable dual-model review.") and continue with Claude analysis alone.
If the codex review command fails: warn and continue with Claude-only review.
After Codex completes (or if Codex is unavailable), perform Claude's independent analysis of the diff. At this stage, produce a RAW list of potential findings — do NOT verify them yet. Just identify concerns.
Get the full diff from gh pr diff. Build an exclude list based on detected project type:
For each changed file, when the diff alone is insufficient to understand correctness:
Examine the code thoroughly for:
CRITICAL: Do NOT proceed to this step until BOTH Claude's raw analysis AND Codex results are available. Since Codex runs synchronously, both should be available at this point.
Merge the raw finding lists from both models, then verify EVERY finding. An unverified claim is worse than no claim.
For each potential finding:
Evidence format for each finding:
**Evidence**: [What was checked, what was found, why this proves the issue]
Example of a GOOD finding:
cert-manager values.yaml:3 sets
enableGatewayAPI: trueunconditionally. Checked cert-manager source (link to docs/code): gateway-shim controller discovers CRDs only at startup. Checked PackageSource dependencies:cozystack.cert-managerhas nodependsOnforcozystack.gateway-api-crds. Therefore on fresh install, cert-manager starts before CRDs exist → gateway support silently non-functional.
Example of a BAD finding (would be rejected):
"enableGatewayAPI is set but Gateway API CRDs might not be installed yet" — no evidence of startup behavior, no dependency check, pure speculation.
After verification:
[Claude + Codex][Codex] tagEvery finding goes into exactly one of two categories. If it does not fit either — do not mention it.
The PR MUST NOT merge with any of these present:
Real problems worth tracking, but they do not block this merge:
No praise. No "great work." No "overall the code looks good." No filler. Findings only.
Start with:
## Verdict: MERGE
or
## Verdict: DO NOT MERGE
Then:
**Business context**: [one sentence — what problem this PR solves]
**PR**: [URL]
**Files changed**: [count] | **Additions**: +N | **Deletions**: -N
If none: ## Blockers\n\nNone.
For each blocker:
## Blockers
### B1: [concise title]
**File**: path/to/file.ext:LINE
**Issue**: [clear description of the bug/vulnerability/problem]
**Impact**: [what happens if this ships as-is]
**Fix**: [specific, actionable suggestion]
If none: ## Action Items\n\nNone.
For each action item:
## Action Items
### A1: [concise title]
**File**: path/to/file.ext:LINE
**Issue**: [description of the problem]
**Ticket suggestion**: [one-line title for a follow-up ticket]
## Summary
[2-3 sentences: scope of what was reviewed, verdict reasoning]
[If dual-model review: note which findings were confirmed by both models, which were model-specific]
[If Codex was unavailable: note "Claude-only review — codex CLI not available"]
<if --ticket> Add after Summary:
## Ticket Compliance
**Ticket**: [title + link/ID]
- [ ] Requirement 1 — done / not done / partially done
- [ ] Requirement 2 — done / not done / partially done
- ...
**Verdict**: All requirements met / Missing: [list]
Missing ticket requirements are blockers (DO NOT MERGE).
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub lexfrei/ccc --plugin review-toolkit