From harness-engineering-skills
Runs cross-LLM iterative code reviews with Codex or Gemini CLI peers, applying accepted fixes until consensus on improved code and report.
How this skill is triggered — by the user, by Claude, or both
Slash command
/harness-engineering-skills:review-loopThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Spawns a peer reviewer (Codex or Gemini) to independently review your code changes.
Spawns a peer reviewer (Codex or Gemini) to independently review your code changes. Claude evaluates findings, implements accepted fixes, and re-submits for peer re-review. Iterates until both agents agree on the final code state.
Key: You (the human) do NOT need to participate. Watch progress via .review-loop/<session>/rounds.json and summary.md.
git CLIcodex CLI (codex --version) or gemini CLI (gemini --version)gh CLI (for PR scope detection)| Setting | Default | Options |
|---|---|---|
peer_reviewer | codex | codex, gemini |
max_rounds | 5 | 1–10 |
timeout_per_round | 600 | seconds |
scope_preference | auto | auto, diff, branch, pr |
The peer reviewer always runs with full local repository access and the loop always implements accepted fixes. There is no read-only mode.
Create .review-loop/config.json in the project root to override defaults:
{
"peer_reviewer": "gemini",
"max_rounds": 8
}
User can specify at invocation time: "review loop with gemini, max 3 rounds". Invocation overrides take highest precedence.
IMPORTANT: Run preflight.sh in a SINGLE bash call. This eliminates ~15 sequential tool calls.
# Locate the skill's scripts directory (works for plugin installs and local clones)
SKILL_DIR=""
for candidate in \
"$(find ~/.claude/plugins/cache -path "*/review-loop/scripts/preflight.sh" 2>/dev/null | head -1 | xargs dirname 2>/dev/null | xargs dirname 2>/dev/null)" \
"$(find ~/.claude/skills -path "*/review-loop/scripts/preflight.sh" 2>/dev/null | head -1 | xargs dirname 2>/dev/null | xargs dirname 2>/dev/null)"; do
[[ -d "$candidate" ]] && SKILL_DIR="$candidate" && break
done
if [[ -z "$SKILL_DIR" ]]; then
echo "Error: review-loop skill directory not found. Ensure the plugin is installed." >&2
exit 1
fi
PREFLIGHT_OUTPUT="$($SKILL_DIR/scripts/preflight.sh \
--peer {peer_reviewer} \
--max-rounds {max_rounds} \
--timeout {timeout_per_round} \
--scope {scope_preference})"
# For a specific commit: add --commit-sha <SHA>
Pass user invocation-time overrides as CLI args — they take highest precedence.
Precedence: built-in defaults < .review-loop/config.json < CLI args
preflight.sh does ALL of the following in one shot:
.review-loop/config.json (merges over defaults)--commit-sha for a specific commit.review-loop/ to .gitignorerounds.jsonThe script outputs key-value pairs. Extract:
SESSION_DIR, SESSION_ID, PEER, SCOPE, BASE_BRANCH, REPO_ROOT, etc.TARGET_FILES_B64_START...TARGET_FILES_B64_END — base64-encoded newline-separated file listPROJECT_B64_START...PROJECT_B64_END — base64-encoded project contextDecode with: echo "$TARGET_FILES_B64" | base64 --decode
If the script exits non-zero, stop and report the error.
Print to user:
Review Loop starting: {scope} ({detail}) → peer: {peer}, max: {max_rounds} rounds
Use Template 1 from prompt-templates.md as a stable prompt contract. Do NOT rewrite the prompt body each run. Only fill the small runtime fields:
repo_rootscope_type / scope_detailtarget_filesproject_descriptionproject_context snippet when neededDo NOT embed the full diff. Do NOT paste large sections of CLAUDE.md or README into the prompt. Round 1 should be a static template plus a lightweight runtime brief so prompt assembly stays cheap and consistent.
Write the completed prompt to:
PROMPT_FILE="$SESSION_DIR/peer-output/round-1-prompt.md"
Determine the path to peer-invoke.sh. It is located relative to the skill's installed directory, NOT the project being reviewed:
# The skill's base directory is provided by Claude Code at invocation time.
# Look for it in the plugin cache or fall back to common install paths.
PEER_SCRIPT=""
for candidate in \
"$SKILL_BASE_DIR/scripts/peer-invoke.sh" \
"$(find ~/.claude/plugins/cache -path "*/review-loop/scripts/peer-invoke.sh" 2>/dev/null | head -1)" \
"$(find ~/.claude/skills -path "*/review-loop/scripts/peer-invoke.sh" 2>/dev/null | head -1)"; do
[[ -x "$candidate" ]] && PEER_SCRIPT="$candidate" && break
done
if [[ -z "$PEER_SCRIPT" ]]; then
echo "Error: peer-invoke.sh not found. Ensure the review-loop skill is properly installed." >&2
exit 1
fi
Note:
$SKILL_BASE_DIRis set by Claude Code from the skill's metadata. The fallback searches the plugin cache and skills directories.
peer-invoke.sh runs Codex in the current repository directory with full access so it can read local files directly. For Codex, it also launches against an isolated temporary CODEX_HOME with no MCP servers, strips inherited CODEX_API_KEY by default, and records the peer session id for reuse in later rounds.
Invoke:
$PEER_SCRIPT \
--peer {peer_reviewer} \
--prompt-file "$PROMPT_FILE" \
--output-file "$SESSION_DIR/peer-output/round-1-raw.txt" \
--session-id-file "$SESSION_DIR/peer-output/peer-session-id.txt" \
--timeout {timeout_per_round}
Read round-1-raw.txt. Parse for:
FINDING: fN blocks → extract into structured findingsNO_FINDINGS: → immediate consensus (skip to Phase 3)Add Round 1 data with all peer_findings. Set claude_actions to empty (not yet evaluated).
For each round N (starting from Round 1's findings):
For each peer finding, apply the evaluation criteria from synthesis-protocol.md:
Record each decision with reasoning in claude_actions.
For each accepted finding:
claude_actions[].code_changesgit add -A && git commit -m "review-loop: changes from round {N}" --allow-empty
The --allow-empty flag ensures rounds where Claude only rejects findings (no code changes) don't fail.
Update the current round's claude_actions with all decisions and changes.
Check if all findings are resolved:
CONSENSUS: → go to Phase 3If round >= max_rounds:
escalatedmax_roundsUse Template 2 from prompt-templates.md:
git diff --name-only HEAD~1 HEADDo NOT paste the diff body into the prompt. Re-review should happen against the current local repository state.
Write to $SESSION_DIR/peer-output/round-{N+1}-prompt.md.
if [[ -f "$SESSION_DIR/peer-output/peer-session-id.txt" ]]; then
PEER_RESUME_ARGS=(--resume-session "$(cat "$SESSION_DIR/peer-output/peer-session-id.txt")")
else
PEER_RESUME_ARGS=()
fi
$PEER_SCRIPT \
--peer {peer_reviewer} \
"${PEER_RESUME_ARGS[@]}" \
--prompt-file "$SESSION_DIR/peer-output/round-{N+1}-prompt.md" \
--output-file "$SESSION_DIR/peer-output/round-{N+1}-raw.txt" \
--session-id-file "$SESSION_DIR/peer-output/peer-session-id.txt" \
--timeout {timeout_per_round}
Reuse the same Codex session for re-review rounds when available. This avoids repeated cold starts, preserves the peer's review context, and materially reduces round-trip latency. Do NOT reuse that session for the final approval pass in Phase 3.
Look for:
CONSENSUS: → all resolved, go to Phase 3ACCEPTED_REJECTION: fN → finding resolved, mark in rounds.jsonINSIST: fN → peer insists, Claude must re-evaluateFINDING: fN → new issues found in Claude's changesFor each INSIST:
Then loop back to Step 2.1 with the updated findings list.
Use Template 3 from prompt-templates.md. Keep the template body fixed and fill only:
repo_rootfinal_target_filesresolution_table_rowsThis is the quality gate. Always run it in a fresh peer session, even if earlier re-review rounds reused the same Codex session.
Write to $SESSION_DIR/peer-output/final-consensus-prompt.md.
$PEER_SCRIPT \
--peer {peer_reviewer} \
--prompt-file "$SESSION_DIR/peer-output/final-consensus-prompt.md" \
--output-file "$SESSION_DIR/peer-output/final-consensus-raw.txt" \
--session-id-file "$SESSION_DIR/peer-output/final-peer-session-id.txt" \
--timeout {timeout_per_round}
Important:
--resume-sessionLook for:
CONSENSUS: → final approval confirmed, continue to report generationFINDING: fN blocks → treat them as real blocking findingsIf the fresh final pass reports new findings:
max_rounds, append the findings as a new round and return to Phase 2.1max_rounds, mark them as escalated and continue with status max_roundsUpdate session metadata:
completed_at: current ISO timestampstatus: consensus or max_roundstotal_rounds: actual countsummary: compute totals from all roundsWrite $SESSION_DIR/summary.md in this format:
# Review Loop Summary
**Session**: {session_id}
**Peer**: {peer_reviewer} CLI
**Scope**: {scope} ({scope_detail})
**Rounds**: {total_rounds} | **Status**: {status_emoji} {status}
## Changes Made
{for each modified file: bullet with file path and description of change}
## Findings Resolution
| # | Finding | Severity | Action | Resolution |
|---|---------|----------|--------|------------|
{for each finding: row with id, title, severity, accept/reject, final status}
## Consensus
{if consensus: "Both Claude Code and {peer} agree the code is in good shape after {N} rounds."}
{if max_rounds: "Review stopped after {N} rounds. {M} items remain unresolved."}
{if escalated items exist:}
## Escalated Items (Needs Human Decision)
{for each escalated finding: peer's argument, Claude's argument, recommendation}
Print a concise summary to the user:
Review Loop complete.
Session: {session_id}
Peer: {peer_reviewer}
Rounds: {total_rounds}
Status: {status}
Accepted: {N} findings fixed
Rejected: {N} (resolved by peer)
Escalated: {N} (needs human decision)
Details: .review-loop/{session_id}/summary.md
ln -sfn "{session_id}" .review-loop/latest
| Error | Action |
|---|---|
| Peer CLI not found | Inform user, suggest installation command, offer alternative peer |
| Peer times out (exit 124) | Log timeout, mark round as failed, ask user whether to retry or stop |
| Peer output unparseable | Save raw output, inform user, attempt to extract any findings manually |
| Max rounds reached | Stop loop, generate report with max_rounds status, list unresolved items |
| Git operations fail | Stop loop, inform user, preserve current state |
| User cancels | Mark session as aborted, generate partial report |
User: "review loop"
→ Detects local diff (5 files changed)
→ Logs startup config and proceeds
→ Round 1: Codex finds 3 issues
→ Claude accepts 2, rejects 1
→ Claude fixes accepted issues, commits
→ Round 2: Codex reviews fixes, accepts rejection reasoning
→ CONSENSUS after 2 rounds
User: "review loop with gemini for PR 42"
→ Scope: PR #42
→ Peer: Gemini (override)
→ Round 1: Gemini finds 5 issues
→ Claude accepts 4, rejects 1
→ Round 2: Gemini insists on rejected finding
→ Claude re-evaluates, accepts
→ Round 3: Gemini confirms all fixes
→ CONSENSUS after 3 rounds
User: "review loop, max 3 rounds"
→ Round 1: Peer finds 8 issues
→ Round 2: 5 resolved, 3 debated
→ Round 3: 2 more resolved, 1 still debated
→ Status: max_rounds, 1 escalated finding
→ Summary shows escalated item for human decision
npx claudepluginhub stone16/harness-engineering-skills --plugin harness-engineering-skillsInvokes a different AI model for independent code review, catching bugs that self-review misses. Useful for PR review or second opinions on code.
Orchestrates multi-agent code review with Codex CLI, Gemini CLI, and five Claude specialist subagents (security, performance, logic, regression, robustness) then synthesizes findings into verified fixes. Use for deep reviews, second opinions, or council reviews on PRs, commits, or branches.
Runs a structured code review using Codex, Claude, or other engines as a closeout check before commit or ship.