Skill

review-loop

This skill should be used after any review (plan review, implementation review, code review) to apply quality threshold gating. It triggers when findings exceed pass thresholds, when the user says "apply review gate", "re-review after fixes", "check review thresholds", "iterate on review findings", or when a review produces critical or high-severity findings. Enforces mandatory re-review with iteration tracking and escalation.

Invocation

Slash command: /ai-quality-guardrails:review-loop

User invocable · Model invocable · Inline context · Default effort


Supporting Files

agents/openai.yaml

SKILL.md

144 lines · ~1.7k tokens

Stats

- Language: JavaScript
- Parent stars: 0
- Maintenance: Excellent
- Last commit: Apr 2, 2026

Prerequisites

- **No runtime dependencies**: this is a pure instruction/skill package (Markdown + YAML frontmatter)
- Works with any AI coding assistant that supports skill/instruction injection
- For maximum effectiveness, pair with a test runner available in the project (jest, vitest, pytest, go test, etc.)

Review Loop with Quality Threshold

Apply iterative review gating to catch issues that single-pass reviews miss.

The Problem

Single-pass reviews consistently miss issues that multiple review passes with quality thresholds catch. More than 3 high findings (the mandatory re-review threshold in the quality gate) indicates systemic issues rather than isolated oversights.

See the plugin's docs/RESEARCH.md for sourced statistics.

Severity Definitions

All gating depends on consistent severity classification (aligned with the unified severity scale used across spec-workshop and ai-quality-guardrails):

| Severity | Criteria | Examples |
| --- | --- | --- |
| CRITICAL | Security breach, data loss, corruption, production outage risk | SQL injection, auth bypass, data deletion without confirmation |
| HIGH | Functional breakage, clear regression, or blocking defect | Wrong return value, missing error handling on critical path, broken API contract |
| MEDIUM | Reliability, performance, or maintainability risk likely to cause future defects | N+1 query, missing index, unclear error message, brittle test |
| LOW | Minor quality or documentation inconsistency with limited impact | Naming convention violation, missing comment, unused import |
| PASS | Verified correct or adequately handled area | (none) |

When in doubt between two severities, choose the higher one if the issue can block safe delivery.
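The severity scale and the tie-break rule above can be sketched as an ordered enum. This is only an illustration: the names `Severity`, `resolve_severity`, and `count_by_severity` are hypothetical, and the source does not say what to pick when the issue cannot block safe delivery (the lower severity is assumed here).

```python
from collections import Counter
from enum import IntEnum


class Severity(IntEnum):
    """Severity scale from the table above; a higher value is more severe."""
    PASS = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def resolve_severity(a: Severity, b: Severity, blocks_safe_delivery: bool) -> Severity:
    """Tie-break between two candidate severities.

    Picks the higher one when the issue can block safe delivery.
    Falling back to the lower one otherwise is an assumption.
    """
    return max(a, b) if blocks_safe_delivery else min(a, b)


def count_by_severity(severities: list[Severity]) -> dict[str, int]:
    """Tally findings per severity, reporting zero for unused levels."""
    tally = Counter(severities)
    return {
        level.name.lower(): tally.get(level, 0)
        for level in Severity
        if level is not Severity.PASS  # PASS marks verified areas, not findings
    }
```

The resulting counts are the input the quality gate needs (`critical` and `high` in particular).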

Pre-Gate Check: TDD Compliance (implementation and code reviews only)

For implementation reviews and code reviews of standard or higher complexity, check TDD compliance before applying thresholds. Missing TDD evidence is a mandatory block regardless of finding count; classify it as a HIGH-severity finding.

This check does not apply to plan reviews — at plan stage there is no code or tests yet, only test scenario definitions.
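One way to picture the pre-gate check is as a small function that runs before any counting. The function name, the review-type strings, and the shape of the returned finding are all hypothetical; how "TDD evidence" and "standard or higher complexity" are determined is left to the caller.

```python
from typing import Optional

# Plan reviews are exempt: there is no code or tests at plan stage.
TDD_REVIEW_TYPES = {"implementation", "code"}


def tdd_pre_gate(review_type: str, standard_or_higher: bool,
                 has_tdd_evidence: bool) -> Optional[dict]:
    """Return a blocking finding when TDD evidence is missing, else None."""
    if review_type not in TDD_REVIEW_TYPES or not standard_or_higher:
        return None  # check does not apply
    if has_tdd_evidence:
        return None  # compliant, continue to the threshold gate
    return {
        "severity": "high",
        "blocking": True,  # mandatory block regardless of finding count
        "description": "Missing TDD evidence",
    }
```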

Review Quality Gate

After any review (plan, implementation, or code review):

1. Count findings by severity.
2. Apply the threshold:

| Condition | Action |
| --- | --- |
| critical > 0 OR high > 3 | MANDATORY re-review after fixes (max 3 iterations) |
| high 1-3 | RECOMMENDED re-review: present to user and wait for explicit decision. Default: re-review unless user explicitly waives. |
| only medium/low | Single pass sufficient |

3. Track the iteration in the review artifact: `remediation_iteration: N`
4. After 3 iterations without resolution → escalate to user for scope decision
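The threshold table can be read as a small decision function. The sketch below is one interpretation, not the plugin's implementation: `GateAction` and `apply_gate` are hypothetical names, `remediation_iteration` is assumed to count completed fix and re-review rounds, and escalation is assumed to apply only while the mandatory condition still holds.

```python
from enum import Enum

MAX_ITERATIONS = 3  # iteration cap stated in the threshold table


class GateAction(Enum):
    MANDATORY_RE_REVIEW = "mandatory_re_review"
    RECOMMENDED_RE_REVIEW = "recommended_re_review"
    SINGLE_PASS = "single_pass"
    ESCALATE = "escalate"


def apply_gate(critical: int, high: int, remediation_iteration: int = 0) -> GateAction:
    """Map finding counts to the gate action from the threshold table."""
    mandatory = critical > 0 or high > 3
    if mandatory and remediation_iteration >= MAX_ITERATIONS:
        # Cap reached without resolution: hand the scope decision to the user.
        return GateAction.ESCALATE
    if mandatory:
        return GateAction.MANDATORY_RE_REVIEW
    if high >= 1:
        # 1-3 high findings: the user decides, with re-review as the default.
        return GateAction.RECOMMENDED_RE_REVIEW
    return GateAction.SINGLE_PASS
```

Note the boundary: exactly 3 high findings with no critical findings falls in the recommended band, while 4 makes re-review mandatory.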

Review Type Actions

What "fix and re-review" means depends on the review type:

| Review Type | Fix Action | Re-Review Scope |
| --- | --- | --- |
| Plan review | Revise plan sections with findings | Re-review revised sections + check for new gaps |
| Implementation review | Fix code, add tests, update docs | Verify fixes for previous findings + scan for regressions introduced by fixes |
| Code review | Fix code issues | Verify fixes + scan for regressions |

Re-reviews are NOT full re-scans — they verify fixes and check for regressions introduced by those fixes.

Verdict Definitions

| Verdict | When to Use |
| --- | --- |
| `approved` | Zero critical, zero high findings |
| `approved_with_conditions` | Zero critical, 1-3 high findings with documented mitigation or user waiver |
| `blocked` | Any critical finding, OR high > 3, OR mandatory conditions unmet |
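The verdict rules can be sketched the same way. The function and parameter names below are hypothetical. The table does not state a verdict for 1-3 high findings that have neither a documented mitigation nor a waiver; treating that case as `blocked` (an unmet condition) is an assumption.

```python
def decide_verdict(critical: int, high: int,
                   highs_mitigated_or_waived: bool = False,
                   mandatory_conditions_met: bool = True) -> str:
    """Return the verdict string for a review, following the table above."""
    if critical > 0 or high > 3 or not mandatory_conditions_met:
        return "blocked"
    if high == 0:
        return "approved"
    # 1-3 high findings: conditional approval needs mitigation or a waiver.
    # Assumption: without either, the review stays blocked.
    return "approved_with_conditions" if highs_mitigated_or_waived else "blocked"
```

Here `mandatory_conditions_met` would be false when, for example, the TDD pre-gate check failed.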

How It Works

Review produces findings
  ├─ critical > 0 OR high > 3
  │   └─ Fix findings → Re-review (iteration N+1) → Apply gate again
  ├─ high 1-3
  │   └─ Recommend re-review → Present to user → User decides (default: re-review)
  └─ medium/low only
      └─ Single pass sufficient → Proceed

Review Artifact Format

Produce the following artifact for each review iteration:

## Review Report (Iteration N)

### Findings
- [severity] Finding description — file:line — suggested fix

### Summary
- Total: X findings (critical: N, high: N, medium: N, low: N)
- Previous iteration: Y findings
- Delta: -Z findings resolved, +W new findings

### Verdict
- [approved | approved_with_conditions | blocked]
- remediation_iteration: N
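For tooling that generates this artifact, a rendering helper might look like the following sketch. `Finding` and `render_report` are hypothetical names, and omitting the "Previous iteration" and "Delta" lines on the first pass is an assumption, since the template does not cover that case.

```python
from dataclasses import dataclass
from typing import Optional

SEP = " \u2014 "  # the dash separator used in the template's finding lines
LEVELS = ("critical", "high", "medium", "low")


@dataclass
class Finding:
    severity: str       # one of LEVELS
    description: str
    location: str       # "file:line"
    suggested_fix: str


def render_report(iteration: int, findings: list[Finding], verdict: str,
                  previous_total: Optional[int] = None,
                  resolved: int = 0, new: int = 0) -> str:
    """Render one review iteration in the artifact layout shown above."""
    counts = {level: sum(f.severity == level for f in findings) for level in LEVELS}
    breakdown = ", ".join(f"{level}: {counts[level]}" for level in LEVELS)
    lines = [f"## Review Report (Iteration {iteration})", "", "### Findings"]
    lines += [f"- [{f.severity}] {f.description}{SEP}{f.location}{SEP}{f.suggested_fix}"
              for f in findings]
    lines += ["", "### Summary", f"- Total: {len(findings)} findings ({breakdown})"]
    if previous_total is not None:  # assumption: no delta lines on the first pass
        lines += [f"- Previous iteration: {previous_total} findings",
                  f"- Delta: -{resolved} findings resolved, +{new} new findings"]
    lines += ["", "### Verdict", f"- {verdict}", f"- remediation_iteration: {iteration}"]
    return "\n".join(lines)
```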

Escalation Protocol

When reaching 3 iterations without resolution:

1. Present remaining findings to user with severity
2. Offer options:
   a. Continue fixing (extend iteration limit)
   b. Accept remaining risk with documented rationale
   c. Reduce scope to eliminate problematic area
3. Document decision in review artifact
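Putting the gate and the escalation protocol together, the overall loop might be driven as in the sketch below. Everything here is illustrative: `run_loop`, `Escalation`, and the three callbacks are hypothetical, iteration 0 is assumed to be the initial review, and "extend iteration limit" is assumed to mean three more rounds. The recommended band (1-3 high findings) is treated as clearing the mandatory gate, leaving that decision to the user outside this loop.

```python
from enum import Enum
from typing import Callable

MAX_ITERATIONS = 3


class Escalation(Enum):
    CONTINUE_FIXING = "a"  # extend the iteration limit
    ACCEPT_RISK = "b"      # accept remaining risk with documented rationale
    REDUCE_SCOPE = "c"     # reduce scope to eliminate the problematic area


def run_loop(review: Callable[[int], tuple[int, int]],
             fix: Callable[[], None],
             ask_user: Callable[[int, int], Escalation]) -> dict:
    """Re-review until the mandatory gate clears or the user ends the loop.

    review(iteration) returns (critical, high) counts, fix() applies fixes,
    and ask_user(critical, high) returns the user's escalation choice.
    """
    limit = MAX_ITERATIONS
    iteration = 0
    decisions = []  # escalation decisions, to be documented in the artifact
    while True:
        critical, high = review(iteration)
        if critical == 0 and high <= 3:
            return {"remediation_iteration": iteration,
                    "outcome": "gate_cleared", "decisions": decisions}
        if iteration >= limit:
            choice = ask_user(critical, high)
            decisions.append(choice)
            if choice is not Escalation.CONTINUE_FIXING:
                return {"remediation_iteration": iteration,
                        "outcome": choice.name.lower(), "decisions": decisions}
            limit += MAX_ITERATIONS  # assumption: extend by another 3 rounds
        fix()
        iteration += 1
```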

Capability Warrant Compliance

When a capability warrant block is present in the session context, add these finding categories to the review:

| Finding | Severity | Condition |
| --- | --- | --- |
| Warrant violation | HIGH (blocking) | Agent relied on a capability not covered by its warrant, or used a capability whose warrant item has `verification_state: stale` for a `required` operation |
| Warrant policy breach | CRITICAL (blocking) | Agent used a capability with `policy: prohibit` in the warrant |
| Warrant degradation | MEDIUM | Agent used a capability with `verification_state: declared` for a core operation without noting the fallback |

Warrant violations count toward the quality gate threshold alongside other findings.

When no warrant block is present: use existing behavior unchanged.
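As a rough sketch of how the three warrant findings could be classified, assume each warrant item is a dictionary carrying the `policy` and `verification_state` fields named in the table. The function name, its parameters, and the dictionary representation are hypothetical; the actual warrant block format is not described here.

```python
from typing import Optional


def classify_warrant_use(item: Optional[dict],
                         operation_required: bool = False,
                         core_operation: bool = False,
                         fallback_noted: bool = False) -> Optional[tuple[str, str]]:
    """Return (finding, severity) for one capability use, or None if compliant.

    item is the warrant entry for the capability, or None when the
    capability is not covered by the warrant at all.
    """
    if item is None:
        return ("Warrant violation", "HIGH")
    if item.get("policy") == "prohibit":
        return ("Warrant policy breach", "CRITICAL")
    state = item.get("verification_state")
    if state == "stale" and operation_required:
        return ("Warrant violation", "HIGH")
    if state == "declared" and core_operation and not fallback_noted:
        return ("Warrant degradation", "MEDIUM")
    return None
```

Findings returned this way would be counted with the other findings when the quality gate is applied.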

Related Skills

- parallel-review: calls this skill's quality gate after aggregating parallel findings
- tdd-enforcement: TDD compliance is a pre-gate check
- ai-code-scrutiny: provides the checklist that generates findings
- plan-with-ac: upstream; plan reviews are one of the three supported review types, and the fix action is to revise plan sections
- self-review-before-done: shares the iteration-limit pattern (3 internal attempts). The self-review limit is separate from this external review-loop limit; tune both if adjusting retry behavior.
