From codex-plan-reviewer
Two-pass adversarial review of design documents and implementation plans using OpenAI Codex CLI. Invokes Codex to review plans section-by-section (pass 1), then holistically (pass 2), feeding critique back for revision. Use when you have a design doc, architecture plan, or implementation plan that should be stress-tested before execution.
Slash command
/codex-plan-reviewer:codex-review
Use this skill to get an independent adversarial review of design documents and implementation plans by invoking OpenAI Codex CLI as a reviewer. The review happens in two passes:
- Pass 1: Codex reviews the plan section by section.
- Pass 2: Codex reviews the full, revised plan holistically.
After each pass, Claude integrates the feedback and revises the plan before proceeding.
- codex CLI installed and authenticated (npm i -g @openai/codex)
- OPENAI_API_KEY / CODEX_API_KEY set
All scripts are Python for cross-platform compatibility (Windows, macOS, Linux).
Before invoking any review, ensure:
- codex is available: which codex
Split the document into reviewable sections. Use the scripts/extract_sections.py script:
python3 /path/to/codex-review/scripts/extract_sections.py <plan-file>
This creates a .codex-review/sections/ directory with individual section files. Each file contains the section content plus minimal surrounding context (the document title and table of contents if present) so Codex can understand where the section fits.
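The splitting step can be pictured with a minimal sketch. This is not the actual extract_sections.py. It assumes the plan is markdown with one `#` title and `##` section headings, and the function names and the `NN-slug.md` file naming are hypothetical choices for illustration.

```python
import re
from pathlib import Path

HEADING = re.compile(r"^(#{1,2})\s+(.*)$")


def split_sections(markdown_text):
    """Return (title, [(heading, body), ...]), splitting on level-2 headings."""
    title = ""
    sections = []
    heading, lines = None, []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # do not treat '#' inside code fences as headings
        match = None if in_fence else HEADING.match(line)
        if match and match.group(1) == "#" and not title:
            title = match.group(2).strip()
            continue
        if match and match.group(1) == "##":
            if heading is not None:
                sections.append((heading, "\n".join(lines).strip()))
            heading, lines = match.group(2).strip(), []
            continue
        if heading is not None:
            lines.append(line)
    if heading is not None:
        sections.append((heading, "\n".join(lines).strip()))
    return title, sections


def write_sections(markdown_text, out_dir):
    """Write one file per section, prefixed with the document title for context."""
    title, sections = split_sections(markdown_text)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, (heading, body) in enumerate(sections, start=1):
        slug = re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")
        path = out_dir / f"{index:02d}-{slug}.md"
        path.write_text(f"# {title}\n\n## {heading}\n\n{body}\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Repeating the title in every file is a cheap way to give the reviewer the surrounding context the paragraph above describes. The sketch leaves out the table of contents.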
For each section, invoke Codex with the section review prompt:
python3 /path/to/codex-review/scripts/review_section.py <section-file> [review-type]
Where review-type is one of:
- architecture (default): focuses on structural soundness, component boundaries, data flow
- implementation: focuses on feasibility, ordering, dependencies, edge cases
- api: focuses on interface contracts, backwards compatibility, error handling
- data: focuses on data models, migrations, consistency, performance
Each section review produces a structured feedback file in .codex-review/feedback/pass1/.
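To run pass 1 over every extracted section, a small driver can build one review_section.py command per file. The sketch below is an illustration, not part of the skill: `build_review_commands` is a hypothetical helper, and the assumption that section files end in `.md` comes only from the example file names used in this document.

```python
import sys
from pathlib import Path

# The four review types listed above.
REVIEW_TYPES = ("architecture", "implementation", "api", "data")


def build_review_commands(sections_dir, script_path, review_type="architecture"):
    """Return one argv list per section file, in sorted (document) order."""
    if review_type not in REVIEW_TYPES:
        raise ValueError(f"unknown review type: {review_type!r}")
    commands = []
    for section in sorted(Path(sections_dir).glob("*.md")):
        commands.append([sys.executable, str(script_path), str(section), review_type])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Building the commands separately from running them makes the driver easy to dry-run.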
Review the pass 1 feedback with the user. Present a summary of findings per section, categorized by severity:
Revise the plan based on pass 1 feedback before proceeding to pass 2. This is important — pass 2 should review the improved plan, not the original.
After revisions, invoke the holistic review on the full (revised) document:
python3 /path/to/codex-review/scripts/review_holistic.py <plan-file> <pass1-feedback-dir>
This pass specifically looks for:
The holistic review also receives the pass 1 feedback so it can verify that earlier issues were actually addressed.
Output goes to .codex-review/feedback/pass2/holistic-review.md.
Present the pass 2 findings to the user. Apply final revisions. The complete review trail is preserved in .codex-review/ for reference.
The skill respects these environment variables:
| Variable | Default | Description |
|---|---|---|
| CODEX_REVIEW_MODEL | (codex default) | Override the Codex model for reviews |
| CODEX_REVIEW_TIMEOUT | 120 | Timeout in seconds per review invocation |
| CODEX_REVIEW_VERBOSE | 0 | Set to 1 to show Codex stderr output |
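As an illustration of how a script might read these variables with the defaults from the table, here is a minimal sketch. `ReviewConfig` and `load_config` are hypothetical names, not taken from the skill's scripts.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReviewConfig:
    model: Optional[str]   # None means "let Codex pick its default model"
    timeout_seconds: int
    verbose: bool


def load_config(env=None):
    """Read the CODEX_REVIEW_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    model = env.get("CODEX_REVIEW_MODEL") or None
    timeout_seconds = int(env.get("CODEX_REVIEW_TIMEOUT", "120"))
    verbose = env.get("CODEX_REVIEW_VERBOSE", "0") == "1"
    return ReviewConfig(model, timeout_seconds, verbose)
```

Accepting an optional mapping instead of always reading `os.environ` keeps the function easy to test.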
- The implementation review type is best for plans that will be fed to execute-plan
- Findings marked [ACKNOWLEDGED] are visible to the holistic pass, which won't re-flag them
- Keep the .codex-review/ directory around: it's useful for understanding later why decisions were made
If pass 1 reveals major issues in a specific section, use the iteration script to do focused multi-round review using Codex session resume:
python3 /path/to/codex-review/scripts/iterate_section.py <section-file> <revised-section-file> [review-type] [max-rounds]
How it works:
- Later rounds call codex exec resume --last, passing the revised content. Codex evaluates whether its previous concerns were addressed, marks findings as RESOLVED or UNRESOLVED, and flags any new issues introduced by the revision.
- Rounds continue until max-rounds (default 3) is reached.
Interactive mode (default): The script pauses between rounds and waits for you to press ENTER once you have edited the revised file. Press q to stop early.
Non-interactive mode (--no-interactive): Runs all rounds without pausing. Useful when Claude Code manages the edit-review loop externally.
Single-round mode (--round N): Runs only round N. This is the best option when Claude Code drives the loop — it can run round 1, read feedback, revise the file, then run round 2, etc.
Example Claude Code workflow:
# Round 1
python3 scripts/iterate_section.py sections/03-data-model.md revised.md data --round 1
# Claude reads feedback, revises revised.md
# Round 2
python3 scripts/iterate_section.py sections/03-data-model.md revised.md data --round 2
Convergence detection: The script tracks issue counts across rounds. If issues stop decreasing after 3+ rounds, it warns that manual review may be needed.
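One plausible reading of this check, sketched under assumptions: the script keeps a list of per-round issue counts and warns once at least three rounds have run and the latest count is no lower than the previous one. The function name and the exact comparison are hypothetical. The real script may use a different rule.

```python
def has_stalled(issue_counts, min_rounds=3):
    """Return True when enough rounds have run and the issue count stopped dropping."""
    if len(issue_counts) < min_rounds:
        return False
    # Stalled: the most recent round did not reduce the number of open issues.
    return issue_counts[-1] >= issue_counts[-2]
```

For example, counts of 5, 3, 3 over three rounds would trigger the warning, while 5, 3, 2 would not.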
Fallback: If codex exec resume --last fails (e.g., session expired), the script falls back to a fresh codex exec with the revision prompt. This loses conversational context but still gets a review.
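The fallback can be sketched as a try-then-retry wrapper. This is an assumption-laden illustration rather than the script's code: `run_revision_round` is a hypothetical name, and passing the prompt as a positional argument to both commands is an assumption about how the script calls Codex.

```python
import subprocess


def run_revision_round(prompt, timeout=120, run=subprocess.run):
    """Try to resume the last Codex session; fall back to a fresh one on failure.

    Returns (stdout, resumed), where resumed is False if the fallback was used.
    """
    resume_cmd = ["codex", "exec", "resume", "--last", prompt]  # assumed argument layout
    fresh_cmd = ["codex", "exec", prompt]
    try:
        result = run(resume_cmd, capture_output=True, text=True, timeout=timeout)
        if result.returncode == 0:
            return result.stdout, True
    except (subprocess.TimeoutExpired, OSError):
        pass  # e.g. the resume timed out or the binary could not be started
    # Fresh session: the earlier conversation is lost, but a review is still produced.
    result = run(fresh_cmd, capture_output=True, text=True, timeout=timeout, check=True)
    return result.stdout, False
```

The `run` parameter exists so the logic can be exercised with a stub instead of a real Codex installation.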
All iteration feedback is preserved in .codex-review/feedback/iterations/<section-name>/ with a summary file.
For plans with mixed quality across sections:
- Run iterate_section.py on critical sections until approved
This avoids burning a holistic review on a document with known local problems.
npx claudepluginhub kroepke/claude-marketplace --plugin codex-plan-reviewer