By usetheodev
Autonomous + rigorous Codex-as-judge layer for the plan cycle ecosystem (DISCOVER, PLAN, IMPLEMENT, CODE-QUALITY, REVIEW). Breaks the Claude-only monoculture by adding GPT-Codex as an orthogonal LLM jury that re-validates each cycle artifact against its golden-rule contract.
Run Codex as orthogonal judge on a `/to-plan` plan
Run all 4 judge stages sequentially against a slice
Run Codex as orthogonal judge on a `/discover-plan` blueprint
Review-of-review — Codex audits the consolidated `/review` report
Show judge-codex commands + their golden-rule mapping
Codex-side specialist that audits a `/discover-plan` blueprint against `discover-blueprint-golden-rule.md`. Invoked by the companion script — NOT a Claude sub-agent the main thread spawns directly.
Review-of-review — Codex audits the consolidated `/review` report itself against the per-agent finding files. Catches aggregator bugs and meta-defects (e.g., consolidate_findings.py silently dropping files due to YAML parse errors).
Codex-side specialist that audits an `/implement` cycle output against `cycle-implement.md`. Reads the implementation log + the actual git history of the slice.
Proactively use when the user runs any `/judge-codex:*` command. Thin forwarding wrapper that hands a `plan` cycle artifact (blueprint / plan / implementation log / review report) plus its golden-rule contract to Codex via the companion script. Returns Codex stdout verbatim.
Codex-side specialist that audits a `/to-plan` plan against `plan-confidence-golden-rule.md`. Adds semantic depth on top of the deterministic `plan-confidence` structural check.
Internal helper contract for calling the codex-companion-judge runtime. Used by `judge-codex-jury` only.
Locates `plan` cycle artifacts (blueprints, plans, implementation logs, review reports) in the consumer repository. Internal helper for the companion script.
Internal prompting contract for assembling the Codex judge prompt. Used by the companion script to template the system + golden-rule + artifact bundle.
Autonomous + rigorous orthogonal-LLM jury for the `plan` cycle pipeline. Breaks the Claude-only monoculture by adding GPT-Codex as an independent reviewer that re-validates each cycle artifact against its golden-rule contract.
`plan` already runs `/review` with 5–7 specialized sub-agents in parallel (architecture, tests, wiring, cross-validation, domain). They are all Claude: same model family, same training, same blind spots. Concurrency hazards, side-channel risks, and certain classes of subtle bugs get systematically under-weighted because every reviewer shares the same priors.
`judge-codex` adds a second LLM family (GPT-Codex via the official `codex` CLI) as an orthogonal jury that consumes the same artifact and emits an independent verdict using `plan`'s canonical verdict vocabulary. When Claude and Codex disagree, the disagreement itself is the signal, and the loop halts for human adjudication.
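The agree/disagree rule described above can be sketched in a few lines. This is only an illustration: the `Adjudication` type, the `adjudicate` function, and the action strings are hypothetical names, and treating "disagreement" as any difference between the two verdict strings is an assumption, since the plugin's actual comparison logic is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjudication:
    agreed: bool
    action: str  # "continue" or "halt_for_human" (hypothetical labels)


def adjudicate(claude_verdict: str, codex_verdict: str) -> Adjudication:
    """Compare the two juries' verdicts.

    Assumption: any mismatch between the verdict strings counts as a
    disagreement and halts the loop for a human to adjudicate.
    """
    agreed = claude_verdict == codex_verdict
    return Adjudication(agreed, "continue" if agreed else "halt_for_human")


print(adjudicate("SHIPPABLE", "SHIPPABLE").action)
print(adjudicate("SHIPPABLE", "NEEDS_REVISION").action)
```

A stricter or looser policy (for example, tolerating adjacent verdicts) would only change the comparison on the first line of `adjudicate`.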
Slash commands (one per cycle stage):
| Command | When | What it judges |
|---|---|---|
| `/judge-codex:discover <slug>` | After `/discover-plan` produces a blueprint | ≥2-source evidence rule, `fabricated_citation`, empty corners, cross-cutting comparison rigor |
| `/judge-codex:plan <slug>` | After `/to-plan` (typically also after `/plan-confidence`) | Coverage Matrix semantic completeness, Goal SMART quality, ADR alternative honesty, TDD discipline in bug-fixes, citation fabrication |
| `/judge-codex:implementation <slug>` | After `/implement` emits `IMPLEMENTATION_COMPLETE` | Wiring triad depth (caller + integration test + runtime metric), TDD RED→GREEN→REFACTOR audit-trail, dead code in production paths |
| `/judge-codex:final <slug>` | After `/review` emits the consolidated report | Review-of-review: does the consolidated finding-set itself hold up? Catches the `consolidate_findings.py` YAML-bug class of meta-defects |
| `/judge-codex:auto <slug>` | End-to-end | Runs all 4 above sequentially against artifacts produced by a single slice |
| `/judge-codex:setup` | Once per environment | Verifies `codex` CLI install + login; offers to install if missing |
| `/judge-codex:status` | Anytime | Lists background jobs |
| `/judge-codex:help` | Anytime | Shows commands + their golden-rule mapping |
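To make the review-of-review idea concrete, here is a minimal sketch of the kind of check that would surface finding files an aggregator dropped silently. It is not the plugin's code: `load_findings` and `missing_from_report` are hypothetical helpers, and JSON stands in for YAML only because Python's standard library ships no YAML parser.

```python
import json
from typing import Callable, Iterable


def load_findings(files: dict[str, str],
                  parse: Callable[[str], object] = json.loads):
    """Parse every finding file, recording failures instead of skipping them.

    `files` maps a file name to its raw text. JSON is a stand-in parser
    here; a YAML loader could be passed in as `parse` instead.
    """
    loaded: dict[str, object] = {}
    failed: dict[str, str] = {}
    for name, text in files.items():
        try:
            loaded[name] = parse(text)
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            failed[name] = str(exc)
    return loaded, failed


def missing_from_report(finding_files: Iterable[str],
                        report_sources: Iterable[str]) -> list[str]:
    """Names of finding files the consolidated report never accounts for."""
    return sorted(set(finding_files) - set(report_sources))
```

The point of returning `failed` rather than swallowing the exception is that a parse error becomes a visible finding in its own right, which is the class of meta-defect the `final` stage is described as catching.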
Sub-agent:
| Name | Role |
|---|---|
| `judge-codex-jury` | Thin forwarding wrapper that hands a cycle artifact + golden-rule contract to Codex via `codex-companion-judge.mjs`. Returns Codex stdout verbatim. |
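A "thin forwarding wrapper" in this sense does little more than run a child process and pass its output back untouched. The sketch below shows that shape under stated assumptions: `forward` is a hypothetical name, and how `codex-companion-judge.mjs` actually receives its input (arguments, stdin, or files) is not documented here, so the command is left entirely to the caller.

```python
import subprocess
import sys


def forward(command: list[str], payload: str) -> str:
    """Run `command`, feed `payload` on stdin, and return stdout unmodified.

    Assumption: the child reads its input from stdin. check=True raises
    CalledProcessError on a non-zero exit instead of hiding the failure.
    """
    result = subprocess.run(
        command,
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Demo with the current Python interpreter standing in for the companion script.
echo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
print(forward(echo, "artifact + golden-rule bundle"))
```

Keeping the wrapper free of parsing or summarizing is what makes "verbatim" meaningful: the caller sees exactly what the judge emitted.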
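Verdicts (using `plan`'s canonical vocabulary):
| Verdict | Score | Meaning |
|---|---|---|
| `SHIPPABLE` | 90–100 | green |
| `SHIPPABLE_WITH_CAVEATS` | 70–89 | caveats logged |
| `NEEDS_REVISION` | 50–69 | loop back |
| `FAIL_SOFT` | 49 | soft cap blew |
| `FAIL_HARD` | 49 | hard cap blew → block downstream |
| `INVALID` | 0 | structural integrity broken |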
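One way to read the verdict bands above is as a small lookup function. This is an interpretation, not the plugin's scoring code: `verdict_for` and its flags are hypothetical, the precedence (structural break, then hard cap, then soft cap, then score band) is assumed, and scores under 50 with no blown cap are not covered by the listed bands, so the sketch rejects them rather than guessing.

```python
def verdict_for(score: int, *, soft_cap_blown: bool = False,
                hard_cap_blown: bool = False,
                structurally_valid: bool = True) -> str:
    """Map a 0-100 score plus cap flags onto the verdict vocabulary.

    Assumed precedence: INVALID > FAIL_HARD > FAIL_SOFT > score bands.
    """
    if not structurally_valid:
        return "INVALID"
    if hard_cap_blown:
        return "FAIL_HARD"
    if soft_cap_blown:
        return "FAIL_SOFT"
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 90:
        return "SHIPPABLE"
    if score >= 70:
        return "SHIPPABLE_WITH_CAVEATS"
    if score >= 50:
        return "NEEDS_REVISION"
    raise ValueError("score below 50 without a blown cap is not covered by the listed bands")


print(verdict_for(92))
print(verdict_for(92, hard_cap_blown=True))
```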
Different from `codex-plugin-cc`: that plugin uses a binary `approve`/`needs-attention` schema (general-purpose code review). judge-codex outputs structured findings keyed by `plan`'s cycle-specific golden rules.
# 1. Install Codex CLI (one-time)
npm install -g @openai/codex
codex login
# 2. Install this plugin
/plugin marketplace add paulohenriquevn/judge-codex-plugin-cc
/plugin install judge-codex@judge-codex
/reload-plugins
# 3. Verify
/judge-codex:setup
# 4. Use after any plan cycle
/judge-codex:discover my-slug
/judge-codex:plan my-slug
/judge-codex:implementation my-slug
/judge-codex:final my-slug
# OR end-to-end after a slice completes
/judge-codex:auto my-slug
plan's existing /review (5-7 Claude sub-agents): ───────────────────┐
├── architecture-reviewer │
├── test-auditor │
├── wiring-validator │
├── cross-validation │
└── domain-specific (1-3) │
│
NEW: /judge-codex (Codex orthogonal jury): ─────────────────────────┤
├── /judge-codex:discover (blueprint judge) │
├── /judge-codex:plan (plan judge) ├── verdict
├── /judge-codex:implementation │
└── /judge-codex:final (review-of-review) │
│
When Claude and Codex agree → confidence ↑ │
When they disagree → loop halts for human │
┘
npx claudepluginhub usetheodev/judge-codex-plugin-cc --plugin judge-codex