Search everything...

Stats

Actions

Available In

claude-codex

Name: claude-codex
Author: james-traina

By james-traina

Routes mechanical coding tasks — test writing, documentation, formatting, and code generation — to OpenAI Codex instead of Claude, cutting token costs on work that doesn't need deep reasoning.

npx claudepluginhub james-traina/science-plugins --plugin claude-codex

Popularity

Stars

Med: 0·Avg: 285

Installs

Med: 0·Avg: 1

What's Inside

Slash Commands2

codex

/codex

Delegate a coding task explicitly to OpenAI Codex with optional sandbox and model control.

savings

/savings

Show a summary of token savings achieved by claude-codex delegation.

Agents1

codex-agent

/codex-agent

Use this agent to dispatch a self-contained coding task to the OpenAI Codex CLI and return the result with attribution. This agent should be used when: the user explicitly asks to use Codex or delegate a task; a mechanical task (code generation, test writing, documentation, refactoring, formatting) has failed 2+ times and a fresh start is useful; the session is long and offloading work would preserve Claude budget; the user invokes /codex directly. <example> Context: The user has asked Claude to write unit tests three times and keeps getting incomplete coverage. user: "Write comprehensive unit tests for the parseConfig function" assistant: "I'll use the codex-agent to get a fresh perspective on the test suite." <commentary> Repeated failure on a mechanical task is a good signal for delegation. </commentary> </example> <example> Context: The user explicitly requests Codex. user: "@codex generate JSDoc for all exported functions in src/api.ts" assistant: "I'll use the codex-agent to handle this documentation task." <commentary> Explicit @codex mention is a direct trigger for the codex-agent. </commentary> </example> Do NOT use for: security review, auth design, tasks requiring the full conversation history.

Skills1

delegate

/delegate

Use this skill when the user explicitly routes work to Codex or wants an independent verification pass. Triggers on: "@codex", "delegate to codex", "use codex for this", "send this to codex", "save tokens", "use the cheaper model", "let codex handle it", "get a second opinion", "sanity check via codex", "verify this independently". Also triggers mid-reasoning when Claude wants an independent agent to confirm a conclusion before presenting it. Do not trigger for tasks the UserPromptSubmit hook already delegated (those arrive pre-tagged in additionalContext), security review, auth design, or tasks that require the full conversation history.

Hooks1

Event Hooks

2 hooks across 2 events

Stats

Version1.1.0

Stars0

MaintenanceExcellent

LicenseMIT

Last CommitMar 10, 2026

AddedMar 22, 2026

Actions

View on GitHub View README Plugin Marketplace JSON Homepage

Own this plugin?

Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).

Available In

science-plugins

Safety Signals

Caution

Uses power tools

Uses Bash, Write, or Edit tools

README

claude-codex

Routes mechanical coding tasks to OpenAI Codex, cutting Anthropic token usage by 60–85% on eligible work. You don't change how you use Claude Code.

The problem it solves

Claude Sonnet 4.6 costs ~$3/M input tokens and ~$15/M output tokens. OpenAI's Codex (via gpt-5.3-codex) costs a fraction of that. For tasks like "write a unit test for this function" or "add JSDoc to this class", the output quality is virtually indistinguishable, but the cost difference is 20–40x.

The tricky part is figuring out which tasks qualify. "Write unit tests for getUserById" clearly does. "Why is my authentication middleware slow?" clearly doesn't — it needs reasoning, context, and judgment. Most tasks fall somewhere in between, and classifying them correctly in real time (before any tokens are spent) requires a scoring approach that weighs competing signals.

That's what this plugin does. A bash hook fires on every message before Claude processes it. If the task scores high enough on mechanical-work signals and low enough on reasoning signals, it gets routed to Codex instead. Claude receives the pre-computed answer and relays it, spending maybe 5% of the tokens it would have used computing the same answer itself.

How it works

The full pipeline

User types a message
        │
        ▼
UserPromptSubmit hook fires (bash, zero Claude tokens)
        │
        ├── Parse hook JSON input (message, cwd, session_id)
        │
        ├── Check budget-aware threshold (is context window filling up?)
        │
        ├── Check codex CLI availability (if missing: fall through to Claude)
        │
        ├── classify.sh scores the prompt with weighted pattern matching
        │
        │     DELEGATE (score ≥ 20 AND category matched)
        │     ├── codex-exec.sh picks expert persona for the category
        │     ├── Detects project language from CWD (package.json, go.mod, etc.)
        │     ├── Builds full prompt: expert persona + project context + task
        │     ├── Runs `codex exec` with model + reasoning effort + sandbox
        │     │       (120s timeout; falls through to Claude on failure)
        │     ├── token-tracker.sh logs the delegation (fire-and-forget)
        │     └── Hook returns additionalContext with Codex output + relay instructions
        │
        │     CLAUDE (score ≤ -20, or UNSURE between -20 and 20)
        │     └── Fall through: Claude handles the task normally
        │
        ▼
Claude receives additionalContext (if DELEGATE) or the raw message (otherwise)
        │
        ▼
[DELEGATE path]: Claude reads the Codex answer and relays it with [via Codex · category] prefix
[CLAUDE path]:   Claude answers normally

Why the hook approach works

Claude Code's UserPromptSubmit hook runs before the AI model is invoked. It receives the raw message as JSON and can inject additionalContext — extra text prepended to what Claude sees.

Classification runs entirely in bash, spending zero tokens. Codex execution runs outside the Claude inference loop. Claude's only job on delegated tasks is to clean up Markdown and add the attribution prefix: roughly 50 output tokens instead of 500–2000.

If the hook fails for any reason (codex not installed, timeout, network error), it exits without output. Claude Code treats an empty hook response as "no intervention" and falls through to Claude normally.

The routing algorithm

Scoring mechanics

scripts/classify.sh assigns a numeric score to every prompt. Positive scores push toward delegation; negative scores push toward Claude.

DELEGATE if: score ≥ 20  AND  a category was matched
CLAUDE   if: score ≤ -20
UNSURE   otherwise: Claude handles it (conservative fallback)

The UNSURE zone between -20 and 20 intentionally keeps borderline tasks with Claude. Routing a debugging task to Codex wastes money on a bad answer; keeping a test-writing task with Claude just costs a bit more. When in doubt, the system keeps things with Claude.

Positive signals (add to score)

View full README on GitHub

claude-codex

Popularity

What's Inside

Confidence

README

claude-codex

The problem it solves

How it works

The full pipeline

Why the hook approach works

The routing algorithm

Scoring mechanics

Positive signals (add to score)

Similar Plugins

codex-delegate

claude-router

codex-review

claude-code-token-saver

autonomous-dev

development-productivity

More by james-traina

compound-science

wolfram-hart

ralpha-team

claude-codex

The problem it solves

How it works

The full pipeline

Why the hook approach works

The routing algorithm

Scoring mechanics

Positive signals (add to score)

Popularity

Health & Quality

More by james-traina

compound-science

wolfram-hart

ralpha-team

Similar Plugins

codex-delegate

claude-router

codex-review

claude-code-token-saver

autonomous-dev

development-productivity