By james-traina
Routes mechanical coding tasks — test writing, documentation, formatting, and code generation — to OpenAI Codex instead of Claude, cutting token costs on work that doesn't need deep reasoning.
Uses power tools
Uses Bash, Write, or Edit tools
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Routes mechanical coding tasks to OpenAI Codex, cutting Anthropic token usage by 60–85% on eligible work. You don't change how you use Claude Code.
Claude Sonnet 4.6 costs ~$3/M input tokens and ~$15/M output tokens. OpenAI's Codex (via gpt-5.3-codex) costs a fraction of that. For tasks like "write a unit test for this function" or "add JSDoc to this class", the output quality is virtually indistinguishable, but the cost difference is 20–40x.
The tricky part is figuring out which tasks qualify. "Write unit tests for getUserById" clearly does. "Why is my authentication middleware slow?" clearly doesn't — it needs reasoning, context, and judgment. Most tasks fall somewhere in between, and classifying them correctly in real time (before any tokens are spent) requires a scoring approach that weighs competing signals.
That's what this plugin does. A bash hook fires on every message before Claude processes it. If the task scores high enough on mechanical-work signals and low enough on reasoning signals, it gets routed to Codex instead. Claude receives the pre-computed answer and relays it, spending maybe 5% of the tokens it would have used computing the same answer itself.
User types a message
│
▼
UserPromptSubmit hook fires (bash, zero Claude tokens)
│
├── Parse hook JSON input (message, cwd, session_id)
│
├── Check budget-aware threshold (is context window filling up?)
│
├── Check codex CLI availability (if missing: fall through to Claude)
│
├── classify.sh scores the prompt with weighted pattern matching
│
│ DELEGATE (score ≥ 20 AND category matched)
│ ├── codex-exec.sh picks expert persona for the category
│ ├── Detects project language from CWD (package.json, go.mod, etc.)
│ ├── Builds full prompt: expert persona + project context + task
│ ├── Runs `codex exec` with model + reasoning effort + sandbox
│ │ (120s timeout; falls through to Claude on failure)
│ ├── token-tracker.sh logs the delegation (fire-and-forget)
│ └── Hook returns additionalContext with Codex output + relay instructions
│
│ CLAUDE (score ≤ -20, or UNSURE between -20 and 20)
│ └── Fall through: Claude handles the task normally
│
▼
Claude receives additionalContext (if DELEGATE) or the raw message (otherwise)
│
▼
[DELEGATE path]: Claude reads the Codex answer and relays it with [via Codex · category] prefix
[CLAUDE path]: Claude answers normally
Claude Code's UserPromptSubmit hook runs before the AI model is invoked. It receives the raw message as JSON and can inject additionalContext — extra text prepended to what Claude sees.
Classification runs entirely in bash, spending zero tokens. Codex execution runs outside the Claude inference loop. Claude's only job on delegated tasks is to clean up Markdown and add the attribution prefix: roughly 50 output tokens instead of 500–2000.
If the hook fails for any reason (codex not installed, timeout, network error), it exits without output. Claude Code treats an empty hook response as "no intervention" and falls through to Claude normally.
scripts/classify.sh assigns a numeric score to every prompt. Positive scores push toward delegation; negative scores push toward Claude.
DELEGATE if: score ≥ 20 AND a category was matched
CLAUDE if: score ≤ -20
UNSURE otherwise: Claude handles it (conservative fallback)
The UNSURE zone between -20 and 20 intentionally keeps borderline tasks with Claude. Routing a debugging task to Codex wastes money on a bad answer; keeping a test-writing task with Claude just costs a bit more. When in doubt, the system keeps things with Claude.
npx claudepluginhub james-traina/science-plugins --plugin claude-codexAI research assistant for quantitative social science. Ambient hooks detect research context and route to 10 specialized agents covering structural econometrics, causal inference, game theory, identification, Monte Carlo studies, and reproducible pipelines.
Gives Claude a real math engine. Ask a math or science question and Claude translates it to Wolfram Language, runs it through wolframscript, and hands back the exact answer — symbolic algebra, calculus, plotting, statistics, and more. No special syntax needed.
Port of ralph-orchestrator to Claude Code's official plugin system. Runs your prompt in a loop until a verification command passes. Solo mode for one session, team mode for parallel agents. Logs telemetry for post-session QA.
Hand token-heavy mechanical work (batch edits, scaffolding, refactors, test generation, plotting scripts) from Claude to Codex CLI.
Intelligent model routing for Claude Code - routes queries to optimal Claude model (Haiku/Sonnet/Opus) based on complexity, with persistent knowledge system, context forking, and multi-turn awareness
Cross-agent review workflow: Claude implements, Codex reviews
45% cost reduction measured. Cache expiry prevention, SubTask auto-delegation, zero-cost context restoration, real-time cost dashboard. The only Claude Code plugin built from CC source analysis.
PROJECT.md-first autonomous development with hybrid auto-fix documentation. 8-agent pipeline, auto-orchestration, docs auto-update on commit (true vibe coding). Knowledge base system with 90% faster repeat research. Strict mode enforces SDLC best practices automatically. Works for ANY Python/JavaScript/TypeScript/Go project.
Documentation, research, test generation, document generation, and prompt optimization tools