By james-traina
Port of ralph-orchestrator to Claude Code's official plugin system. Runs your prompt in a loop until a verification command passes. Solo mode for one session, team mode for parallel agents. Logs telemetry for post-session QA.
System design, API planning, task decomposition, and architectural decisions. Structure over implementation. <example> The lead needs to break down "Build a REST API with auth" into parallelizable tasks with clear file ownership. </example> <example> The team hit iteration 5 with failing integration tests. The architect needs to re-examine the module boundaries and fix the interfaces. </example>
Diagnoses failures, traces bugs, and fixes broken tests. Activated when things go wrong. <example> The verification command is failing and nobody can figure out why. </example> <example> Tests pass individually but fail when run together. The debugger investigates shared state. </example>
Writes production code, implements features, and builds modules. The primary code-writing teammate. <example> A task requires implementing the auth module at src/auth/ following the architect's design. </example> <example> The verification command is failing because a function returns the wrong type. The implementer fixes it. </example>
Strategic gap analysis and plan generation. Compares specs against the current codebase and produces a prioritised IMPLEMENTATION_PLAN.md. Use for plan mode and for complex priority/scope reasoning during build. <example> No IMPLEMENTATION_PLAN.md exists yet. The planner scans specs/* and src/* to find what is missing, partially implemented, or inconsistent, then writes a prioritised task checklist. </example> <example> Midway through a build loop the scope has drifted. The planner audits IMPLEMENTATION_PLAN.md against reality and identifies which completed items aren't actually done and which new items should be added. </example>
Code review, correctness checking, and quality assessment. Reads and critiques, does not write code. <example> An implementer has finished their task and the code needs review before marking complete. </example> <example> The team is on iteration 8 and the verification command still fails. The reviewer audits the codebase for systemic issues. </example>
Modifies files
Hook triggers on file write and edit operations
Uses power tools
Uses Bash, Write, or Edit tools
A Claude Code plugin that runs your prompt in a loop until the work is genuinely done — verified by a shell command, not just Claude saying so.
The core idea: instead of one long session where the model tries to hold everything in context, you run many short sessions. Each one reads the current state from disk, does one unit of work, commits, and exits. The loop re-invokes it. Fresh context every time. State lives in files and git history, not in the model's memory.
This approach comes from Geoffrey Huntley's Ralph technique, ported here to Claude Code's native plugin system.
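The loop described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the plugin's actual code: `run_loop`, `do_one_unit`, `is_done`, and the JSON state file are hypothetical names chosen for the example. The point it shows is that each iteration starts with nothing in memory and rebuilds its view of the world from disk.

```python
import json
from pathlib import Path
from typing import Callable, Tuple


def run_loop(
    state_path: Path,
    do_one_unit: Callable[[dict], dict],
    is_done: Callable[[dict], bool],
    max_iterations: int,
) -> Tuple[bool, int]:
    """Re-invoke a short, memoryless step until it reports done.

    Returns (done, iterations_used). All names here are hypothetical.
    """
    for iteration in range(1, max_iterations + 1):
        # Fresh context: the step only knows what is on disk.
        if state_path.exists():
            state = json.loads(state_path.read_text())
        else:
            state = {}

        # One unit of work per iteration.
        state = do_one_unit(state)

        # Persist before "exiting" so the next iteration sees ground truth.
        state_path.write_text(json.dumps(state))

        if is_done(state):
            return True, iteration
    return False, max_iterations
```

In the real plugin the persisted state is the working tree and git history rather than a JSON file; the JSON file just keeps the sketch self-contained.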
Long coding sessions degrade. The model starts strong, then slowly loses track of earlier constraints as the context fills up. It might re-implement something it already built, or forget that a test was passing and break it. By iteration 15 in a single session, you're often fighting accumulated drift as much as the actual problem.
What drift looks like in practice: Claude implements an auth module in message 10, then in message 40, facing a related bug, re-reads the file and doesn't recognize it as something it already wrote. Or it accepts a constraint early ("don't use global state") then violates it later because that constraint has been pushed far enough back in context that it's no longer in the effective attention window. Context windows are large but not free — as they fill with tool outputs, intermediate reasoning, and error messages, early instructions lose ground.
Short loops with fresh context don't have this property. Every iteration, Claude re-reads the objective, sees what's in git, and works from that ground truth rather than from its fading recollection of the last 50 messages. Re-reading the specs on every iteration costs extra tokens, but the consistency is worth it for anything non-trivial.
The dual gate is the other piece. Without some form of backpressure, Claude will claim it's done when it isn't. It's not deceptive — it's just that "the feature is now complete" is a natural way to narrate work, not necessarily a factual claim. The gate separates an intentional completion claim from incidental completion-sounding narration, then independently checks that the claim is true.
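A minimal sketch of what such a dual gate could look like, in Python. How the plugin actually detects the completion promise is not documented here, so the rule below (the promise must appear alone on its own line) is an assumption made for illustration, and `promise_claimed`, `verify_passes`, and `dual_gate` are hypothetical names.

```python
import subprocess
from typing import Tuple


def promise_claimed(output: str, promise: str) -> bool:
    # Assumption: only a line consisting of exactly the promise counts,
    # so narration that merely mentions the word does not trip the gate.
    return any(line.strip() == promise for line in output.splitlines())


def verify_passes(verify_command: str, timeout_s: int = 600) -> Tuple[bool, str]:
    # Run the user's verification command in a shell; exit code 0 means pass.
    proc = subprocess.run(
        verify_command,
        shell=True,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return proc.returncode == 0, proc.stdout + proc.stderr


def dual_gate(output: str, promise: str, verify_command: str) -> Tuple[bool, str]:
    """Pass only if the promise was claimed AND the command succeeds."""
    if not promise_claimed(output, promise):
        return False, "completion promise not found in output"
    return verify_passes(verify_command)
```

The first check filters out incidental "it's done" narration cheaply; the second check is the one that carries the weight, because it does not depend on anything the model said.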
Inside Claude Code, run each command separately:
/plugin marketplace add James-Traina/science-plugins
/plugin install ralpha-team@science-plugins
Both modes require jq (brew install jq on macOS, apt install jq on Linux). Team mode additionally requires the experimental flag in ~/.claude/settings.json:
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}
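If the settings file already contains other keys, the flag needs to be merged in rather than pasted over the file. One way to do that, sketched in Python: `enable_agent_teams` is a hypothetical helper written for this example, not something the plugin ships.

```python
import json
from pathlib import Path

FLAG = "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"


def enable_agent_teams(settings_path: Path) -> dict:
    """Set the experimental flag while preserving any existing settings."""
    text = settings_path.read_text().strip() if settings_path.exists() else ""
    settings = json.loads(text) if text else {}

    # Create the "env" object if missing, then set the flag inside it.
    settings.setdefault("env", {})[FLAG] = "1"

    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


# Typical call (not run here):
# enable_agent_teams(Path.home() / ".claude" / "settings.json")
```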
Before writing any code, run a planning loop. It scans your specs and existing codebase, compares them, and produces IMPLEMENTATION_PLAN.md — a flat checklist of tasks, each with explicit file scope and a done criterion.
Why plan separately? Without a plan, each build iteration has to re-reason the scope from scratch. The model reads the codebase, figures out what's missing, picks something to work on, and implements it. Next iteration, it reads the codebase again and might pick a different thing, or pick the same thing having forgotten progress was already made. A plan file gives fresh-context iterations something stable to draw from: pick the first unchecked task, do it, check it off.
The plan command runs in a short loop (1–3 iterations is usually enough) because the output is a file, not code. Once IMPLEMENTATION_PLAN.md exists, both solo and team modes read it automatically and use its unchecked items as the task queue.
/ralpha-team:plan "Build a payment processing service with Stripe, webhooks, and idempotency" \
--max-iterations 3
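The "pick the first unchecked task, do it, check it off" cycle can be illustrated with a small Python sketch. It assumes the plan uses Markdown-style checkboxes (`- [ ]` and `- [x]`); the exact format of IMPLEMENTATION_PLAN.md is not specified above, so treat that, and the helper names `first_unchecked` and `check_off`, as assumptions.

```python
import re
from typing import Optional

# Assumed line format: "- [ ] task text" or "- [x] task text".
CHECKBOX = re.compile(r"^(\s*[-*] \[)([ xX])(\] )(.*)$")


def first_unchecked(plan_text: str) -> Optional[str]:
    """Return the text of the first unchecked task, or None if all are done."""
    for line in plan_text.splitlines():
        m = CHECKBOX.match(line)
        if m and m.group(2) == " ":
            return m.group(4).strip()
    return None


def check_off(plan_text: str, task: str) -> str:
    """Mark the first unchecked line matching `task` as done."""
    out = []
    done = False
    for line in plan_text.splitlines():
        m = CHECKBOX.match(line)
        if not done and m and m.group(2) == " " and m.group(4).strip() == task:
            line = f"{m.group(1)}x{m.group(3)}{m.group(4)}"
            done = True
        out.append(line)
    trailing = "\n" if plan_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Because the queue is just a file, any fresh-context iteration can resume from it without knowing what earlier iterations did.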
One Claude session, looping. The default for most things. Good for bug fixes, single features, refactors, anything where the work is roughly sequential.
Each iteration sees the previous iteration's work via git log and file state. The model can't remember what it said last time, but it can see what it committed. Git history is the right kind of memory here — concrete and verifiable, not approximate and decaying.
The loop continues until both the completion promise and verification command pass on the same iteration, or until --max-iterations is reached. Between iterations, the failure output — exactly what went wrong — is injected into the next iteration's context so Claude can respond to it, without carrying that failure report in memory for the entire session.
/ralpha-team:solo "Fix the token refresh race condition in auth/session.py" \
--completion-promise 'FIXED' \
--verify-command 'pytest tests/test_auth.py -v' \
--max-iterations 15
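To illustrate how failure output might be carried into the next iteration without living in a long-running session, here is a hedged Python sketch. The prompt wording, the `build_iteration_prompt` name, and the choice to keep only the tail of a long failure log are all assumptions for the example; the plugin's real injection format is not shown in this document.

```python
from typing import Optional


def build_iteration_prompt(
    objective: str,
    last_failure: Optional[str],
    max_failure_chars: int = 4000,
) -> str:
    """Compose the prompt for one fresh-context iteration."""
    parts = [objective]
    if last_failure:
        # Assumption: keep only the tail, where test runners usually
        # print the summary of what failed.
        tail = last_failure[-max_failure_chars:]
        parts.append(
            "The previous iteration failed verification with this output:\n" + tail
        )
    return "\n\n".join(parts)
```

Each iteration sees at most one failure report, the most recent one, so old errors do not pile up in context the way they would in a single long session.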
npx claudepluginhub james-traina/science-plugins --plugin ralpha-team