By robanderson
Run a Farnsworth Loop tournament in one of two modes. Single pass: produce N independent solutions in parallel, then a blind Opus reviewer scores them, lists pros and cons, ranks them, and names a winner. Two pass: the same first round, but the Opus reviewer also distils what worked and what failed into guidance; the losing attempts are discarded, a second round of N fresh attempts is run with that guidance (positives to emulate, pitfalls to avoid), the saved round one winner is added back, and a final Opus ranker picks the overall winner. First ask the user which model to use for the attempts: Anthropic Opus/Sonnet/Haiku, a GLM z.ai model (glm-5.2/glm-5.1/glm-4.7/glm-4.5-air, run via the glm CLI), a local on-device MLX model (free, via the omlx server; list is dynamic), an OpenAI model via the codex exec CLI (gpt-5.5, pick a reasoning effort), a MiniMax M-series model (minimax-m3 via the MiniMax endpoint), Top Mixed (an even split across opus/glm-5.2/codex-high), or Mixed per-attempt; the blind reviewer/ranker is always Anthropic Opus. Trigger on the sigil @@FL[:N][:M[:Z]] (e.g. @@FL:5, @@FL:5:2, bare @@FL) or the prose marker 'farnsworth loop:N[:M[:Z]]', case-insensitive with optional spaces. N (attempts/round) is optional and may be inferred from a prose model spec like '2 opus, 2 glm 5.2, 1 codex high' or the Top Mixed preset; M = passes (1 single, 2 two); Z = grand loops: Z=1 (or omitted) is the isolated tournament, Z>=2 (capped at Z_MAX=5) runs an UNATTENDED chain that per loop runs a tournament, implements the winning proposal into your real repo on a new FL-<loop>-<random7> branch via the Opus farnsworth-implementer agent, runs fail-closed verify, and opens one PR (draft+needs-human on failure) — never auto-merged. E.g. 'do abc :farnsworth loop:5' or 'do abc @@FL:5:2'.
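The trigger grammar above can be sketched as a small parser. This is only an illustration under stated assumptions, not the plugin's actual implementation: the names `FLSpec` and `parse_fl` are hypothetical, the numeric fields are assumed to be strictly positional (N, then M, then Z), and values of M other than 1 or 2 are assumed to be rejected.

```python
import re
from typing import NamedTuple, Optional

Z_MAX = 5  # cap on grand loops, as stated in the description


class FLSpec(NamedTuple):  # hypothetical container, not part of the plugin
    attempts: Optional[int]  # N; None means "infer from a model spec or ask the user"
    passes: int              # M; 1 = single pass, 2 = two pass
    grand_loops: int         # Z; 1 = isolated tournament


# Assumption: fields are positional, so the first number is always N.
# Matches both the @@FL sigil and the 'farnsworth loop' prose marker,
# case-insensitively, with optional spaces around the colons.
_TRIGGER = re.compile(
    r"(?:@@FL|farnsworth\s+loop)"
    r"(?:\s*:\s*(\d+))?(?:\s*:\s*(\d+))?(?:\s*:\s*(\d+))?",
    re.IGNORECASE,
)


def parse_fl(message: str) -> Optional[FLSpec]:
    """Return the parsed trigger, or None when the message has no marker."""
    match = _TRIGGER.search(message)
    if match is None:
        return None
    n, m, z = (int(g) if g else None for g in match.groups())
    m = 1 if m is None else m
    if m not in (1, 2):
        raise ValueError(f"M must be 1 or 2, got {m}")
    z = 1 if z is None else z
    z = min(max(z, 1), Z_MAX)  # clamp Z into 1..Z_MAX
    return FLSpec(n, m, z)
```

For example, `parse_fl("do abc @@FL:5:2")` yields N=5, M=2, Z=1, and a bare `@@FL` yields N=None so the caller can fall back to asking the user.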
Farnsworth Loop CODEX worker for OpenAI models via the `codex exec` CLI. A command runner: it executes the single benign shell command handed to it (which writes a brief file and runs the bundled farnsworth-loop codex runner script, performing the attempt on an OpenAI model via `codex exec`) and relays the result. It NEVER solves the task itself. One generic agent handles every codex effort level — the exact model/effort is in the command. Invoked only by the farnsworth-loop tournament; not a general-purpose agent.
Farnsworth Loop GLM worker for the z.ai model glm-4.5-air. A command runner: it executes the single benign shell command handed to it (which writes a brief file and runs the bundled farnsworth-loop GLM runner script, performing the attempt on glm-4.5-air via z.ai) and relays the result. It NEVER solves the task itself. Invoked only by the farnsworth-loop tournament; not a general-purpose agent.
Farnsworth Loop GLM worker for the z.ai model glm-4.7. A command runner: it executes the single benign shell command handed to it (which writes a brief file and runs the bundled farnsworth-loop GLM runner script, performing the attempt on glm-4.7 via z.ai) and relays the result. It NEVER solves the task itself. Invoked only by the farnsworth-loop tournament; not a general-purpose agent.
Farnsworth Loop GLM worker for the z.ai model glm-5.1. A command runner: it executes the single benign shell command handed to it (which writes a brief file and runs the bundled farnsworth-loop GLM runner script, performing the attempt on glm-5.1 via z.ai) and relays the result. It NEVER solves the task itself. Invoked only by the farnsworth-loop tournament; not a general-purpose agent.
Farnsworth Loop GLM worker for the z.ai model glm-5.2. A command runner: it executes the single benign shell command handed to it (which writes a brief file and runs the bundled farnsworth-loop GLM runner script, performing the attempt on glm-5.2 via z.ai) and relays the result. It NEVER solves the task itself. Invoked only by the farnsworth-loop tournament; not a general-purpose agent.
Benchmark generation throughput (cold vs hot tok/s) for every model the farnsworth-loop system can call (Anthropic / GLM / local MLX / codex / MiniMax). Two workload profiles — light (tiny paragraph) and heavy (>5k-token input context + long >5k-token output, representative of coding/agentic work). Thin wrapper over bin/fl-bench.mjs. Use when the user asks to benchmark model speed, measure tokens/second, compare cold vs hot throughput across providers, or run /fl-bench.
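As a rough illustration of the metric, tokens per second is the number of generated tokens divided by wall-clock time. The sketch below assumes "cold" means the first call (model not yet warmed or loaded) and "hot" means an immediately repeated call. It is not the code in bin/fl-bench.mjs; `tokens_per_second`, `cold_vs_hot`, and the `generate` callable are hypothetical names.

```python
import time
from typing import Callable, Tuple


def tokens_per_second(
    generate: Callable[[], int],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one call to `generate`, which returns the number of tokens produced."""
    start = clock()
    tokens = generate()
    elapsed = clock() - start
    return tokens / elapsed if elapsed > 0 else float("inf")


def cold_vs_hot(
    generate: Callable[[], int],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Assumption: the first call is the cold run, the second is the hot run."""
    cold = tokens_per_second(generate, clock)
    hot = tokens_per_second(generate, clock)
    return cold, hot
```

Passing the clock as a parameter keeps the measurement testable with a fake timer; a real benchmark would also want several hot runs and a median rather than a single sample.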
Run a Farnsworth Loop tournament in one of two modes. The sigil is @@FL[:N][:M[:Z]]: N (optional) = attempts per round, M = passes (1 single, 2 two), Z = grand loops (Z>=2 = an UNATTENDED chain that, per loop, runs a full tournament, implements the winning proposal into your real repo on a new FL-<loop>-<random7> branch, runs fail-closed verify, and opens one PR, never auto-merged; Z=1 or omitted = the isolated tournament; Z capped at Z_MAX=5); N may be inferred from a prose model spec like '2 opus, 2 glm 5.2, 1 codex high' (sum of counts = N, the items become the per-attempt Mixed assignment) or the Top Mixed preset ('top mixed' + N spread over opus/glm-5.2/codex-high), and bare @@FL falls back to the interactive model gate. First ask the user which model to use for the attempts (Anthropic Opus, Sonnet, Haiku; a GLM z.ai model via the glm CLI; a free local on-device MLX model via the omlx server; or Mixed per-attempt). SINGLE PASS: produce N independent solutions in parallel, then a blind Opus reviewer scores them, lists pros and cons, ranks them, and names a winner. TWO PASS: the same first round, but the Opus reviewer also distils what worked and what failed into guidance; the losing attempts are discarded, a second round of N fresh attempts is run with that guidance (positives to emulate, pitfalls to avoid), the saved round one winner is added back, and a final Opus ranker picks the overall winner. Trigger whenever the user's message contains a sigil of the form @@FL:N:M (for example @@FL:5, @@FL:5:2, @@fl:7:2), where N is the number of attempts per round and M is the number of passes (omitted or 1 = single pass, 2 = two pass); the text before the sigil is the task. ALSO trigger on the prose marker 'farnsworth loop:N' (single pass) or 'farnsworth loop:N:2' (two pass), e.g. 'do abc :farnsworth loop:5' or 'do abc: farnsworth loop:5:2'. All forms are case-insensitive with optional spaces around the colons.
Also trigger when the user clearly asks for a farnsworth loop / generate-and-rank tournament even without a marker.
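The inference of N from a prose model spec, and the Top Mixed preset, can be sketched as follows. This is a minimal sketch under assumptions: `parse_model_spec` and `top_mixed` are hypothetical names, model names are kept exactly as typed (no mapping of 'glm 5.2' to an identifier such as glm-5.2), and any remainder in the even split is assumed to go to the earlier models in round-robin order.

```python
from typing import List

# Models named by the Top Mixed preset in the description.
TOP_MIXED = ("opus", "glm-5.2", "codex-high")


def parse_model_spec(spec: str) -> List[str]:
    """Expand '2 opus, 1 codex high' into one model name per attempt."""
    assignment: List[str] = []
    for item in spec.split(","):
        count_text, _, model = item.strip().partition(" ")
        model = model.strip()
        if not count_text.isdigit() or not model:
            raise ValueError(f"expected '<count> <model>', got {item!r}")
        assignment.extend([model] * int(count_text))
    return assignment


def top_mixed(n: int) -> List[str]:
    """Spread n attempts over the preset models, round-robin (assumed tie-break)."""
    return [TOP_MIXED[i % len(TOP_MIXED)] for i in range(n)]
```

With the example from the description, `parse_model_spec("2 opus, 2 glm 5.2, 1 codex high")` returns five entries, so N = 5 and each entry is the model for one attempt.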
Uses power tools
Uses Bash, Write, or Edit tools
npx claudepluginhub robanderson/farnsworth-loop --plugin farnsworth-loop