By bencium
Lightweight AI-assisted build harness: scaffold + tasks + acceptance + living roadmap + decision log + mandatory plan/spec gate + verify gate + deploy/rollback. Six-phase loop (roadmap → plan → build → test → reflect → deploy) with ANSI-colored phase banners, .harness/conventions.md for code-quality rules, and explicit memory confirmation after every action.
Log an architectural decision to .harness/archive/ and surface it in memory
Verify → deploy → health check → log. Refuses deploy if /bencium-verify fails.
Append a new feature to tasks.md
Initialize the build harness in this repo (greenfield or brownfield, auto-detected). 5-question interview, single LLM pass, ~60s.
Pick the next task from tasks.md
Auto-loads the bencium-harness project memory at the start of every session. Triggers when the cwd contains a .harness/ folder. Reads .harness/memory.md, .harness/rules.md, and the last 3 entries in .harness/archive/ to restore project context across /clear and session boundaries.
Watches for architectural decisions during a bencium-harness session and nudges the user to log them via /bencium-decide. Triggers when the conversation involves picking between technical alternatives, choosing a library/framework, defining a data model, setting a policy, or making any choice that future-you would want to know the reason for.
Single source of truth for the bencium-harness phase color scheme, banners, emoji, and progress strip. Referenced by every /bencium-* command so visual styling stays consistent across the loop. Read this skill before printing any phase header.
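The session-start memory load described above (memory.md, rules.md, and the last 3 archive entries) can be pictured roughly as follows. This is a minimal sketch, not the plugin's actual code: the function names `recent_archive_entries` and `load_session_context` are hypothetical, and it assumes archive files carry a zero-padded numeric prefix so that sorting by name gives chronological order.

```python
from pathlib import Path


def recent_archive_entries(names: list[str], count: int = 3) -> list[str]:
    """Return the newest `count` markdown archive names, oldest first.

    Assumption: names start with a zero-padded sequence number (NNNN-...),
    so lexicographic order matches the order in which entries were written.
    """
    entries = sorted(n for n in names if n.endswith(".md"))
    return entries[-count:]


def load_session_context(harness_dir: Path) -> str:
    """Concatenate memory.md, rules.md, and the most recent archive entries."""
    parts = []
    for name in ("memory.md", "rules.md"):
        path = harness_dir / name
        if path.is_file():
            parts.append(f"# {name}\n{path.read_text(encoding='utf-8')}")

    archive = harness_dir / "archive"
    if archive.is_dir():
        names = [p.name for p in archive.iterdir() if p.is_file()]
        for name in recent_archive_entries(names):
            text = (archive / name).read_text(encoding="utf-8")
            parts.append(f"# archive/{name}\n{text}")

    # Missing files are skipped, so an absent .harness/ yields an empty string.
    return "\n\n".join(parts)
```

Calling `load_session_context(Path(".harness"))` from the project root would return one string suitable for placing at the top of a fresh session.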
A plugin for agentic coding tools like codex and opencode. It gives AI agents steady context, an acceptance checklist that must pass before anything ships, and a memory that survives a context reset (when the chat history is wiped between sessions).
AI agents drift. They forget decisions when the chat is reset. They write code that nobody checks against the original requirements. They ship without verifying, then can't explain why. A blank repo and a clever prompt aren't enough — what's missing is a feedback loop and a memory the agent works inside.
One command — /bencium-init — runs a 5-question interview and produces a complete planning kit in about 60 seconds: product requirements, architecture, a task list, an acceptance checklist, and a .harness/ folder with config, memory, rules, and an archive for decisions.
A workflow without checkpoints forgets itself, and invisible work becomes invisible failure. So every boundary in the loop leaves a written trace, and each real decision is recorded. Together these give you one advantage: you can see exactly what happened, and you can always pick up where you left off.
| | bencium-harness | BMAD-method | Just a context file |
|---|---|---|---|
| Setup time | ~60 sec | Multi-phase interview | Instant |
| Memory model | Tiered markdown (memory.md + archive) | Heavy generated docs | Static file |
| Verification gate | /bencium-verify against ACCEPTANCE.md | — | — |
| Deploy/rollback | Built-in, refuses on verify fail | — | — |
| Infrastructure needed | None | None | None |
| Survives a context reset | Auto-injected | Re-read manually | Yes |
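To make the verification-gate row concrete, here is a minimal sketch of a gate over an acceptance checklist. It rests on assumptions the page does not confirm: that ACCEPTANCE.md uses markdown task-list items (`- [ ]` and `- [x]`), and that a ticked box means the check passed. The real /bencium-verify inspects the code and reports evidence; this sketch only shows the pass/fail decision. `parse_acceptance` and `verify_gate` are hypothetical names.

```python
import re
from dataclasses import dataclass

# Assumed format: "- [x] description" for a passing check, "- [ ] ..." otherwise.
CHECK_LINE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*\S)\s*$")


@dataclass
class Check:
    text: str
    passed: bool


def parse_acceptance(markdown: str) -> list[Check]:
    """Extract task-list items from the checklist, ignoring all other lines."""
    checks = []
    for line in markdown.splitlines():
        match = CHECK_LINE.match(line)
        if match:
            checks.append(Check(text=match.group(2), passed=match.group(1) != " "))
    return checks


def verify_gate(markdown: str) -> tuple[bool, list[str]]:
    """Return (ok, failing check texts). An empty checklist fails the gate."""
    checks = parse_acceptance(markdown)
    failing = [c.text for c in checks if not c.passed]
    return (bool(checks) and not failing, failing)
```

Treating an empty checklist as a failure is a deliberate choice in this sketch: a gate that passes when nothing was checked would defeat its purpose.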
Solo builders and small teams using agentic coding tools like codex or opencode on real projects — not throwaway demos. Especially valuable for regulated or healthcare work where unverified deploys are not OK, and for any project where you want a paper trail of decisions and a refusal to deploy when acceptance criteria slip.
$ cd ~/new-project
$ git init
$ # open your agentic coding tool
/bencium-init
You get 5 questions (product name, problem, primary user, success metric, stack). One model call later you have:
new-project/
├── README.md
├── CLAUDE.md # project-root agent context, auto-loaded
├── PRD.md # 1-page product requirements
├── ARCHITECTURE.md # 1-page system architecture
├── tasks.md # Now (≤15) / Roadmap split
├── ACCEPTANCE.md # 10 testable acceptance checks
└── .harness/
    ├── config.yaml # deploy + rollback + verify commands
    ├── memory.md # hot context, auto-loaded each session
    ├── rules.md # project non-negotiables
    ├── glossary.md
    ├── constraints.md
    └── archive/ # NNNN-decision-*.md, NNNN-retro-*.md
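The archive naming pattern in the tree above (NNNN-decision-*.md, NNNN-retro-*.md) suggests a sequential numbering scheme. The sketch below shows one way such a name could be generated. It is an illustration under assumptions: that decisions and retros share one counter, and that the part after the kind is a slug of the title. `slugify` and `next_archive_name` are hypothetical helpers, not part of the plugin.

```python
import re

# Matches names such as "0007-retro-outage.md"; other files are ignored.
ARCHIVE_NAME = re.compile(r"^(\d{4})-(decision|retro)-.+\.md$")


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def next_archive_name(existing: list[str], kind: str, title: str) -> str:
    """Build the next NNNN-<kind>-<slug>.md name after the highest existing number."""
    numbers = [int(m.group(1)) for name in existing if (m := ARCHIVE_NAME.match(name))]
    next_number = max(numbers, default=0) + 1
    return f"{next_number:04d}-{kind}-{slugify(title)}.md"
```

With an empty archive, `next_archive_name([], "decision", "Use Postgres")` yields `0001-decision-use-postgres.md`.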
Existing projects (ones that already have code) are auto-detected — /bencium-init scans the code and produces a retrospective product spec and architecture without overwriting anything.
Short answer: describe your idea in the chat, then run /bencium-init in the empty project folder. That one command runs a 5-question interview and generates the entire scaffold in a single ~60s pass.
Secret tip: if you don't want to wait for the full cycle, ask for a throwaway design prototype (a quick one-route scaffold) to eyeball the look-and-feel, then fold the chosen direction back into the proper creative phase. That trades process rigor for seeing pixels sooner.
| Command | What it does |
|---|---|
| /bencium-init | Scaffold the harness in the current repo (new or existing project, auto-detected). Also writes .harness/conventions.md. |
| /bencium-next | Two-phase: plan/spec (read-only, waits for your approval) → build (executes). The plan IS the spec. For UI tasks it first offers an optional prototype detour: a throwaway one-route preview to eyeball look-and-feel before the spec is written. |
| /bencium-feature "desc" | Append a feature to the ## Roadmap section of tasks.md. |
| /bencium-promote | Move Roadmap items into Now, or demote stale memory entries to archive. |
| /bencium-decide "title" | Log a technical decision (and why) to .harness/archive/. |
| /bencium-verify | Walk ACCEPTANCE.md against the actual code. Report pass/fail with evidence. Inspects for gamed tests (weakened or skipped checks) and can delegate a skeptical second opinion to an independent reviewer. |
| /bencium-deploy | Verify → deploy → health check → log. Refuses on verify fail. |
| /bencium-rollback "reason" | Run the configured rollback command and log the reason. |
| /bencium-retro | Postmortem after a failure. Proposes memory and acceptance updates. |
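The /bencium-deploy sequence in the table (verify → deploy → health check → log, refusing on verify failure) can be sketched as a short pipeline. This is an illustration only, not the plugin's implementation. The dictionary keys `verify`, `deploy`, and `health_check` are assumed names standing in for whatever .harness/config.yaml actually defines, and the log wording is invented. The injectable `run` parameter lets you exercise the flow without executing real commands.

```python
import subprocess
from typing import Callable

Runner = Callable[[str], bool]


def run_shell(command: str) -> bool:
    """Run a shell command; True when it exits with status 0."""
    return subprocess.run(command, shell=True).returncode == 0


def deploy(commands: dict[str, str], run: Runner = run_shell) -> list[str]:
    """Verify, deploy, then health-check; return the log lines for the attempt."""
    log = []
    # The gate: nothing ships unless verification passes.
    if not run(commands["verify"]):
        log.append("verify: FAIL, deploy refused")
        return log
    log.append("verify: PASS")

    if not run(commands["deploy"]):
        log.append("deploy: FAIL")
        return log
    log.append("deploy: OK")

    healthy = run(commands["health_check"])
    log.append("health check: OK" if healthy else "health check: FAIL, consider rollback")
    return log
```

Returning the log lines instead of printing them keeps the sketch easy to test, and mirrors the idea that every deploy attempt leaves a written trace.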
npx claudepluginhub bencium/bencium-harness --plugin bencium-harness