By sdg9
Spec-driven autonomous-development harness: OpenSpec proposals → frozen holdout contracts → tier-aware adversarial review.
After a story branch is merged into base, archive the change folder, remove the worktree, and delete the local branch.
Run an approved OpenSpec story through the autonomous harness end-to-end (worktree → holdouts → plan → implement → verify → review).
Draft a new OpenSpec proposal — proposal.md + specs/*.delta.md + tasks.md — under openspec/changes/<name>/.
Architectural review for the baton-harness adversarial Phase 4d. Reads the proposal + diff and emits JSON-line findings on coupling, abstraction violations, scalability, and design fit. Use when the harness orchestrator dispatches an adversarial review.
Elite code review for the baton-harness adversarial Phase 4d. Reads the proposal + diff and emits JSON-line findings on bugs, quality, security smells, and missing test coverage. Use when the harness orchestrator dispatches an adversarial review.
Security review for the baton-harness adversarial Phase 4d. Reads the proposal + diff and emits JSON-line findings on authentication, authorization, input validation, secrets, and untrusted-data flows. Use when the harness orchestrator dispatches an adversarial review.
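The three reviewers above all emit JSON-line findings. As a rough illustration of how an orchestrator might consume that output, the sketch below parses one JSON object per line and sets aside anything malformed. The listing does not document the finding schema, so the field names (reviewer, severity, message) and the helper names parseFindings and isFinding are assumptions, not the harness's actual API.

```typescript
// Hypothetical finding shape: the listing only says reviewers emit
// "JSON-line findings"; these field names are assumed for illustration.
interface Finding {
  reviewer: string;
  severity: string;
  message: string;
}

// Type guard: true only for objects carrying all three assumed string fields.
function isFinding(value: unknown): value is Finding {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.reviewer === "string" &&
    typeof v.severity === "string" &&
    typeof v.message === "string"
  );
}

// Parse JSON Lines text: one JSON object per non-empty line.
// Lines that fail to parse or do not match the shape are returned in
// `rejected` rather than aborting the whole batch.
function parseFindings(output: string): { findings: Finding[]; rejected: string[] } {
  const findings: Finding[] = [];
  const rejected: string[] = [];
  for (const raw of output.split("\n")) {
    const line = raw.trim();
    if (line === "") continue;
    try {
      const value: unknown = JSON.parse(line);
      if (isFinding(value)) {
        findings.push(value);
      } else {
        rejected.push(line);
      }
    } catch {
      rejected.push(line);
    }
  }
  return { findings, rejected };
}

// Usage with made-up reviewer output.
const sampleOutput = [
  '{"reviewer":"security","severity":"high","message":"token written to log"}',
  "not json",
  '{"reviewer":"architecture","severity":"low","message":"tight coupling between modules"}',
].join("\n");
const parsed = parseFindings(sampleOutput);
console.log(parsed.findings.length, parsed.rejected.length);
```

Collecting rejected lines instead of throwing is a design choice for this sketch: a single malformed line from one reviewer does not discard the other findings.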
Uses power tools
Uses Bash, Write, or Edit tools
Spec-driven autonomous-development tooling: a harness that drives an LLM through holdout-test generation, planning, implementation, and adversarial review — paired with a local-first kanban dashboard for launching and monitoring runs.
This repository contains two packages:
packages/harness: @baton-tools/harness on npm. A TypeScript CLI plus Claude Code plugin (baton-harness) that orchestrates the autonomous-development loop. Project-agnostic core; per-repo config via harness.config.ts.
packages/workbench: @baton-tools/workbench on npm. A local-first browser kanban for launching and monitoring Claude/Codex terminal sessions against allowlisted projects. Reads OpenSpec changes natively; lights up extra controls when a project has the harness installed.
The two packages can be used independently. The workbench has optional awareness of the harness: projects without harness.config.ts still get a working kanban; projects with it get harness-driven actions on the board.
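The optional-awareness rule can be pictured as a simple feature check. The sketch below is only a model of the behavior described above: the BoardControls type and the controlsFor function are hypothetical names, and the real workbench may detect the harness differently.

```typescript
// Hypothetical model of the controls a project gets on the board.
type BoardControls = {
  kanban: true; // every allowlisted project gets the kanban
  harnessActions: boolean; // extra controls only when the harness is present
};

// Decide controls from a project's top-level file names.
// Assumption: presence of harness.config.ts is the signal, as the README implies.
function controlsFor(topLevelFiles: string[]): BoardControls {
  const hasHarness = topLevelFiles.some((name) => name === "harness.config.ts");
  return { kanban: true, harnessActions: hasHarness };
}

// Usage with made-up file listings.
console.log(controlsFor(["package.json", "openspec"]));
console.log(controlsFor(["package.json", "harness.config.ts"]));
```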
# Install all package dependencies via pnpm workspaces
pnpm install
# Build everything
pnpm build
# Run all tests
pnpm test
# Work in one package
pnpm --filter @baton-tools/harness <script>
pnpm --filter @baton-tools/workbench <script>
We use changesets for per-package versioning and publishing.
See RELEASING.md for the full scriptable recipe — designed so an LLM or human can take a "publish a patch of the harness" instruction end-to-end without interactive prompts.
The two packages version independently. Note one cascading rule: bumping @baton-tools/harness also produces a patch bump of @baton-tools/workbench because the workbench depends on the harness via workspace:* (per updateInternalDependencies: patch in .changeset/config.json).
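The cascading rule can be modelled in a few lines. This is a simplified sketch, not the changesets implementation: planReleases and the internalDeps map are hypothetical, only one level of dependency is considered, and an explicitly requested bump is never downgraded.

```typescript
type Bump = "patch" | "minor" | "major";

// Package names come from the README; the map is a simplified stand-in
// for the workspace:* links between the two packages.
const internalDeps: Record<string, string[]> = {
  "@baton-tools/harness": [],
  "@baton-tools/workbench": ["@baton-tools/harness"],
};

// Given explicitly requested bumps, add a patch bump for every package
// that depends on a bumped package and has no bump of its own.
function planReleases(requested: Record<string, Bump>): Record<string, Bump> {
  const plan: Record<string, Bump> = {};
  for (const pkg of Object.keys(requested)) {
    plan[pkg] = requested[pkg];
  }
  for (const pkg of Object.keys(internalDeps)) {
    if (plan[pkg] !== undefined) continue; // keep explicit bumps as requested
    const dependsOnBumped = internalDeps[pkg].some(
      (dep) => requested[dep] !== undefined
    );
    if (dependsOnBumped) plan[pkg] = "patch";
  }
  return plan;
}

// A harness bump drags the workbench along; the reverse does not happen.
console.log(planReleases({ "@baton-tools/harness": "minor" }));
console.log(planReleases({ "@baton-tools/workbench": "minor" }));
```

The asymmetry is the point of the rule: the workbench depends on the harness, so only harness bumps cascade.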
MIT
npx claudepluginhub sdg9/baton --plugin baton-harness