By huangbaixun
AI Agent Harness Engineering plugin. 19 skills under the harness: namespace covering planning, TDD, brainstorming, debugging, code review, worktree isolation, and verification — built on superpowers v5.1.0 with a harness-delta sidecar layer that integrates features.json, ADRs, and project-local docs.
Sprint feature assignment planner. Analyze the dependency graph and team workload from docs/features.json, output the optimal owner assignment plan + a ready-to-run sprint-kickoff.sh script.
Perform a health audit on the current project's Harness system, generating a score report and optimization recommendations
Generate a canary deployment runbook with risk assessment, staged rollout plan, rollback triggers, and observability checklists. Outputs to deploy/ directory.
Save key decisions and progress from the current session to documentation, for cross-session handoff of long tasks
Initialize the full AI Agent Harness engineering system for the current project
Code quality Inferential Sensor. Invoke in the following scenarios: code quality review before PR merge, architecture convention compliance checks, design evaluation of new business logic, baseline assessment before refactoring. Division of responsibilities with security-reviewer: this Agent focuses on quality/architecture, security-reviewer focuses on security vulnerabilities.
Long-cycle multi-session coding tasks. Invoke when implementing a set of features across multiple sessions: new feature iteration, large-scale refactoring, multi-module collaborative development. Characteristics: task expected to span more than one session, has a features.json feature checklist, requires strict "one feature at a time" constraint to prevent context anxiety. Differs from the main Agent: enforces startup checklist, cross-session state handoff, and a mandatory two-phase Review after each feature (spec compliance → code quality).
Lightweight codebase exploration. Invoke when investigating a specific module, function, or pattern to avoid polluting the main thread context with large file reads. Applicable scenarios: understanding auth/authorization flows, finding reusable utility functions, investigating all implementations of an interface, cross-module dependency analysis.
Professional security code review. Invoke in the following situations: pre-commit review, new authentication/authorization logic, external API integrations, user input handling.
> **Source**: OpenSpec `/opsx:archive` + Handbook § K.6 "Documentation Sync Agent" + Handbook § 2.3 "Structured Handoff Artifacts"
Health check and optimization for existing project Harness setups. Activate when the user mentions "check Harness", "optimize CLAUDE.md", "Agent keeps making mistakes", "Harness health", "audit Harness", "evaluate AI coding environment", "harness audit", "check Agent config", "why won't the Agent follow instructions", "improve Agent effectiveness", "add Harness to existing project", or "legacy optimization". Also use this Skill when the user complains that Agent behavior deviates from expectations, the Agent repeatedly makes the same mistakes, or the project has been around for a while but lacks a systematic Harness framework — diagnose and improve.
You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation.
> Generate a structured deployment runbook with risk-based canary stages, rollback triggers, and observability checklists.
Use when facing 2+ independent tasks that can be worked on without shared state or sequential dependencies
Matches all tools
Hooks run on every tool call, not just specific ones
Executes bash commands
Hook triggers when Bash tool is used
This plugin requires configuration values that are prompted when the plugin is enabled. Sensitive values are stored in your system keychain.
team_name: Team name (used in sprint reports and features.json labels, e.g. 'Backend Team'). Referenced as ${user_config.team_name}.
default_tech_stack: Default tech stack (typescript / python / go / java / generic); auto-selects the template during project init. Referenced as ${user_config.default_tech_stack}.
Modifies files
Hook triggers on file write and edit operations
Uses power tools
Uses Bash, Write, or Edit tools
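As an illustration of a hook that reacts to file write and edit operations, here is a minimal Python sketch of a `.env` protection check. It is not the plugin's actual hook, whose source is not shown on this page. It assumes the Claude Code hook contract in which the hook receives a JSON event on stdin containing `tool_name` and `tool_input.file_path`, and in which exit code 2 blocks the tool call. The function names `is_protected`, `decide`, and `run` are hypothetical.

```python
import json
import sys
from pathlib import PurePosixPath


def is_protected(file_path: str) -> bool:
    """Return True for .env and .env.* files (for example .env.local)."""
    name = PurePosixPath(file_path.replace("\\", "/")).name
    return name == ".env" or name.startswith(".env.")


def decide(event: dict) -> int:
    """Map a hook event to an exit code: 2 blocks the call, 0 allows it."""
    tool = event.get("tool_name", "")
    path = event.get("tool_input", {}).get("file_path", "")
    if tool in {"Write", "Edit"} and path and is_protected(path):
        return 2
    return 0


def run(stream) -> int:
    """Read one JSON event from `stream` and return the exit code."""
    code = decide(json.load(stream))
    if code == 2:
        # Assumption: stderr is reported back to the agent when a call is blocked.
        print("Blocked: .env files are protected", file=sys.stderr)
    return code

# When wired up as a hook script, the entry point would be:
#     sys.exit(run(sys.stdin))
```

Note that this sketch also blocks files such as `.env.example`; a project that wants those editable would need an explicit allow-list.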
Shift your core engineering work from "writing code" to "designing environments where AI agents work reliably."
Harness Engineering Plugin packages this methodology into ready-to-use Skills, Commands, and Agents -- install and go, no extra configuration needed.
Step 1: Install
Option A -- Marketplace (recommended, auto-updates)
In a Claude Code conversation:
/plugin marketplace add https://raw.githubusercontent.com/huangbaixun/harness-engineering/main/.claude-plugin/marketplace.json
After subscribing, select it from the plugin list. Claude Code will prompt you when new versions are available.
Option B -- Clone from GitHub
git clone https://github.com/huangbaixun/harness-engineering.git
claude --plugin-dir ./harness-engineering
Good for local evaluation before committing to long-term use.
Option C -- Official Marketplace (coming soon)
# Available after Anthropic review
claude plugins add harness-engineering
Or search "Harness Engineering" in Cowork and click install.
Step 2: Initialize a new project
In Claude Code, say:
"Help me initialize this project's Harness"
After initialization, your project gets:
| File | Purpose |
|---|---|
| CLAUDE.md | Project memory layer (<=60 lines), the single source of truth |
| init.sh | Session startup script -- runs tool detection before each new session |
| .claude/settings.json | Permission control + Hook registration (incl. SessionStart) |
| .claude/hooks/session-start.sh | SessionStart Hook: restores progress context on session start |
| .claude/hooks/ | Type-check, .env protection, auto-format hooks |
| .claude/skills/writing-plans/ | Pre-implementation planning Skill (triggers for >30 min or 3+ file tasks) |
| .claude/skills/test-driven-development/ | TDD Skill (enforced RED->GREEN->REFACTOR cycle) |
| .claude/skills/verification-before-completion/ | Pre-completion verification Skill (4-layer check before marking done) |
| docs/architecture.md | Architecture diagram -- the agent's spatial awareness doc |
| docs/claude-progress.json | Cross-session progress tracking |
Verify readiness: bash init.sh -- you should see "Harness ready" on success.
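The generated init.sh is a shell script and its contents are not reproduced here. The Python sketch below only illustrates the kind of tool detection the table describes: look up each required tool on PATH and report "Harness ready" when nothing is missing. The tool list and the function names `missing_tools` and `readiness_message` are assumptions made for illustration.

```python
import shutil
from typing import Callable, Iterable, Optional

# Hypothetical tool list; the real init.sh decides what to check.
REQUIRED_TOOLS = ["git", "node", "npm"]


def missing_tools(
    required: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools from `required` that are not found on PATH."""
    return [tool for tool in required if which(tool) is None]


def readiness_message(missing: list[str]) -> str:
    """Build the one-line status reported at the end of the check."""
    if missing:
        return "Missing tools: " + ", ".join(missing)
    return "Harness ready"
```

Calling `print(readiness_message(missing_tools(REQUIRED_TOOLS)))` prints either the success line or the list of tools to install. The `which` parameter is injectable so the check can be tested without touching the real PATH.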
Step 3: Ongoing benefits
The SessionStart Hook automatically restores progress context at the start of every session. The harness:writing-plans / harness:test-driven-development / harness:verification-before-completion workflow Skills engage automatically during implementation, ensuring a complete plan -> implement -> verify loop. Commands let you trigger audits, PR reviews, and entropy scans on demand.
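To make the progress-restore step concrete, here is a minimal Python sketch of what a SessionStart hook could do with docs/claude-progress.json: load the file and build a short summary. It assumes that text printed by a SessionStart hook is added to the session context, as Claude Code's hook documentation describes. The schema of the progress file is not given on this page, so the fields `completed`, `current_feature`, and `notes` are hypothetical, as are the function names.

```python
import json
from pathlib import Path


def load_progress(path: Path) -> dict:
    """Read the progress file, returning an empty record if it is absent."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def summarize_progress(progress: dict) -> str:
    """Render a short text summary of the progress record.

    The keys used here are assumed, not taken from the plugin.
    """
    done = progress.get("completed", [])
    current = progress.get("current_feature") or "none"
    notes = progress.get("notes", "")
    lines = [
        f"Completed features: {len(done)}",
        f"Current feature: {current}",
    ]
    if notes:
        lines.append(f"Handoff notes: {notes}")
    return "\n".join(lines)
```

A hook script built on this would call `print(summarize_progress(load_progress(Path("docs/claude-progress.json"))))` so the summary reaches the agent at the start of the session.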
After installation, these Skills trigger automatically based on your intent -- no need to memorize command names. All Skills use the harness: namespace:
npx claudepluginhub huangbaixun/harness-engineering --plugin harness-engineering