Domain-blind core harness with 6 agents, GAN-like adversarial evaluation, profile system, sprint contracts, and quantified 0-5 grading
Generate a postmortem report for the current harness run. Reads evaluation artifacts, metrics, events, and feature history to produce .harness/postmortem.md with Timeline, Score Trends, Failure Analysis, Process Compliance, and Recommendations sections.
Create a release checkpoint -- version bump, changelog, and git tag.
Write a structured handoff file and checkpoint the current session. Use when context is filling or work needs to pause. The next /session resumes from the handoff automatically. Implements Variant B (Reset-Based Compatibility).
Run the harness in continuous coordinator-driven mode. Advances sprint rounds automatically until all required features pass or a blocker stops the run. Use when you want unattended progress.
Run one supervised sprint round for a harness project. Selects the next failing required feature, negotiates a sprint contract with evaluator review, implements it, and evaluates it. Waits for user confirmation between contract review and implementation.
Advance sprint rounds automatically in continuous mode. Selects the next failing feature, dispatches generator and evaluator, updates state.json, and pauses when pause rules fire. Spawn only when execution mode is continuous.
> Thin wrapper -- edit `plugins/harness/skills/harness/roles/evaluator.md` instead.
> Thin wrapper -- edit `plugins/harness/skills/harness/roles/generator.md` instead.
Set up a harness scaffold. Creates .harness/features.json, .harness/progress.md, and .harness/init.md. Spawn once at project start.
Expand an underspecified project goal into a product spec with a finite required feature set and an explicit execution strategy. Spawn when the user's prompt is too short to define a full app.
Executes bash commands
Hook triggers when Bash tool is used
Uses power tools
Uses Bash, Write, or Edit tools
Has parse errors
Some configuration could not be fully parsed
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Multi-agent sprint orchestration middleware for long-running application development. Two-plugin architecture: a domain-blind core handles all orchestration, and domain skill suites provide profile-specific knowledge.
6 agents: initializer, planner, generator, evaluator, coordinator, releaser.
Dual-runtime: Works with both Claude Code and OpenAI Codex CLI.
Two plugins: harness (core) + harness-sdlc-suite (software delivery domain skills).
Methodology: PDCA (Plan-Do-Check-Act) + two innovations:
The core harness can be used standalone with a custom profile -- no domain skill suite required. Domain suites add pre-built profiles with evaluation criteria, artifact taxonomies, and verification procedures.
References:
# Installs both core harness and SDLC suite
claude plugin install harness@harness
Then reload:
/reload-plugins
# Clone into your project
git clone https://github.com/xuzhijie-ownself/harness.git plugins/harness-repo
# Install (Mac / Linux / Git Bash)
bash plugins/harness-repo/install.sh
# Install (Windows CMD)
plugins\harness-repo\install.bat
The install script copies:
plugins/harness/skills/harness/plugins/harness-sdlc-suite/skills/ (6 skills)If using OpenAI Codex CLI, the plugin auto-loads from the .codex-plugin/plugin.json manifest:
# Clone the repo into your project
git clone https://github.com/xuzhijie-ownself/harness.git plugins/harness-repo
# Codex detects .codex-plugin/plugin.json automatically
# Skills from both plugins are loaded via dual skill paths
codex # start codex in the project directory
# Claude Code marketplace
claude plugin uninstall harness
# Local install (removes core + all domain skills)
bash plugins/harness-repo/install.sh --uninstall
# Marketplace
claude plugin update harness
# Local
cd plugins/harness-repo && git pull && bash install.sh
| Command | Purpose |
|---|---|
/harness:start | Scaffold harness for a new project (run once) |
/harness:session | Run one supervised sprint round |
/harness:run | Continuous coordinator-driven loop (unattended) |
/harness:reset | Checkpoint + handoff when context fills (Variant B) |
/harness:release | Version bump, changelog, and git tag |
| Agent | Spawned by | Reference |
|---|---|---|
| initializer | /harness:start | plugins/harness/skills/harness/roles/initializer.md |
| planner | /harness:start | plugins/harness/skills/harness/roles/planner.md |
| generator | /harness:session, coordinator | plugins/harness/skills/harness/roles/generator.md |
| evaluator | /harness:session, coordinator | plugins/harness/skills/harness/roles/evaluator.md |
| coordinator | /harness:run | plugins/harness/skills/harness/roles/coordinator.md |
| releaser | /harness:release, coordinator | plugins/harness/skills/harness/roles/releaser.md |
The harness follows an adversarial PDCA pattern: the generator produces artifacts (Do) and the evaluator grades them (Check). The generator cannot self-approve; the evaluator cannot edit product artifacts. This separation prevents the common failure mode where a model is too lenient grading its own work. Agent files are thin YAML wrappers -- all instructions live in role files as the single source of truth.
Every evaluation round applies a binary pass/fail Authenticity Gate after domain criteria scoring. This catches technically-competent-but-generic output -- artifacts that score adequately on domain criteria yet show no evidence of project-specific decision-making.
| Dimension | What it checks |
|---|---|
internal_consistency | Artifacts share consistent conventions -- structure, terminology, style form a unified whole |
intentionality | Evidence of project-specific decisions, not unmodified defaults or template output |
craft | Technical fundamentals correct -- hierarchy, structure, naming, formatting |
fitness_for_purpose | Deliverables usable by target audience without additional explanation |
The gate is dual-side: generators apply a pre-implementation checklist (prevention), evaluators apply a post-grading gate (detection). Any dimension failure fails the round regardless of domain scores.
npx claudepluginhub xuzhijie-ownself/harness --plugin harnessSoftware delivery lifecycle domain skills for the harness framework. Provides 5 domain skills (SDLC, EA, BA, SA, Ops) and an index skill for profile routing.
Session harness plugin for Claude Code workflow automation
11 agents, 35 skills, 18 commands, 9 hooks — spec-driven multi-agent orchestration for Claude Code, with optional cross-device semantic memory.
Claude harness - A harness for solo developers (Vibecoders) to handle full-cycle contract development.
Autonomous agent orchestrator for full development lifecycles with zero human input, session budget management, and crash recovery
Harness for Claude Code — skills, /harness:* slash commands, persona subagents, lifecycle hooks, and MCP tools without per-repo `harness setup`. Sibling plugins exist for Cursor, Gemini CLI, and Codex.
Conductor v3 — Multi-agent orchestration with Evaluate-Loop, parallel execution, Board of Directors, and bundled SupaConductor skills for Claude Code