Analyzes tasks to propose and execute optimal orchestration patterns — orchestrates sequential/parallel/team/ralph-loop agents, then runs a post-execution verification gate for spec-originated work
Harness maturity assessment: evaluates the Claude Code harness using a 6-axis, 24-item checklist and a 2×3 analysis matrix
Audits all context documents Claude loads (CLAUDE.md, MEMORY.md, skills, agents, plugins) for outdated claims, contradictions, and risky wording
Design and build agent team architecture (delegates to harness-factory). Run when the user asks to 'generate team', 'build agent team', 'build agent pod', 'set up agent architecture', or 'create agent team'. If harness-factory is installed, reads its SKILL.md and executes it; otherwise shows an install guide.
Run a task as a supervised verification loop — 3 gates (Pass/Fail, Quantitative, Qualitative), iterate until they pass, escalate at autonomy boundaries
Reads CLAUDE.md and .claude/rules/* for a project and evaluates quality via LLM-judgment: length, internal contradictions, ambiguities, placeholder content, progressive disclosure, sensitive file protection. Returns CONTEXT_REPORT JSON. Read-only.
Audits a project's automation and verification posture: test infrastructure, formatter/linter PostToolUse hooks, PreToolUse dangerous-action blocks, verifier-agent separation, and whether registered project skills/hooks are actually invoked in recent sessions. Returns AUTOMATION_REPORT JSON.
Analyzes Claude Code session JSONL files to extract execution patterns: plan-ratio, delegation, parallel usage, handoff, repeated n-grams, tool frequency. Uses jq for efficient extraction of tool_use metadata only — never reads prompt text. Scales via split+parallel when many sessions exist. Returns SESSION_REPORT JSON.
Analyzes installed skills/plugins/MCP servers against ~/.claude.json to detect dead, ghost, duplicate, and collision-prone entries AND report the current accessible state (enabled plugins, runtime skills, connected MCP servers). Returns a structured JSON report. Read-only. Use from /check-harness.
Analyze the user's task and propose the optimal orchestration pattern, then execute it. 4 patterns: Sequential Pipeline, Parallel Subagent, Team Mode, Ralph Loop. Situation-aware pattern selection with user confirmation before execution. Use when: "/agent-orchestrate", "agent-orchestrate", "orchestration", "which pattern", "run in parallel", "run sequentially", "team mode", "agent pattern", "suggest execution pattern", "how should we run this", "pick a pattern". Also trigger when the user describes a complex multi-step task that would clearly benefit from agent coordination — e.g., "analyze companies A, B, C", "design, implement, and review", "do these in order", "run 3 at once", or any task with 3+ subtasks where choosing the right execution pattern matters for efficiency.
Harness maturity diagnosis — evaluates the harness cycle (Scaffolding → Context → Planning → Execution → Verification → Compounding) using a **6-axis 24-item checklist** and a **2×3 analysis matrix** (Static/Behavioral/Growth × User/Project). All judgments stem from the gap between "what is set up (Static) ↔ what is actually done (Behavioral)" or "whether the harness is growing (Growth)". Runs 4 subagents in parallel (skill-portfolio-analyzer, session-pattern-analyzer, context-quality-reviewer, project-automation-auditor). session-pattern-analyzer is run twice — once for User global scope and once for the current project — to separate User/Project scopes. Use whenever the user asks to audit their Claude Code harness, review skill portfolio health, evaluate execution patterns across sessions, check project context/rules quality, or wants to know what's missing in their AI setup — even if they don't say "check-harness" explicitly. Trigger: "/check-harness", "check harness", "harness check", "harness audit", "settings check", "what's missing", "harness diagnosis", "maturity check", "my claude setup", "skill cleanup".
Use this skill when the user wants to audit the memory and documents Claude Code loads into context — CLAUDE.md (user global + project + nested), MEMORY.md, @imports, .claude/skills, .claude/agents, .claude/commands, installed plugins — and detect three kinds of issues: outdated claims, mutually contradictory statements, and risky-or-ambiguous wording. Produces a prioritized improvement list at `.drift-reports/`. Zero config. Trigger phrases: "doc drift", "memory drift", "memory audit", "context drift", "docs audit", "document review", "document audit", "memory check", "outdated docs", "document conflict".
Run a task as a supervised verification loop instead of a one-shot prompt. Establishes a loop contract (3 gates: Pass/Fail, Quantitative, Qualitative), then iterates Work → Verify → Fix until every gate passes — emitting an objective evidence report before declaring done. Stops and escalates to a human when an autonomy boundary is crossed (schema change, data-loss migration, auth/payment/security, or a change that conflicts with the spec). Implements the "Ralph loop" technique — the iterative-refinement pattern that agent-orchestrate selects as its Loop pattern. Use when: "/loop", "loop", "run until it passes", "iterate until tests pass", "supervise this until done", "loop.md", "verification loop", "ralph loop", "don't stop until the gates pass", "keep going until criteria met".
Systematically QA test any application — web apps, native macOS apps, Electron apps, CLI tools, interactive REPLs, or anything on screen. Three modes: browser (agent-browser/Playwright, fast, DOM-level), computer (MCP computer-use, screenshot + pixel clicks, any app), and cli (tmux, send-keys + capture-pane for interactive terminals). Auto-selects mode or accepts --browser / --computer / --cli override. Use when asked to "qa", "QA", "test this site", "test this app", "find bugs", "test and fix", "fix what's broken", "dogfood", "exploratory test", "bug hunt", "QA this app", "site test", "app test", "browser QA", "screen test", "native app test". Three tiers: Quick (critical/high only), Standard (+ medium), Exhaustive (+ cosmetic). Produces before/after health scores, fix evidence, and a ship-readiness summary.
Uses power tools
Uses Bash, Write, or Edit tools
A Claude Code plugin for Harness Engineering.
The art of designing environments where AI agents work well: 10 skills covering the full harness lifecycle (diagnose, plan, build, test, and maintain).
| Skill | Description | Command |
|---|---|---|
| check-harness | Diagnose harness maturity via 6-axis / 24-item checklist + 2×3 matrix (Static/Behavioral/Growth × User/Project). Runs 4 parallel subagents. | /harness-ops:check-harness |
| scaffold | Interview-driven greenfield scaffolding — code structure, test infra, guard rails, and CLAUDE.md with domain context | /harness-ops:scaffold |
| specify | Turn a goal into a structured implementation plan: L0 Goal → L1 Context → L2 Decisions → L3 Requirements → L4 Tasks (spec.md) | /harness-ops:specify "goal" |
| requirements-interview | Socratic requirements interview — clarifies ambiguous goals through structured questioning | /harness-ops:requirements-interview "topic" |
| qa | Systematically QA test any app — auto-selects browser / computer / CLI mode, produces before/after health score and fix report | /harness-ops:qa [target] |
| context-audit | Audit all context documents Claude loads (CLAUDE.md, MEMORY.md, skills, agents, plugins) for outdated claims, contradictions, and risky wording | /harness-ops:context-audit |
| agent-orchestrate | Analyze a task and execute the optimal orchestration pattern (sequential / parallel / team / ralph-loop) | /harness-ops:agent-orchestrate "task" |
| loop | Run a task as a supervised verification loop — establishes a 3-gate contract (Pass/Fail · Quantitative · Qualitative), iterates Work→Verify→Fix until gates pass, emits an evidence report, and escalates at autonomy boundaries (schema / migration / auth / payment / spec conflict) | /harness-ops:loop "task" |
| worktree | Create / list / remove isolated git worktrees so you can run independent parallel Claude Code sessions on separate branches without interference | /harness-ops:worktree [create|list|remove] |
| generate-team | Design and build an agent team architecture — delegates to the harness-factory plugin (must be installed separately) | /harness-ops:generate-team [description] |
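The loop skill's Work→Verify→Fix contract from the table above can be pictured roughly as follows. This is a minimal sketch, not the plugin's implementation: the `Gate` class, `run_loop`, `fake_work`, the state keys, and the `max_iterations` cap are all hypothetical names and assumptions introduced for illustration. Only the three gate names and the idea of escalating at an autonomy boundary come from the skill description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Gate:
    """One gate in the loop contract (hypothetical structure)."""
    name: str
    check: Callable[[dict], bool]


def run_loop(state, work, gates, crosses_boundary, max_iterations=5):
    """Iterate Work -> Verify until every gate passes, or stop early."""
    evidence: List[Dict[str, bool]] = []
    for iteration in range(1, max_iterations + 1):
        state = work(state)  # Work (acts as Fix on later iterations)
        if crosses_boundary(state):
            # Autonomy boundary: stop and hand the decision to a human.
            return {"status": "escalated", "iterations": iteration, "evidence": evidence}
        results = {gate.name: gate.check(state) for gate in gates}  # Verify
        evidence.append(results)
        if all(results.values()):
            return {"status": "passed", "iterations": iteration, "evidence": evidence}
    # Assumption: an iteration cap, which the skill description does not mention.
    return {"status": "exhausted", "iterations": max_iterations, "evidence": evidence}


def fake_work(state):
    """Toy work step: fixes one failing test and adds 10 points of coverage."""
    return {
        **state,
        "failing_tests": max(0, state["failing_tests"] - 1),
        "coverage": state["coverage"] + 10,
    }


gates = [
    Gate("pass_fail", lambda s: s["failing_tests"] == 0),
    Gate("quantitative", lambda s: s["coverage"] >= 80),
    Gate("qualitative", lambda s: s["review_ok"]),
]

result = run_loop(
    {"failing_tests": 2, "coverage": 70, "review_ok": True, "touches_schema": False},
    fake_work,
    gates,
    crosses_boundary=lambda s: s["touches_schema"],
)
print(result["status"], result["iterations"])  # prints: passed 2
```

The `evidence` list keeps one gate-result record per iteration, which mirrors the idea of emitting an evidence report before declaring the task done.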
Four specialized subagents used internally by check-harness:
| Agent | Role |
|---|---|
| skill-portfolio-analyzer | Evaluates installed skills coverage and gaps |
| session-pattern-analyzer | Analyzes execution patterns from session history |
| context-quality-reviewer | Reviews CLAUDE.md and context document quality |
| project-automation-auditor | Audits hooks, automations, and workflow integrations |
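To make the session-pattern-analyzer's tool-frequency metric concrete, here is a small sketch that counts tool calls from session JSONL lines. The agent itself is described as using jq; Python is swapped in here purely for illustration. The record shape (a `message.content` list holding blocks with `"type": "tool_use"` and a `name`) is an assumption about the session format, and `tool_frequency` and `sample_lines` are hypothetical names.

```python
import json
from collections import Counter
from typing import Iterable


def tool_frequency(lines: Iterable[str]) -> Counter:
    """Count tool_use block names across JSONL lines (assumed record shape)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than failing the whole run
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        if not isinstance(message, dict):
            continue
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content (prompt text) is not inspected
        for block in content:
            # Only tool_use metadata is looked at; text blocks are skipped.
            if isinstance(block, dict) and block.get("type") == "tool_use":
                counts[block.get("name", "unknown")] += 1
    return counts


sample_lines = [
    '{"type": "user", "message": {"content": "prompt text is never inspected"}}',
    '{"type": "assistant", "message": {"content": [{"type": "text", "text": "ok"}, {"type": "tool_use", "name": "Bash", "input": {}}]}}',
    '{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "Read", "input": {}}, {"type": "tool_use", "name": "Bash", "input": {}}]}}',
    "not json",
]
print(tool_frequency(sample_lines).most_common())  # prints: [('Bash', 2), ('Read', 1)]
```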
# 1. Add the harness marketplace (one-time, run from repo root)
claude plugin marketplace add .
# 2. Install the plugin globally
claude plugin install harness-ops@harness-ops-marketplace
# Diagnose your current project's harness
/harness-ops:check-harness
# Scaffold a new project with AI-optimized structure
/harness-ops:scaffold
# Clarify unclear requirements through structured interview
/harness-ops:requirements-interview "I'm not sure what to build"
# Turn a goal into a full implementation plan
/harness-ops:specify "implement user authentication"
# QA test a running app
/harness-ops:qa http://localhost:3000
# Check if your context docs are stale or contradictory
/harness-ops:context-audit
The repo ships .gemini/commands/harness-ops/*.toml files — Gemini CLI reads the harness-ops/ subdirectory name as the namespace prefix, registering skills as /harness-ops:skill-name.
/harness-ops:check-harness
/harness-ops:specify "implement user authentication"
/harness-ops:qa http://localhost:3000
/harness-ops:scaffold
/harness-ops:requirements-interview "topic"
/harness-ops:context-audit
/harness-ops:agent-orchestrate "task"
/harness-ops:loop "task"
/harness-ops:worktree create feature/new-task
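The namespacing rule described above (subdirectory name becomes the prefix) can be illustrated with a short sketch. `command_name` is a hypothetical helper written for this example, not part of Gemini CLI, and it assumes the command name is simply the path below the commands directory with path separators replaced by colons and the `.toml` suffix dropped.

```python
from pathlib import PurePosixPath


def command_name(toml_path: str, commands_root: str = ".gemini/commands") -> str:
    """Derive a slash-command name from a .toml path (assumed mapping)."""
    relative = PurePosixPath(toml_path).relative_to(commands_root)
    # Drop the .toml suffix, then join directory and file name with ":".
    return "/" + ":".join(relative.with_suffix("").parts)


print(command_name(".gemini/commands/harness-ops/check-harness.toml"))
# prints: /harness-ops:check-harness
```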
# Symlink the harness-ops commands into the Gemini global commands directory
ln -s /path/to/harness-ops/.gemini/commands/harness-ops ~/.gemini/commands/harness-ops
After linking, open any Gemini CLI session and type /harness-ops to see all 9 skills.
| | Claude Code | Gemini CLI |
|---|---|---|
| Install | claude plugin install harness-ops | Symlink .gemini/commands/harness-ops/ |
| Invoke | /harness-ops:check-harness | /harness-ops:check-harness |
| Command definitions | commands/*.md | .gemini/commands/harness-ops/*.toml |
| Skill logic | skills/{name}/SKILL.md | same file (embedded in toml) |
npx claudepluginhub hyunho058/harness-ops --plugin harness-ops