From harness
Configures a harness. A meta-skill that defines specialized agents and generates the skills they will use. (1) Used when requesting 'build a harness' or 'set up a harness', (2) when requesting 'harness design' or 'harness engineering', (3) when building a harness-based automation system for a new domain/project, (4) when reorganizing or expanding a harness configuration, (5) when requesting operations/maintenance of an existing harness such as 'check harness', 'audit harness', 'harness status', or 'sync agents/skills'.
Slash command: /harness:harness
A meta-skill that configures a harness tailored to the domain/project, defines the roles of each agent, and generates the skills they will use.
Core Principles:
.claude/agents/) and skills (.claude/skills/).
When the harness skill is triggered, the first step is to check the current status of the existing harness.
Read project/.claude/agents/, project/.claude/skills/, and project/CLAUDE.md
Determine execution mode based on the current state:
Phase Selection Matrix for Existing Expansion:
| Change Type | Phase 1 | Phase 2 | Phase 3 | Phase 4 | Phase 5 | Phase 6 |
|---|---|---|---|---|---|---|
| Add Agent | Skip (Use Phase 0 results) | Placement decision only | Required (incl. 3-0) | If dedicated skill is needed (incl. 4-0) | Modify orchestrator | Required |
| Add/Modify Skill | Skip | Skip | Skip | Required (incl. 4-0) | If connection changes | Required |
| Change Architecture | Skip | Required | Affected agents only (incl. 3-0) | Affected skills only (incl. 4-0) | Required | Required |
Cross-reference the existing agent/skill list with the CLAUDE.md records to detect any drift
Summarize the audit results for the user and get confirmation on the execution plan
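The audit inputs described above can be gathered with a short script. This is a minimal sketch rather than part of the harness skill: the function names `inventory_harness` and `find_unrecorded` are hypothetical, and the drift check is a naive substring match against the CLAUDE.md text, which is an assumption about how records are kept.

```python
from pathlib import Path


def inventory_harness(project_root: str) -> dict:
    """List agent definitions and skills found under <project>/.claude/."""
    root = Path(project_root)
    agents_dir = root / ".claude" / "agents"
    skills_dir = root / ".claude" / "skills"
    # An agent is any {name}.md file; a skill is any folder holding a SKILL.md.
    agents = sorted(p.stem for p in agents_dir.glob("*.md")) if agents_dir.is_dir() else []
    skills = sorted(p.parent.name for p in skills_dir.glob("*/SKILL.md")) if skills_dir.is_dir() else []
    return {
        "agents": agents,
        "skills": skills,
        "has_claude_md": (root / "CLAUDE.md").is_file(),
    }


def find_unrecorded(names, claude_md_text: str) -> list:
    """Return names that never appear in CLAUDE.md (assumed signal of drift)."""
    return [name for name in names if name not in claude_md_text]
```

An empty inventory suggests a new build, while a non-empty one suggests expansion or maintenance; the final decision still belongs to the audit summary confirmed with the user.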
Agent Teams is the default mode and the first option to evaluate whenever two or more agents collaborate. Team members self-coordinate through direct communication (SendMessage) and a shared task list (TaskCreate), which improves output quality because members share findings, debate conflicts, and fill each other's gaps.
| Mode | When to Use | Characteristics |
|---|---|---|
| Agent Teams (Default) | Collaboration of 2+ agents, real-time coordination/feedback exchange needed, mutual reference to intermediate outputs | Self-coordinate via TeamCreate + SendMessage + TaskCreate |
| Sub-agents (Alternative) | Single agent task, returning results to the main agent is sufficient, team communication overhead is excessive | Direct invocation of the Agent tool, parallelized with run_in_background |
| Hybrid | Different characteristics per Phase — e.g., parallel collection (Sub) → consensus-based integration (Team) | Mixed configuration of Team/Sub mode per Phase |
Decision Order:
For detailed comparison and decision trees by pattern, see "Execution Mode" in references/agent-design-patterns.md.
references/agent-design-patterns.md)
Decide based on four axes: expertise, parallelism, context size, and reusability. For details, see "Agent Separation Criteria" in references/agent-design-patterns.md. Duplication and reuse reviews for existing agents are covered in Phase 3-0.
Before creating a new agent, check for duplication with existing agents in project/.claude/agents/. Re-building harnesses repeatedly can lead to redundant agents accumulating under different names.
See "Agent Reuse Design" in references/agent-design-patterns.md for duplication classification and reuse design.
All agents must be defined as project/.claude/agents/{name}.md files. Direct prompt definitions inside the Agent tool are strictly prohibited. Reasons:
If using built-in types (general-purpose, Explore, Plan), still create the agent definition file. Specify the built-in type in the subagent_type parameter of the Agent tool, and include roles, principles, and protocols in the agent definition file.
Model Setting: All agents must use model: "opus". Always specify the model: "opus" parameter when invoking the Agent tool. The quality of a harness is directly tied to the reasoning ability of its agents, and opus guarantees the highest quality.
Team Reorganization: Only one agent team can be active per session, but the team can be dismantled and reorganized between Phases. If different specialist combinations are needed per Phase (e.g., in a Pipeline pattern), save intermediate outputs as files, delete the previous team, and create a new one.
Define each agent in project/.claude/agents/{name}.md. Required sections: Core Role, Operating Principles, Input/Output Protocols, Error Handling, Collaboration. In Agent Teams mode, add a ## Team Communication Protocol section to specify message recipients/senders and task request scopes.
See "Agent Definition Structure" in references/agent-design-patterns.md and references/team-examples.md for definition templates and actual files.
Mandatory requirements when including a QA agent:
Use the general-purpose type (Explore is read-only and cannot execute verification scripts)
See references/qa-agent-guide.md for the detailed guide
Create the skills to be used by each agent at project/.claude/skills/{name}/SKILL.md. See references/skill-writing-guide.md for a detailed writing guide.
Before creating a new skill, check for duplication with existing skills in project/.claude/skills/. Re-building harnesses repeatedly can lead to redundant skills accumulating under different names.
See "Skill Reuse Design" in references/skill-writing-guide.md for duplication classification and generalization patterns.
skill-name/
├── SKILL.md (Required)
│   ├── YAML frontmatter (name, description required)
│   └── Markdown body
└── Bundled Resources (Optional)
    ├── scripts/ - Execution code for repetitive/deterministic tasks
    ├── references/ - Reference documents loaded conditionally
    └── assets/ - Files used in output (templates, images, etc.)
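The layout above can be scaffolded programmatically. The following is a minimal sketch under stated assumptions: `scaffold_skill` is a hypothetical helper, and the description is written as a JSON-quoted string so that colons inside it do not break the YAML frontmatter.

```python
import json
from pathlib import Path


def scaffold_skill(skills_root: str, name: str, description: str) -> Path:
    """Create skill-name/ with SKILL.md and the optional resource folders."""
    skill_dir = Path(skills_root) / name
    for sub in ("scripts", "references", "assets"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.exists():  # never overwrite an existing skill
        frontmatter = (
            "---\n"
            f"name: {name}\n"
            f"description: {json.dumps(description)}\n"
            "---\n"
        )
        skill_md.write_text(frontmatter + f"\n# {name}\n", encoding="utf-8")
    return skill_md
```

Calling `scaffold_skill("project/.claude/skills", "pdf-tools", "...")` returns the path of the new SKILL.md; the body still has to be written by hand.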
The description is the skill's sole trigger mechanism. Because Claude tends to be conservative about triggering skills, write descriptions assertively ("pushy").
Bad Example: "A skill that processes PDF documents"
Good Example: "Performs all PDF tasks including reading PDF files, extracting text/tables, merging, splitting, rotating, watermarking, encrypting/decrypting, and OCR. Must use this skill whenever a .pdf file is mentioned or a PDF output is requested."
Key: Describe both what the skill does and the specific trigger situations, distinguishing it from similar cases that should not trigger it.
| Principle | Description |
|---|---|
| Explain the Why | Instead of coercive instructions like "ALWAYS/NEVER", explain the reasoning. When LLMs understand the reasons, they make correct decisions even in edge cases. |
| Keep it Lean | The context window is a shared resource. Aim to keep the SKILL.md body under 500 lines. Delete lightweight details or move them to references/. |
| Generalize | Explain the underlying principles to cover diverse inputs rather than writing narrow rules tailored only to specific examples. Avoid overfitting. |
| Bundle Repetitive Code | If agents write identical scripts repeatedly during test runs, bundle them in scripts/ beforehand. |
| Use Imperative Style | Use an imperative/directive tone (e.g., "Do this", "Run that" instead of "You can do this"). |
Skills manage context using a 3-tier loading system:
| Tier | Loading Timing | Size Target |
|---|---|---|
| Metadata (name + description) | Always present in context | ~100 words |
| SKILL.md Body | When the skill triggers | <500 lines |
| references/ | On-demand only | Unlimited (scripts can execute without loading) |
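The size targets in the table can be checked mechanically. This is a rough sketch, not an official linter: `check_skill_size` is a hypothetical name, the frontmatter is assumed to be delimited by two `---` lines, and the word count includes the `name:` and `description:` keys themselves.

```python
def split_frontmatter(text: str):
    """Split SKILL.md text into (frontmatter_lines, body_lines)."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return lines[1:i], lines[i + 1:]
    return [], lines  # no frontmatter found: treat everything as body


def check_skill_size(text: str, max_body_lines: int = 500, max_meta_words: int = 100) -> list:
    """Return warnings when the metadata or body exceeds the size targets."""
    meta, body = split_frontmatter(text)
    warnings = []
    meta_words = sum(len(line.split()) for line in meta)
    if meta_words > max_meta_words:
        warnings.append(f"metadata has {meta_words} words (target ~{max_meta_words})")
    if len(body) >= max_body_lines:
        warnings.append(f"body has {len(body)} lines (target <{max_body_lines}); move details to references/")
    return warnings
```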
Size Management Rules:
cloud-deploy/
├── SKILL.md (Workflow + Selection Guide)
└── references/
    ├── aws.md ← Load only when AWS is selected
    ├── gcp.md
    └── azure.md
See references/skill-writing-guide.md for detailed writing patterns, examples, and data schema standards.
An orchestrator is a special type of skill that coordinates the overall team by wiring individual agents and skills into a single workflow. While individual skills created in Phase 4 define "what and how each agent does", the orchestrator defines "who collaborates when and in what order". See references/orchestrator-template.md for templates.
Modifying Orchestrator during Existing Expansion: When expanding an existing harness instead of building a new one, modify the existing orchestrator instead of creating a new one. Reflect the new agent in the team composition, task allocation, and data flow, and add trigger keywords related to the new agent in the description.
The orchestrator pattern varies based on the execution mode selected in Phase 2-1:
Agent Teams Pattern (Default):
The orchestrator configures the team via TeamCreate and assigns tasks via TaskCreate. Team members coordinate using SendMessage to communicate directly. The leader (orchestrator) monitors progress and aggregates results.
[Orchestrator/Leader]
├── TeamCreate(team_name, members)
├── TaskCreate(tasks with dependencies)
├── Team members self-coordinate (SendMessage)
├── Collect and aggregate results
└── Clean up team
Sub-agents Pattern (Alternative):
The orchestrator invokes sub-agents directly using the Agent tool. Parallel execution is managed with run_in_background: true, and results are returned only to the main orchestrator. Used when team communication is unnecessary and overhead reduction is preferred.
[Orchestrator]
├── Agent(agent-1, run_in_background=true)
├── Agent(agent-2, run_in_background=true)
├── Wait and collect results
└── Generate integrated output
Hybrid Pattern: Mix different modes per Phase. Common combinations:
TeamDelete and new TeamCreate for each Phase, with sub-agent invocations in between
If Hybrid is chosen, state the execution mode at the top of each Phase section in the orchestrator (e.g., **Execution Mode:** Agent Teams).
Specify data passing methods between agents in the orchestrator:
| Strategy | Method | Applicable Mode | Suitable Cases |
|---|---|---|---|
| Message-based | Direct communication via SendMessage | Team | Real-time coordination, feedback exchange, lightweight state transfer |
| Task-based | Share task status via TaskCreate/TaskUpdate | Team | Progress tracking, dependency management, task requests |
| File-based | Read/write files at agreed paths | Team + Sub | Large data volumes, structured outputs, audit trails required |
| Return-based | Return messages from the Agent tool | Sub | Orchestrator directly collecting sub-agent results |
Recommended Combination (Team mode): Task-based (coordination) + File-based (artifacts) + Message-based (real-time communication)
Recommended Combination (Sub mode): Return-based (result collection) + File-based (large artifacts)
Hybrid: Apply matching combinations based on the execution mode of each Phase.
Rules for File-based passing:
_workspace/ folder under the working directory to store intermediate outputs
{phase}_{agent}_{artifact}.{ext} (e.g., 01_analyst_requirements.md)
_workspace/ (for post-verification and audit trails)
Include error handling policies in the orchestrator. Core principles: retry once; if the task fails again, proceed without the result and note the omission in the report; never delete conflicting data, but record it alongside its source.
See "Error Handling" in references/orchestrator-template.md for a strategy table by error type and implementation details.
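The `{phase}_{agent}_{artifact}.{ext}` naming rule and the retry-once policy can be illustrated in a few lines. This is only a sketch: `artifact_path` and `run_with_one_retry` are hypothetical helpers, and the two-digit zero-padded phase number is inferred from the `01_analyst_requirements.md` example rather than stated as a rule.

```python
from pathlib import Path


def artifact_path(phase: int, agent: str, artifact: str, ext: str,
                  workspace: str = "_workspace") -> Path:
    """Build an intermediate-output path following {phase}_{agent}_{artifact}.{ext}."""
    return Path(workspace) / f"{phase:02d}_{agent}_{artifact}.{ext}"


def run_with_one_retry(task, label: str, omissions: list):
    """Run task(); retry once on failure; on a second failure record the omission."""
    last_error = None
    for _attempt in (1, 2):
        try:
            return task()
        except Exception as exc:  # broad on purpose: any failure triggers the policy
            last_error = exc
    # Proceed without the result, but make the gap visible in the final report.
    omissions.append(f"{label}: omitted after 2 failed attempts ({last_error})")
    return None
```

The `omissions` list is meant to be appended to the final report so that a missing result is stated explicitly rather than silently dropped.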
| Task Scale | Recommended Team Size | Tasks per Member |
|---|---|---|
| Small scale (5~10 tasks) | 2~3 members | 3~5 tasks |
| Medium scale (10~20 tasks) | 3~5 members | 4~6 tasks |
| Large scale (20+ tasks) | 5~7 members | 4~5 tasks |
Larger teams increase coordination overhead. A focused team of 3 is better than a distracted team of 5.
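The sizing table can be expressed as a small lookup. This is a sketch with assumptions: `recommend_team_size` is a hypothetical name, the overlapping boundaries are resolved by treating exactly 10 tasks as medium and exactly 20 as large, and workloads below 5 tasks (which the table does not cover) fall back to the small-scale row.

```python
def recommend_team_size(task_count: int) -> tuple:
    """Return (min_members, max_members) for a given number of tasks."""
    if task_count < 0:
        raise ValueError("task_count must be non-negative")
    if task_count >= 20:   # large scale
        return (5, 7)
    if task_count >= 10:   # medium scale
        return (3, 5)
    return (2, 3)          # small scale (also used below 5 tasks, by assumption)
```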
After completing the harness configuration, register a minimal pointer in the project's CLAUDE.md. Since CLAUDE.md is loaded in every new session, recording the harness's presence and trigger rules is sufficient for the orchestrator skill to handle the rest.
CLAUDE.md Template:
## Harness: {Domain Name}
**Goal:** {One-line core goal of the harness}
**Trigger:** Use the `{orchestrator-skill-name}` skill for any `{domain}`-related task requests. Simple questions can be answered directly.
**Change History:**
| Date | Change Details | Target | Reason |
|:---|:---|:---|:---|
| {YYYY-MM-DD} | Initial Configuration | All | - |
Do not include in CLAUDE.md: Agent lists, skill lists, directory structures, or detailed execution rules. Reason: These are managed in the orchestrator skill, .claude/agents/, and .claude/skills/, so including them in CLAUDE.md creates redundancy. Directory structures can be checked directly in the file system. CLAUDE.md only contains the pointer (trigger rules) + change history.
The orchestrator must handle not only initial execution but also subsequent follow-up tasks. Guarantee the following three aspects:
1. Include follow-up keywords in the orchestrator description: Initial creation keywords alone will not trigger follow-up requests. Make sure to include the following follow-up expressions in the description:
2. Add context check step in Phase 1 of the orchestrator: Check the existence of existing artifacts at the start of the workflow to determine the execution mode:
_workspace/ exists + user requests partial modifications → Partial Re-run (re-invoke only the affected agent)
_workspace/ exists + user provides new input → New Run (move the existing _workspace to _workspace_prev/)
_workspace/ does not exist → Initial Run
3. Include re-invocation instructions in agent definitions:
Explicitly state "actions when prior artifacts exist" in each agent .md file:
See "Phase 0: Context Check" in references/orchestrator-template.md for the orchestrator template.
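The three-way context check on `_workspace/` can be sketched as follows. The function names and the returned mode strings are hypothetical, and replacing an older `_workspace_prev/` when archiving is an assumption, since the text only says to move the existing workspace there.

```python
import shutil
from pathlib import Path


def decide_run_mode(workdir: str, partial_change_requested: bool) -> str:
    """Map the state of _workspace/ and the request type to an execution mode."""
    workspace = Path(workdir) / "_workspace"
    if not workspace.is_dir():
        return "initial"          # no prior artifacts
    if partial_change_requested:
        return "partial-rerun"    # re-invoke only the affected agent
    return "new-run"              # new input: archive old artifacts first


def archive_workspace(workdir: str) -> None:
    """Move _workspace/ to _workspace_prev/ before a new run."""
    workspace = Path(workdir) / "_workspace"
    previous = Path(workdir) / "_workspace_prev"
    if previous.exists():
        shutil.rmtree(previous)   # assumption: only one previous run is kept
    workspace.rename(previous)
```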
Validate the generated harness. See references/skill-testing-guide.md for detailed testing methodologies.
run_in_background settings, and return-value collection logic
Perform actual execution tests for each generated skill:
scripts/ beforehand.
Verify that each skill's description triggers correctly:
Key point for near-miss tests: Obviously unrelated queries like "write a fibonacci function" have no test value. Focus on ambiguous boundary queries like "extract the charts from this excel file into PNGs" (xlsx skill vs image conversion).
Confirm that there are no trigger conflicts with existing skills at this stage.
Add a ## Test Scenarios section to the orchestrator skill
A harness is not a static artifact created once and forgotten. It is a system that continuously evolves based on user feedback.
Request feedback from the user upon completion of each harness execution:
Proceed if there is no feedback. Do not force it, but always provide the opportunity.
Map different feedback types to their target modifications:
| Feedback Type | Modification Target | Example |
|---|---|---|
| Output Quality | Affected Agent's Skill | "Analysis is too shallow" → Add depth criteria to skill |
| Agent Roles | Agent Definition .md | "Security review is also needed" → Add a new agent |
| Workflow Sequence | Orchestrator Skill | "Verification should come first" → Change Phase sequence |
| Team Composition | Orchestrator + Agents | "These two can be merged" → Merge agents |
| Missing Trigger | Skill Description | "Does not work with this expression" → Expand description |
Record all modifications in the CLAUDE.md Change History table (identical to the table in Phase 5-4):
**Change History:**
| Date | Change Details | Target | Reason |
|:---|:---|:---|:---|
| 2026-04-05 | Initial Configuration | All | - |
| 2026-04-07 | Add QA Agent | agents/qa.md | Feedback on lack of output validation |
| 2026-04-10 | Add Tone Guide | skills/content-creator | "Too formal" feedback |
Track the direction of the harness's evolution through this history to prevent regressions.
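Appending a row to the Change History table can be automated. The following is a minimal sketch: `append_change_row` is a hypothetical helper that locates the table by its header row, and it assumes that no cell value contains a `|` character.

```python
def append_change_row(claude_md: str, date: str, change: str,
                      target: str, reason: str) -> str:
    """Return CLAUDE.md text with one new row at the end of the Change History table."""
    lines = claude_md.splitlines()
    header_idx = next(
        (i for i, line in enumerate(lines)
         if line.strip().startswith("| Date | Change Details")),
        None,
    )
    if header_idx is None:
        raise ValueError("Change History table not found")
    # Walk past the header, the separator, and every existing row.
    end = header_idx + 1
    while end < len(lines) and lines[end].strip().startswith("|"):
        end += 1
    lines.insert(end, f"| {date} | {change} | {target} | {reason} |")
    return "\n".join(lines) + "\n"
```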
Propose evolution not only when the user explicitly requests "modify harness," but also when:
Perform audits, modifications, and syncing of existing harnesses systematically. Enter this workflow when branching to "Operations/Maintenance" in Phase 0.
Step 1: Status Audit
Compare .claude/agents/ with the agent configuration in the orchestrator skill → Generate a mismatch list
Compare .claude/skills/ with the skill configuration in the orchestrator skill → Generate a mismatch list
Step 2: Incremental Additions/Modifications
Step 3: Update CLAUDE.md Change History
Step 4: Verify Changes
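The mismatch list from Step 1 is a two-way set difference. This is a sketch only: `find_mismatches` is a hypothetical name, and extracting the declared names from the orchestrator skill is left out because that file's format is not specified here.

```python
def find_mismatches(on_disk, declared) -> dict:
    """Compare names found on disk with names declared in the orchestrator skill."""
    disk, decl = set(on_disk), set(declared)
    return {
        # Declared by the orchestrator but no matching file exists.
        "missing_file": sorted(decl - disk),
        # File exists but the orchestrator never references it.
        "undeclared": sorted(disk - decl),
    }
```

The same function works for both agents and skills; an empty result in both keys means the directory and the orchestrator agree.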
Verify upon completion:
project/.claude/agents/: Agent definition files must be created (required even for built-in types)
project/.claude/skills/: Skill files (SKILL.md + references/)
model: "opus" parameter explicitly stated in all Agent invocations
.claude/commands/: Nothing generated here
references/agent-design-patterns.md
references/team-examples.md
references/orchestrator-template.md
references/skill-writing-guide.md: Writing patterns, examples, and data schema standards
references/skill-testing-guide.md: Testing/evaluation/iterative improvement methodologies
references/qa-agent-guide.md: Reference when including a QA agent in a build harness. Covers integration coherence verification, boundary bug patterns, and QA agent definition templates. Based on 7 real-world bug cases.
npx claudepluginhub jerstadgeirivar-gmail/harness --plugin harness