From guild
Antagonist panel coordinator for the guild-* substrate family. Composes guild-spawn to run evaluator agents in parallel, parses each evaluator's verdict, and aggregates findings into a single panel verdict (approved | flagged | flagged-conflict) with blocking/advisory split and conflict surfacing. Internal substrate skill — composed by ev-loop and any other loop style that wants antagonist panel coordination. Does not iterate, does not auto- resolve conflicts.
How this skill is triggered — by the user, by Claude, or both
Slash command
/guild:guild-validate agents=<comma-separated names> packet=<text> [precedence=<comma-separated names>]agents=<comma-separated names> packet=<text> [precedence=<comma-separated names>]This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Antagonist panel coordinator. Spawns one or more evaluator agents in
Antagonist panel coordinator. Spawns one or more evaluator agents in parallel against the same evaluation packet, parses each verdict, and aggregates findings into a single panel verdict that the calling loop can act on.
This skill is the second primitive in the guild-* substrate family.
It composes guild-spawn rather than calling the Agent tool
directly — the layering is the point. A loop using guild-validate
gets parallel agent invocation for free, structured aggregation, and
forward-compatible verdict shape, without re-implementing any of it.
guild-validate does not know about specific evaluator rubrics. It
expects each spawned agent to return a parseable verdict (see the
contract in .claude/agents/evaluator-base.md) and treats every
agent's output uniformly.
agents — comma-separated subagent_type names of evaluators to
spawn, e.g. evaluator-contract-fit (single-evaluator panel) or
evaluator-contract-fit,evaluator-a11y,evaluator-tokens (multi-
evaluator panel). Order is preserved in the output's per-evaluator
fields.packet — the standard three-section evaluation packet (Contract /
Artifact / Original ask). This is the shared brief handed to every
evaluator. The packet shape is documented in evaluator-base.md.precedence (optional) — comma-separated ordered list of evaluator
names for non-conflicting overlap resolution. When two evaluators
flag the same issue with non-contradictory remedies, the higher-
precedence one's finding is reported in the consolidated list.
Defaults to input order when absent.The output shape is fixed even when v1 will populate some fields empty — downstream callers (loops, retry policies, PR builders) depend on the shape:
{
"verdict": "approved" | "flagged" | "flagged-conflict",
"blocking_findings": [
{
"evaluator": "<agent name>",
"code": "<flag code, e.g. criterion-unmet>",
"evidence": "<one-line evidence>",
"remedy": "<minimal concrete fix>"
},
...
],
"advisory_findings": [ ...same shape... ],
"cli_runs": [
{ "evaluator": "<agent name>", "command": "<cmd>", "passed": true | false }
],
"conflicts": [
{
"scope": "<file:line or shared scope>",
"evaluators": ["<a>", "<b>"],
"findings": [ ...the conflicting finding objects... ]
}
]
}
conflicts is only non-empty when verdict is flagged-conflict.
Validate inputs.
agents is empty, refuse with guild-validate-error: empty agents list.packet is missing or empty, refuse with guild-validate-error: missing packet.evaluator. The
substrate doesn't enforce this; if a non-evaluator is passed,
guild-spawn will spawn it and guild-validate will attempt to
parse a verdict from its output. A non-evaluator's output won't
parse, which surfaces as a parse-failure in the aggregated
output. Caller is responsible for choosing role-appropriate agents.Spawn evaluators in parallel. Invoke guild-spawn via the
Skill tool with agents=<input agents> and brief=<input packet>.
Do not pass per_agent_context — every evaluator sees the same packet
by design. (A future caller wanting domain-specific hints could spawn
evaluators with per-agent context; this skill does not.)
Parse and aggregate. Build a JSON array of {agent, output}
entries from guild-spawn's outputs and pipe it to
bin/guild parse-and-aggregate via stdin. The verb returns the
locked Result shape ({verdict, blocking_findings, advisory_findings, cli_runs, conflicts}) on stdout. Bash invocation
uses a quoted heredoc so JSON content passes through verbatim:
guild parse-and-aggregate <<'GUILD_INPUT'
[
{"agent": "evaluator-contract-fit", "output": "...full evaluator output..."}
]
GUILD_INPUT
The verb is the implementation; the rules below are the spec it honors:
VERDICT:. approved → no findings.
flagged → extract the Reasons section bullets as findings, plus
the optional Suggested remedies section (paired with reasons by
index). Missing or unparseable VERDICT line → one parse-failure
blocking finding.blocking by default. An
explicit BLOCKING: or ADVISORY: prefix on the reason line
overrides. (Today's evaluators emit unprefixed reasons; Phase 2
evaluators with antipattern catalogs will emit explicit prefixes.){evaluator, code, evidence, remedy}.
code defaults to criterion-unmet; if the evidence text starts
with <word>(-<word>)*: ... (kebab code prefix, optionally
backtick-wrapped, optionally with parenthetical context), the
prefix becomes code and the rest becomes evidence.## CLI runs section, so
cli_runs stays empty. Forward-compat infrastructure for Phase 2.conflicts non-empty → flagged-conflict;
else blocking_findings non-empty → flagged; else approved.
Advisory-only is still approved — advisories surface but do not
gate.Return the script's output (parsed back into a structured value) to the caller. This skill performs no further work.
A conflict is two evaluators emitting blocking findings on the same scope (file path, file:line, or named symbol) with incompatible remedies — that is, applying both remedies would leave the artifact broken.
In v1, conflict detection is a documented no-op:
[evaluator-contract-fit] as
used by ev-loop in unit 4), only one evaluator can fire, so
conflicts cannot exist by definition. Return conflicts: [].Until then, flagged-conflict is a verdict the API supports but v1
does not produce. Loops can write conflict-handling code that's ready
when it does.
| Findings present | Verdict |
|---|---|
| None of any kind | approved |
| Advisory only (no blocking, no conflicts) | approved (advisories preserved) |
| Blocking, no conflicts | flagged |
| Conflicts detected | flagged-conflict |
flagged, the caller
decides whether to re-spawn after the generator addresses the
remedies. This skill returns one verdict per call.flagged-conflict fires, the
conflicts list surfaces unaltered. The calling loop decides whether
to surface to the human, auto-pick by precedence, or whatever the
style layer prefers.guild-spawn
verbatim and is handed to every evaluator unchanged.guild-spawn. Do not call the Agent tool directly.
The layering is the point.agents list → guild-validate-error: empty agents list. Stop.packet → guild-validate-error: missing packet. Stop.guild-spawn itself errors (e.g. invalid agent name) → forward the
error verbatim to the caller. Do not partially aggregate.parse-failure blocking findings for those evaluators; aggregate
the rest normally; verdict is at minimum flagged.verdict: flagged with
every entry in blocking_findings carrying a parse-failure code.
The caller distinguishes "everyone said it's broken" from "the
panel itself can't read what was returned."npx claudepluginhub krambuhl/agents --plugin guildCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.