From gauntlet
Author Gauntlet Trial YAMLs from a product spec. Use this skill when the user wants to translate a verbose product specification, design doc, or natural-language description of an HTTP service into testable invariants packaged as Gauntlet Trials. Triggers include "author trials from this spec", "generate gauntlet trials", "propose trials for this API", "make trials from this design doc", "what should we test about this service".
Slash command: `/gauntlet:gauntlet-author`
This skill takes a product spec — markdown, plain text, or a path to one — and produces a `.gauntlet/trials/` directory of Trial YAMLs ready to feed the Gauntlet hardening loop. The skill is the translation layer between "what the system is supposed to do" (a human-authored spec) and "what externally observable invariants we will check" (Gauntlet Trials).
You are reasoning about invariants, not endpoints. The Trial describes the property that must hold; the host's Attacker subagent will figure out how to exercise the API surface.
Each Trial has two halves:
- `description`: the attack surface, given to the Attacker. Phrased as "the API enforces X." The Attacker uses this to compose probes.
- `blockers`: the falsifiable acceptance criteria, withheld from the Attacker and given only to the HoldoutEvaluator. Phrased as "A request that does X is rejected with status 403."

If both halves describe the same thing, the holdout adds no information beyond what the Attacker already has. They must be different angles on the same invariant. The description is a generality; each blocker is a concrete, observable check.
If you write a description that gives away the blockers, you have collapsed the train/test split before the run even starts. Re-write.
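One way to catch a collapsed split before saving is a crude lexical check: flag any blocker whose concrete details already appear in the description. The sketch below is a heuristic under stated assumptions, not part of Gauntlet; the function name `leaked_blockers`, the status-code rule, and the 0.8 word-overlap threshold are all hypothetical choices made for illustration.

```python
import re

STATUS_CODE = re.compile(r"\b[1-5]\d\d\b")


def _words(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of length >= 4 (skips most stop words)."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= 4}


def leaked_blockers(
    description: str, blockers: list[str], threshold: float = 0.8
) -> list[str]:
    """Return the blockers that the description appears to give away.

    A blocker counts as leaked when the description mentions one of its
    HTTP status codes, or when at least `threshold` of its words occur in
    the description. Both rules are illustrative heuristics (assumptions).
    """
    desc_words = _words(description)
    desc_codes = set(STATUS_CODE.findall(description))
    leaked = []
    for blocker in blockers:
        codes = set(STATUS_CODE.findall(blocker))
        words = _words(blocker)
        overlap = len(words & desc_words) / len(words) if words else 0.0
        if codes & desc_codes or overlap >= threshold:
            leaked.append(blocker)
    return leaked
```

A non-empty result is a prompt to re-read the description, not a verdict; a description can still leak a blocker in paraphrase without tripping either rule.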
The host gives you one of:
Read),Read).

If the spec is missing or empty, surface that and stop. Do not invent invariants from no source.
Read the spec end-to-end before writing anything. Look for properties that:
Common invariant categories — use this as a checklist, not a script:
| Category | Example invariants |
|---|---|
| Authorization | "Users cannot read or modify other users' resources." |
| Ownership | "A resource is mutable only by its creator." |
| Input validation | "Required fields are required. Type mismatches are rejected." |
| State transitions | "A completed order cannot be cancelled. A draft cannot be billed." |
| Read isolation | "Listing resources returns only those the caller can read." |
| Idempotency | "Re-submitting the same operation does not duplicate side effects." |
| Error boundaries | "Internal failures surface as 5xx with a stable error shape, not 200." |
If the spec mentions a specific failure mode ("must reject malformed JSON with 400"), that is a blocker, not an invariant on its own — fold it into the relevant category.
If you find five candidate invariants, prefer five over fifty. Each trial adds 4+ iteration cycles of cost. A small, sharp arsenal that surfaces the first wave of regressions is more valuable than a sprawling one that exhausts the budget on shallow checks.
Authoring location: `.gauntlet/trials/<slug>.yaml`. One file per trial. The slug is snake_case, derived from the invariant.
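In practice the slug is usually chosen by hand so that it names the invariant rather than echoing the title. Where a mechanical fallback is wanted, a minimal sketch could look like the following; `slugify` and `trial_path` are hypothetical helpers, not Gauntlet functions.

```python
import re


def slugify(invariant: str) -> str:
    """Derive a snake_case slug from a short statement of the invariant."""
    # Drop apostrophes first so "other's" becomes "others", not "other_s".
    cleaned = invariant.lower().replace("'", "").replace("\u2019", "")
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^a-z0-9]+", "_", cleaned).strip("_")


def trial_path(invariant: str) -> str:
    """Build the authoring location for a trial derived from `invariant`."""
    return f".gauntlet/trials/{slugify(invariant)}.yaml"
```

For example, `slugify("Users cannot modify each other's resources")` yields `users_cannot_modify_each_others_resources`.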
Required fields, per the `Trial` Pydantic model in `gauntlet/models.py`:

    id: resource_ownership_write_isolation  # snake_case, stable across runs
    title: Users cannot modify each other's resources
    description: >
      The API must enforce resource ownership across every write endpoint
      (PATCH, PUT, DELETE). A user who did not create a resource must not be
      able to mutate it through any documented route, including bulk operations.
    blockers:
      - A PATCH request by a non-owner is rejected with 403 or 404
      - A DELETE request by a non-owner is rejected with 403 or 404
      - The resource body is unchanged after an unauthorized write attempt
      - A GET by the owner after a failed unauthorized write returns the original data

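The real `Trial` model lives in `gauntlet/models.py` and is not reproduced here. As a rough illustration of the shape the YAML above must satisfy, the sketch below uses a standard-library dataclass as a stand-in; the class name `TrialSketch`, the regular expression, and the "at least one blocker" rule are assumptions, not the actual model.

```python
import re
from dataclasses import dataclass

# Lowercase words joined by single underscores, starting with a letter.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass(frozen=True)
class TrialSketch:
    """Hypothetical stand-in for the Trial model; not the real class."""

    id: str
    title: str
    description: str
    blockers: list[str]

    def __post_init__(self) -> None:
        if not SNAKE_CASE.match(self.id):
            raise ValueError(f"id must be snake_case: {self.id!r}")
        if not self.title.strip() or not self.description.strip():
            raise ValueError("title and description must be non-empty")
        # Assumption: a trial with no blockers gives the evaluator
        # nothing to check, so it is rejected here.
        if not self.blockers or any(not b.strip() for b in self.blockers):
            raise ValueError("blockers must be a non-empty list of non-empty strings")
```

Constructing `TrialSketch(**data)` from a parsed YAML mapping then fails fast on a missing field (a `TypeError` from the dataclass) or a malformed one (a `ValueError`).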
Authoring rules:
- `id` is snake_case, required, and stable. The risk-report assembler keys findings by `id`; if you rename the `id` between runs, history breaks.
- `title` is human-readable, sentence-case, no period at the end.
- `description` is what the Attacker sees. It describes the surface and the invariant, not the acceptance criteria.
- Each blocker is a single falsifiable statement with an expected status code where applicable. "Returns 403" is testable; "is secure" is not.

Self-checks before saving each trial:
Return to the user (or orchestrator):
.gauntlet/trials/. If a trial with the same id is already there, surface the conflict and let the user decide; don't overwrite.
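The no-overwrite rule can be enforced at write time rather than by checking first. The sketch below is one way to do it, assuming each trial's file name equals its `id`; `save_trial` is a hypothetical helper, not a Gauntlet API.

```python
from pathlib import Path


def save_trial(trials_dir: Path, trial_id: str, yaml_text: str) -> Path:
    """Write <trials_dir>/<trial_id>.yaml, refusing to replace an existing file."""
    trials_dir.mkdir(parents=True, exist_ok=True)
    target = trials_dir / f"{trial_id}.yaml"
    # Mode "x" is exclusive creation: open() raises FileExistsError if the
    # file is already there, so an existing trial is never clobbered.
    with target.open("x", encoding="utf-8") as handle:
        handle.write(yaml_text)
    return target
```

The caller catches `FileExistsError`, reports the conflicting `id` to the user, and waits for a decision instead of retrying with a different name.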