From attack-challenge
Runs a two-round adversarial review of any output — plans, designs, specs, architecture decisions, feature proposals — attacking from six lenses and synthesizing the strongest version across all rounds for the user to approve. Use after finishing a significant, hard-to-reverse deliverable, or when the user says "attack my plan", "find weaknesses", or "tear this apart" before acting on it.
How this skill is triggered — by the user, by Claude, or both
Slash command
/attack-challenge:attack-challenge
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Two rounds of adversarial attack on any output, followed by a comparative synthesis that keeps the strongest elements from every version. The skill drives the attack and proposes the synthesis; the user gives final approval.
If the output doesn't exist yet, create it first, then start the review.
Before attacking, classify the target in a line or two. This focuses the attack and stops irrelevant lenses from firing:
High stakes + irreversible → Full. Low stakes + reversible → Light, or skip the review and just try it.
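The depth rule above can be sketched as a small lookup. This is only an illustration: the `Depth` enum and `review_depth` function are hypothetical names, and the handling of mixed cases (high stakes but reversible, or low stakes but irreversible) is an assumption, since the rule only names the two extremes.

```python
from enum import Enum


class Depth(Enum):
    FULL = "full"
    LIGHT = "light"


def review_depth(high_stakes: bool, reversible: bool) -> Depth:
    """Map the two classification axes to a review depth."""
    # Stated rule: high stakes + irreversible -> Full.
    if high_stakes and not reversible:
        return Depth.FULL
    # Stated rule: low stakes + reversible -> Light (or skip the review).
    if not high_stakes and reversible:
        return Depth.LIGHT
    # ASSUMPTION: mixed cases are not specified; this sketch errs toward Full.
    return Depth.FULL
```

Erring toward Full in the unspecified cases is a conservative choice for this sketch, not something the skill prescribes.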
Attack from all six lenses. If a lens genuinely doesn't apply, say why in one line — don't skip it silently.
Ground the attack against the project's real constraints, not generic best practice. Before firing the lenses, read CLAUDE.md, the relevant specs/ADRs, and any stated requirements. An attack that ignores the actual technical contract is noise — cite the constraint each finding tests against.
Tag every finding with severity (Critical, Major, or Minor) and confidence (High, Med, or Low).
Confidence keeps speculative objections from blocking progress: a Low-confidence Minor is a footnote; a High-confidence Critical is a gate. Don't let a paranoid hypothetical carry the same weight as a certain breakage.
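One way to picture how the two tags combine is the minimal sketch below. The `Finding` class and `weight` function are hypothetical. Only the two corner cases (High-confidence Critical and Low-confidence Minor) come from the text; treating everything in between as "address" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    description: str
    severity: str    # "Critical", "Major", or "Minor"
    confidence: str  # "High", "Med", or "Low"


def weight(finding: Finding) -> str:
    """Return how much a finding should hold up progress."""
    # Stated: a High-confidence Critical is a gate.
    if finding.severity == "Critical" and finding.confidence == "High":
        return "gate"
    # Stated: a Low-confidence Minor is a footnote.
    if finding.severity == "Minor" and finding.confidence == "Low":
        return "footnote"
    # ASSUMPTION: everything in between gets addressed without blocking.
    return "address"
```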
If Round 1 finds zero issues, say so explicitly and skip to the user checkpoint.
For each finding, do ONE of: fix it (note what changed), accept it (note why), or defer it (note the revisit trigger).
No finding may be silently dropped. For 5+ findings, track them in a table:
| Finding | Severity | Confidence | Action | Detail |
|---|---|---|---|---|
| [description] | Critical/Major/Minor | High/Med/Low | Fixed / Accepted / Deferred | [what changed, why accepted, or revisit trigger] |
For fewer findings, address them inline.
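The tracking rule can be sketched as a small renderer, assuming findings are held as plain dictionaries. The function name, dictionary keys, and inline format are hypothetical. The table columns and the five-finding threshold come from the text above.

```python
ACTIONS = {"Fixed", "Accepted", "Deferred"}
TABLE_THRESHOLD = 5  # "For 5+ findings, track them in a table"


def render_findings(findings: list[dict[str, str]]) -> str:
    """Render findings as a table (5+) or inline lines (fewer)."""
    # No finding may be silently dropped: each needs an explicit action.
    for f in findings:
        if f.get("action") not in ACTIONS:
            raise ValueError(f"Finding has no resolution: {f.get('finding')!r}")

    if len(findings) >= TABLE_THRESHOLD:
        lines = [
            "| Finding | Severity | Confidence | Action | Detail |",
            "|---|---|---|---|---|",
        ]
        lines += [
            f"| {f['finding']} | {f['severity']} | {f['confidence']} "
            f"| {f['action']} | {f['detail']} |"
            for f in findings
        ]
        return "\n".join(lines)

    # ASSUMPTION: the inline format is illustrative, not prescribed.
    return "\n".join(
        f"{f['finding']} ({f['severity']}, {f['confidence']}): "
        f"{f['action']}. {f['detail']}"
        for f in findings
    )
```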
Persist every deferral. Sessions don't carry memory between runs, so a deferred finding that lives only in this transcript is a finding you've dropped. Append each Deferred item — and any Accepted risk worth revisiting — to a durable file (docs/technical-debt.md; create it if absent) with its severity, the revisit trigger, and today's date.
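Persisting a deferral could look roughly like the sketch below. The `persist_deferral` function, the file heading, and the one-line entry layout are all assumptions. The skill only requires that severity, revisit trigger, and date land in `docs/technical-debt.md`, and that the file is created if absent.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def persist_deferral(
    description: str,
    severity: str,
    revisit_trigger: str,
    debt_file: Path = Path("docs/technical-debt.md"),
    today: Optional[date] = None,
) -> None:
    """Append one deferred finding to the durable debt file."""
    today = today or date.today()
    # Create docs/ and the file itself if they do not exist yet.
    debt_file.parent.mkdir(parents=True, exist_ok=True)
    is_new = not debt_file.exists()
    with debt_file.open("a", encoding="utf-8") as fh:
        if is_new:
            fh.write("# Technical debt\n\n")  # ASSUMPTION: heading text
        # ASSUMPTION: entry layout; the skill only lists the required fields.
        fh.write(
            f"- {today.isoformat()} | {severity} | {description} "
            f"| Revisit when: {revisit_trigger}\n"
        )
```

Opening in append mode means earlier entries are never rewritten, so repeated runs only ever add to the record.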
A model attacking its own work twice shares the same blind spots across both rounds. Open Round 2 in a different reviewer persona — and where the harness allows, route it to a different model — to break that correlation. Name the persona you're adopting (e.g. a skeptical SRE, a security reviewer, a cost-conscious staff engineer) and attack from there.
Attack the original AND revised versions side by side, rather than producing another list of fresh complaints.
For each significant aspect:
[Aspect]: Original said X. Revised said Y. Verdict: which is stronger and why — or neither, here's what's actually right.
For outputs without discrete aspects, compare holistically: what did the revision gain, what did it lose, what's the net?
Must also answer:
NEVER open Round 2 with praise. Attack first.
Pick the strongest version of each aspect across ALL versions — original, revised, and ideas surfaced during the attacks. This is NOT "tidy up the revised version." For any non-obvious choice, note why that version won.
Look hard for where the original beat the revision — a clean sweep is rarer than it feels. But if the revision genuinely dominated on every axis, say so plainly; don't manufacture a regression just to prove the original had merit.
Present the synthesized result with an explicit approve / reject prompt. After 2 rejections, stop iterating on details and discuss whether the fundamental approach needs to change.
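The checkpoint behaviour can be sketched as a bounded loop. The `run_checkpoint` function and its two callbacks are hypothetical stand-ins for however the harness presents a proposal and collects the user's answer. Only the two-rejection limit comes from the text.

```python
from typing import Callable

MAX_REJECTIONS = 2  # after 2 rejections, stop iterating on details


def run_checkpoint(
    propose: Callable[[int], str],
    ask_user: Callable[[str], bool],
) -> str:
    """Present proposals until approved or the rejection limit is hit."""
    rejections = 0
    while rejections < MAX_REJECTIONS:
        # propose() receives the rejection count so it can revise.
        proposal = propose(rejections)
        if ask_user(proposal):
            return "approved"
        rejections += 1
    # Two rejections: discuss whether the fundamental approach must change.
    return "rethink-approach"
```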
Ground every attack in CLAUDE.md and the real specs, not generic best practice.
If a deferral isn't in docs/technical-debt.md, it's dropped.
npx claudepluginhub hawksfold/skills --plugin attack-challenge