Skill

lean-vacuity-redteam

Adversarially stress-test a Lean 4 formalization (or the lean-vacuity-check detector itself) for hidden vacuity. Pointed at a repo, it hunts for the ways the headline could be hollow: a smuggled hypothesis, an extra assumption the paper lacks, a conclusion quietly narrowed from the source, a trivial or aliased definition the statement rests on, or an assumed axiom. It then confirms each candidate against the facts brief and the paper. Use when the user asks to red-team / attack / find holes in a Lean proof or formalization, to stress-test the vacuity checker, or to argue how a "machine-verified" claim could be hollow.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/lean-vacuity-check:lean-vacuity-redteam

User invocable

Model invocable

Inline context

Default effort

SKILL.md

84 lines · ~1.1k tokens

Stats

Language: Python

Stars: 0

Maintenance: Excellent

Last Commit: Jun 4, 2026


Lean vacuity red-team

Think at maximum budget — this is adversarial, and your job is to BREAK the claim, not to confirm it. Assume the formalization is weaker than advertised and find the exact way. A clean build and a green check are the starting assumption, not evidence.

Orchestrate. Optimise for the most exhaustive answer, not the fastest — token cost is not the constraint. When the harness supports it (Claude Code's Workflow tool / subagents), attack the axes below in parallel and have an independent skeptic try to refute each candidate hole before you report it; only confirmed attacks count.

Field-general. This works for any area of mathematics. Bring general mathematical reasoning to whatever the repo is about; do not assume complexity theory.

Two modes

(A) Attack a target formalization — given a repo, show how its headline theorem fails to establish the advertised claim, or argue (with evidence) that it holds.
(B) Attack our own checker — craft a genuinely vacuous proof whose hollowness the facts brief would not make obvious, or a genuinely sound proof a reader of the brief might wrongly flag, to harden the skill. Confirm each by running make_brief.py on it.
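As one possible mode (B) test case, here is a minimal Lean 4 sketch of a proof that is vacuous because its hypotheses contradict each other. The theorem name and statement are hypothetical. Whether the facts brief surfaces this kind of hollowness is exactly what running make_brief.py on it would show.

```lean
-- Hypothetical test case; the name and statement are illustrative only.

/-- No `n` satisfies both hypotheses, so any conclusion follows from them.
    The proof uses no `sorry` and no custom axiom, and no definition is
    aliased, yet the theorem says nothing about `n * n`. -/
theorem crafted (n : Nat) (h₁ : n < 3) (h₂ : 5 < n) : n * n = 7 :=
  -- `Nat.lt_trans h₂ h₁ : 5 < 3`, which `decide` refutes.
  absurd (Nat.lt_trans h₂ h₁) (by decide)
```

The conclusion `n * n = 7` is false for every natural number, which makes the hollowness easy to confirm by hand once the hypotheses are read together.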

Recon: gather the facts first

python3 scripts/lean_axiom_probe.py <REPO> --auto --json-out results/axioms.json
python3 scripts/make_brief.py <REPO> --axioms results/axioms.json --out-dir audits/<repo>

Read audits/<repo>/BRIEF.md and the repo's README/paper (open the actual .tex/PDF). The brief gives you the kernel closure, the unfold chain of every headline identifier, and the paper-vs-Lean pairing. Attack from there.
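For a sense of what a kernel closure and an unfold chain look like, the following Lean 4 sketch uses the built-in `#print axioms` and `#print` commands on a toy headline. All names are hypothetical, and this is not a description of how the scripts above work internally.

```lean
-- Hypothetical names throughout; not taken from any audited repo.

/-- The advertised claim, hidden behind a definition. -/
def Headline : Prop := ∀ n : Nat, ∃ m, n < m

/-- A custom axiom that simply asserts the claim. -/
axiom crux : Headline

/-- Compiles cleanly, but all of its content comes from `crux`. -/
theorem headline : Headline := crux

-- Reports the axioms `headline` transitively depends on; `crux` should be listed.
#print axioms headline

-- Shows the body of `Headline`, the first step of an unfold chain.
#print Headline
```

A headline whose axiom list contains anything beyond Lean's standard axioms is a natural place to start reading.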

Attack axes (prioritise the subtle ones)

Work the open-ended ones first; they are where real formalizations fail and where no fixed checker can catch the flaw for you:

Weakening vs the paper (V6) — the highest-value attack. Diff the paper statement against the Lean. Is there a hypothesis in the Lean that the paper does not assume? Is the conclusion narrower — a special case, → where the paper proves ↔, ∀/∃ swapped, a restricted machine model, a missing polynomial/quantitative bound? Any added hypothesis or narrowed conclusion makes the Lean prove less than the paper. Quote both sides.
Assumed goal / smuggled crux (V5). Is the hard content moved into a premise — a conclusion-shaped hypothesis, an assumed Hard → Goal, a "bridge"/"oracle" assumption, or substantive hypotheses guarding a trivial conclusion? Is each premise a genuinely cited known result, or the theorem in disguise?
Triviality / aliasing (V3/V4). Follow the unfold chain in the brief. Does the conclusion reduce to True / x = x / ∈ Set.univ, or is a key term defined to make the statement hold (def NP := P, a class as Set.univ/Unit, an opaque term that cannot be refuted)?
Soundness escape hatches (V1/V2). Read the kernel closure: sorryAx, a custom axiom/constant carrying the crux, native_decide/ofReduceBool, unsafe, or set_option debug.skipKernelTC.
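The statement-level axes above (V6, V5, V3/V4) can be illustrated with a few toy Lean 4 declarations. This is a sketch under stated assumptions: every name is hypothetical, and the "paper" statement in the first example is invented for the illustration.

```lean
-- Every name here is hypothetical and chosen only for illustration.

namespace Weakened   -- V6
/-- Suppose the paper states `n = 0 ↔ n ≤ 0`.
    The Lean statement proves one direction only. -/
theorem only_forward (n : Nat) : n = 0 → n ≤ 0 :=
  fun h => Nat.le_of_eq h
end Weakened

namespace Smuggled   -- V5
/-- The conclusion is handed in as a premise called a "bridge". -/
theorem headline (Goal : Prop) (bridge : Goal) : Goal := bridge

/-- An assumed `Hard → Goal` does all the work; nothing about `Goal` is proved. -/
theorem headline' (Hard Goal : Prop) (oracle : Hard → Goal) (h : Hard) : Goal :=
  oracle h
end Smuggled

namespace Aliased    -- V3/V4
/-- Stand-in for a class of problems: a predicate on `Nat`. -/
def P : Nat → Prop := fun _ => True

/-- `NP` is `P` under another name. -/
def NP : Nat → Prop := P

/-- Holds by unfolding the alias; says nothing about the real classes. -/
theorem p_eq_np : P = NP := rfl
end Aliased
```

In each case the build is clean. The hole is visible only by comparing the statement with the source (V6), reading the hypotheses (V5), or unfolding the definitions (V3/V4).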

Confirm — don't cry wolf

Every claimed hole must be backed by a concrete fact: the exact line in the brief, the exact unfolded definition, or the exact paper-vs-Lean mismatch (quote both). A weak hypothesis that the paper also assumes is not an attack. If a candidate hole survives a careful read of the source, it counts; otherwise drop it.

Output

A ranked list of the strongest confirmed attacks, each with: the axis (V1–V6), the concrete evidence, the severity, and whether it alone makes the headline vacuous. If the formalization survives every axis, say so plainly and name the one fact that most supports the claim. For mode (B), report which crafted cases the facts brief exposed and which it missed, and what fact the brief should additionally surface.
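One way to assemble the ranked list is sketched below in Python. The `Attack` record, the three-level severity scale, and the ordering rule (attacks that alone make the headline vacuous first, then by severity) are assumptions made for illustration; the skill itself names the fields but does not prescribe a scale or a sort order.

```python
from dataclasses import dataclass

# Hypothetical severity scale; the skill asks for a severity but does not define one.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass(frozen=True)
class Attack:
    axis: str            # one of "V1" .. "V6"
    evidence: str        # quoted brief line, unfolded definition, or paper-vs-Lean mismatch
    severity: str        # key of SEVERITY_ORDER
    vacuous_alone: bool  # does this attack by itself make the headline vacuous?
    confirmed: bool      # did it survive the skeptic's attempt to refute it?


def rank_attacks(attacks):
    """Drop unconfirmed candidates, then rank the rest.

    Attacks that alone make the headline vacuous come first;
    ties are broken by severity (high before medium before low).
    """
    confirmed = [a for a in attacks if a.confirmed]
    return sorted(
        confirmed,
        key=lambda a: (not a.vacuous_alone, SEVERITY_ORDER[a.severity]),
    )


if __name__ == "__main__":
    candidates = [
        Attack("V3", "def NP := P", "medium", False, True),
        Attack("V5", "hypothesis restates the goal", "high", True, True),
        Attack("V6", "unverified hunch about a missing bound", "high", True, False),
    ]
    for attack in rank_attacks(candidates):
        print(attack.axis, attack.severity, attack.evidence)
```

Filtering on `confirmed` before sorting mirrors the rule that only confirmed attacks count.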
