Skill

audit-setup

Audit a coding agent's configuration for operating discipline against the Agent Discipline Standard. Use when asked to "audit my setup", "score my CLAUDE.md", "check my agent config", "how disciplined is my config", "valuta la mia config", "audit agent discipline", or to assess whether a CLAUDE.md / AGENTS.md / .cursorrules and its harness enforce good engineering practices. Produces a 0-100 score, per-criterion evidence, and concrete diffs.

Slash command

/agent-discipline:audit-setup

SKILL.md

82 lines · ~1.3k tokens

Stats

Stars: 1
Maintenance: Good
Last commit: Jun 11, 2026

Audit Setup — Agent Discipline Standard

You are auditing the operating discipline of a coding agent's configuration: whether it actually forces disciplined engineering, and whether the critical rules are mechanically enforced rather than just written down. You do not check syntax (stale model IDs, broken paths) — that is the job of agnix/cclint; say so if the user expects it.

Step 0 — Load the standard (do not work from memory)

Read STANDARD.md — the single source of truth for all criteria, weights, and scoring rules. Look for it, in order:

1. The plugin/repository root (../../STANDARD.md relative to this skill).
2. Alongside this skill file (./STANDARD.md), which is the standalone-install case.
3. If absent locally, fetch the canonical version from the repo's raw URL, pinned to the installed version.

Apply the criteria as written in STANDARD.md, not from this file. This skill is the procedure; the standard is the rubric. If the two ever disagree, the standard wins.
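The local part of this lookup order can be sketched as follows. This is a minimal illustration, not the skill's own implementation: `find_standard` and its `skill_dir` parameter are hypothetical names, and the remote fetch is left to the caller because the repo's raw URL is not given here.

```python
from pathlib import Path
from typing import Optional


def find_standard(skill_dir: Path) -> Optional[Path]:
    """Return the first local STANDARD.md in the documented order, else None."""
    candidates = [
        skill_dir / ".." / ".." / "STANDARD.md",  # 1. plugin/repository root
        skill_dir / "STANDARD.md",                # 2. standalone install
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate.resolve()
    # 3. None signals the caller to fetch the canonical, version-pinned copy.
    return None
```

Returning `None` rather than a default rubric keeps the "do not work from memory" rule visible: the caller has to fetch the standard or stop.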

Step 1 — Locate the targets

Two things get audited:

- Agent config (prose): CLAUDE.md (project and global), AGENTS.md, .cursorrules, system prompt, or equivalent. Read all that exist and merge them: a rule satisfied in the global config counts for the project too.
- Harness config (enforcement): settings.json, settings.local.json, plugin hooks.json, managed settings, or wherever hooks and permissions.deny live. This is required for the EN-* (Enforcement) criteria. If you cannot access it, score EN-* as 0 and say explicitly that enforcement could not be verified (do not guess).

State which files you found before scoring. If a config is missing entirely, that is itself the finding.
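A small sketch of the "state which files you found" step, assuming the caller supplies the candidate paths to probe. The function names and the grouping labels are hypothetical, and the directories to search depend on the agent and harness in use.

```python
from pathlib import Path
from typing import Dict, List


def locate_targets(candidates: Dict[str, List[Path]]) -> Dict[str, List[Path]]:
    """Keep only the candidate files that exist, grouped by audit target."""
    return {
        kind: [path for path in paths if path.is_file()]
        for kind, paths in candidates.items()
    }


def describe_targets(found: Dict[str, List[Path]]) -> List[str]:
    """Produce one line per target, flagging a missing config as a finding."""
    lines = []
    for kind, paths in found.items():
        if paths:
            lines.append(f"{kind}: " + ", ".join(str(path) for path in paths))
        else:
            lines.append(f"{kind}: not found (this is itself a finding)")
    return lines
```

A caller might pass something like `{"agent config": [project / "CLAUDE.md", project / "AGENTS.md"], "harness config": [project / "settings.json"]}` and print the returned lines before any scoring starts.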

Step 2 — Score each criterion

Follow the two scoring kinds defined in STANDARD.md:

- AD-* (behavioral): score = weight × (0.5 × presence + 0.5 × quality).
  - Presence is 0/1 and near-deterministic: the rule must be explicitly written. Quote the exact line(s) and location as evidence. Implied or inferred intent = 0. Do not be generous: "I assume they'd verify" is a 0.
  - Quality is judged only if presence = 1, against that criterion's "What good looks like" / "Red flags". A slogan with no protocol scores low.
- EN-* (enforcement): walk the binary checklist against the harness config. Each covered item earns its fixed share. No model judgment: either the hook/deny entry exists or it doesn't. Cite the exact entry.
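The AD-* formula can be written out as a short function. This is a sketch under one assumption not stated above: quality is expressed on a 0 to 1 scale, so that a present, high-quality rule earns the full weight. `ad_score` is a hypothetical name.

```python
def ad_score(weight: float, presence: int, quality: float) -> float:
    """score = weight * (0.5 * presence + 0.5 * quality)."""
    if presence not in (0, 1):
        raise ValueError("presence must be 0 or 1")
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality is assumed to lie in [0, 1]")
    if presence == 0:
        # Quality is only judged when the rule is explicitly written.
        quality = 0.0
    return weight * (0.5 * presence + 0.5 * quality)
```

For a criterion of weight 7, a rule that is present but only a slogan (say quality 0.5) scores 5.25, while an implied rule scores 0 regardless of how good the intent looks.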

Stability discipline: the AD-* presence half and all of EN-* are deterministic, so they must not vary between runs. Only the AD-* quality half carries judgment, and it is bounded per criterion. If you are about to award a score that differs sharply from what a plain reading justifies, re-read the criterion. The headline number should be reproducible within a few points.
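The deterministic EN-* half lends itself to a plain membership test, sketched below. Several things here are assumptions rather than part of the standard: the checklist item names, matching deny entries by exact string, and splitting the weight into equal shares. The real checklist and shares come from STANDARD.md.

```python
import json
from typing import Dict, List


def en_checklist(settings_json: str, required: Dict[str, List[str]]) -> Dict[str, bool]:
    """Mark an item covered only if one of its accepted deny entries is present verbatim."""
    settings = json.loads(settings_json)
    deny = set(settings.get("permissions", {}).get("deny", []))
    return {
        item: any(entry in deny for entry in entries)
        for item, entries in required.items()
    }


def en_score(weight: float, covered: Dict[str, bool]) -> float:
    """Assumes equal shares: each covered item earns weight / number of items."""
    if not covered:
        raise ValueError("checklist is empty")
    return weight * sum(covered.values()) / len(covered)
```

Because the result depends only on the settings file and the checklist, two runs over the same config give the same EN-* score, which is the property the stability rule asks for.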

Step 3 — Compute and band

Sum the weighted criterion scores to a 0–100 total. Assign the band from STANDARD.md (🔴 Risky / 🟠 Workable / 🟡 Solid / 🟢 Battle-tested).
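Summing and banding can be sketched as below. The band cutoffs in `PLACEHOLDER_BANDS` are invented for illustration only and are not taken from STANDARD.md; the real thresholds must be read from the standard. The function names are hypothetical as well.

```python
from typing import Iterable, List, Tuple


def total_score(criterion_scores: Iterable[float]) -> float:
    """Sum the per-criterion scores; the weights are expected to add up to 100."""
    total = sum(criterion_scores)
    if not 0 <= total <= 100:
        raise ValueError("total outside 0-100: check the weights")
    return total


def band_for(total: float, bands: List[Tuple[float, str]]) -> str:
    """bands holds (minimum score, label) pairs; pick the highest minimum reached."""
    for minimum, label in sorted(bands, reverse=True):
        if total >= minimum:
            return label
    raise ValueError("no band covers this score")


# PLACEHOLDER cutoffs, NOT from STANDARD.md: replace with the standard's values.
PLACEHOLDER_BANDS = [(0, "Risky"), (50, "Workable"), (70, "Solid"), (90, "Battle-tested")]
```

Passing the bands in as data, rather than hard-coding them, keeps the procedure aligned with the rule that the standard wins when it and the skill disagree.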

Step 4 — Report

Output in this shape:

# Agent Discipline Audit

**Score: <N>/100 — <band emoji> <band>**
Audited: <files found>

## Scores by category
| Criterion | Score | Evidence |
|---|---|---|
| AD-1 Anti-loop | 9/9 | "after 2 identical failures, stop" — global CLAUDE.md:14 |
| ...        | ... | ... |
| EN-1 Destructive guardrail | 3/5 | deny rm -rf, push --force; missing cloud-delete, branch -D |

## Top gaps — concrete diffs
For each low criterion, the smallest change that raises it:

### AD-13 Scope discipline — 0/7 → ~6/7
Add to CLAUDE.md:
> Stay within the requested scope. No unrequested features, no opportunistic
> refactors outside the task. Propose before expanding scope; prefer the
> smallest diff that solves it.

### EN-1 Destructive guardrail — 3/5 → 5/5
Add to settings.json `permissions.deny`:
> "Bash(git branch -D *)", "Bash(gcloud * delete*)", "Bash(firebase * delete*)"

Rules for the report:

- Every score cites evidence. A presence/enforcement point with no quotable line is a bug; recheck it.
- Diffs must be applicable, not advice. Give the text to paste, in the right file.
- Lead with the gaps that move the score most (highest weight × largest shortfall), not the easiest ones.
- Note the syntactic prerequisites: remind the user to also run agnix for the deterministic checks this standard deliberately skips.
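One way to order the gaps is sketched below. It assumes "shortfall" means the fraction of a criterion's weight that was not earned, in which case weight × shortfall reduces to the points lost (weight minus score). `Result` and `rank_gaps` are hypothetical names; the sample figures in the usage note are the ones from the report template above.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    criterion: str
    score: float
    weight: float


def rank_gaps(results: List[Result]) -> List[Result]:
    """Return criteria with a shortfall, largest recoverable points first."""
    gaps = [r for r in results if r.score < r.weight]
    # weight * (1 - score / weight) simplifies to weight - score.
    return sorted(gaps, key=lambda r: r.weight - r.score, reverse=True)
```

With AD-1 at 9/9, EN-1 at 3/5 and AD-13 at 0/7, this puts AD-13 first (7 points recoverable), then EN-1 (2 points), and drops AD-1 because it has no gap.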

Optional — apply the fixes

If the user asks you to apply the diffs (not just report), edit the config files directly, then re-run the audit to confirm the new score. Treat config edits as you would any change: small, reversible, and confirmed before destructive overwrites.
