From antigravity-awesome-skills
Verifies AI agent completion claims by auditing git commits with the DOS kernel CLI. Confirms that commit messages match their actual diffs, replacing self-report with git-based evidence.
Slash command: /antigravity-awesome-skills:dos-verify-done-claims
When an AI agent says "done", "shipped", or "fixed", that is a claim, not a
fact — and a claim the agent checks by re-reading its own work is consistency,
not grounding. This skill replaces that self-report with a verdict from a
witness the agent did not author: it shells the DOS kernel (dos verify,
dos commit-audit) to confirm the claimed effect from git ancestry and the
commit's actual diff. DOS is deterministic — no API key, no LLM. The verdict is
git-only and offline as used here; the one exception is dos verify in a
workspace that wires a CI oracle, which --no-ci suppresses (see Security &
Safety Notes).
This skill adapts the DOS reference "witness-claim" pattern
(anthony-chaudhary/dos-kernel) into a host-agnostic screenplay.
fix: that only touched a README, or a "tests pass" that deleted the
assertions).
pip install dos-kernel  # provides the `dos` CLI; deterministic, no key
A commit subject is forgeable (whoever wrote the message authored it); the files
it touched are not (git did). dos commit-audit grades the subject against the
actual diff:
dos commit-audit --workspace . HEAD --json
commit-audit --json prints a JSON array of audited commits (one element
even for a single HEAD), so read verdict from the first element — e.g.
dos commit-audit --workspace . HEAD --json | jq -r '.[0].verdict'. (Without
--json the same verdict prints as a one-line text row: · OK …,
⚑ UNWITNESSED …, or · abstain ….) The verdicts are: OK (the diff backs the
claim's kind), CLAIM_UNWITNESSED (the subject's claim is not evidenced by the
diff — treat the "done" as unproven), or ABSTAIN. This judges the kind of
change, never correctness — run the tests for that.
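The three verdicts map onto three different follow-ups. A minimal shell sketch of that mapping, assuming the verdict string has already been extracted (for instance with the jq filter shown above). The function name `audit_action` and the action labels it prints are illustrative choices for this sketch, not DOS output:

```shell
# audit_action VERDICT
# Hypothetical helper: maps a commit-audit verdict to a next step.
# The labels printed here are this sketch's own, not part of DOS.
audit_action() {
  case "$1" in
    OK)                echo "run-tests" ;;     # diff backs the claim's kind; correctness is still untested
    CLAIM_UNWITNESSED) echo "reject" ;;        # subject's claim is not evidenced by the diff
    ABSTAIN)           echo "inconclusive" ;;  # no verdict; this sketch leaves the call to a human
    *)                 echo "unknown"; return 1 ;;
  esac
}
```

It could be called as `audit_action "$(dos commit-audit --workspace . HEAD --json | jq -r '.[0].verdict')"`, which requires both `dos` and `jq` on the PATH.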
If the agent claims a specific plan/phase landed, confirm it from git history rather than the transcript:
dos verify --workspace . PLAN PHASE --json --no-ci
--no-ci keeps the verdict git-only (see the Security note below). With --json
you get the shipped and source fields. (The default text form prints
SHIPPED PLAN PHASE (via grep) or NOT_SHIPPED PLAN PHASE (via none) — the same
verdict, and the process exit code is non-zero when not shipped.)
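Because the text form also reports the verdict through the exit code, a script can branch on it directly. A small sketch, assuming only that behaviour; `ship_gate` is a hypothetical wrapper that runs whatever command it is given:

```shell
# ship_gate CMD [ARGS...]
# Hypothetical wrapper: runs the given command, discards its output, and
# prints "shipped" on a zero exit status or "not-shipped" otherwise.
ship_gate() {
  if "$@" >/dev/null 2>&1; then
    echo "shipped"
  else
    echo "not-shipped"
    return 1
  fi
}
```

A call might look like `ship_gate dos verify --workspace . PLAN PHASE --no-ci`. Two caveats apply to this sketch: a missing `dos` binary also exits non-zero and would print "not-shipped", and a zero exit says nothing about which source backed the verdict, so the `--json` form is still needed before closing a claim.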
Grade shipped: true by the source, because git fallback grades itself by
forgeability — and forgeable evidence is exactly what this skill exists to
distrust:
- registry or grep-artifact: non-forgeable (a registry row, or an
  artefact/diff rung). This closes the claim.
- grep-subject (or bare grep): forgeable. A commit subject or body
  carried the phase token, which an agent can write without doing the work (even
  on an empty commit). Treat this as shipped-per-the-subject, not confirmed;
  corroborate it (run dos commit-audit on that commit, below) before you close.
- none: no positive evidence; accept as "not shipped", not as a tool failure.

Accept the agent's "done" only when Step 2/3 corroborate it. If
CLAIM_UNWITNESSED or shipped: false, the work is not done regardless of how
confidently the agent narrated it. Send it back.
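The grading rule above can be written down as a small decision function. This is a sketch under stated assumptions: `close_decision` and its three output labels are hypothetical names, and the caller is assumed to have already read the shipped and source values out of the `dos verify --json` output:

```shell
# close_decision SHIPPED SOURCE
# Hypothetical helper: grades a `dos verify --json` result by its source.
# SHIPPED is "true" or "false"; SOURCE is the value of the source field.
close_decision() {
  shipped=$1
  src=$2
  if [ "$shipped" != "true" ]; then
    echo "keep-open"                             # no positive evidence
    return 0
  fi
  case "$src" in
    registry|grep-artifact) echo "close" ;;       # non-forgeable evidence
    grep-subject|grep)      echo "corroborate" ;; # forgeable subject/body match
    *)                      echo "keep-open" ;;   # unrecognised source: stay conservative
  esac
}
```

With jq, the two arguments might be read with the filters `.shipped` and `.source`, assuming both are top-level keys of the verify JSON; the text above names the fields but does not spell out their nesting. A "corroborate" result means running dos commit-audit on the matching commit before closing.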
# The agent committed and said it's fixed. Check the diff backs the claim.
# commit-audit --json returns an array, so read the first element's verdict:
dos commit-audit --workspace . HEAD --json | jq -r '.[0].verdict'
# OK -> the change is of the claimed kind; now run the tests
# CLAIM_UNWITNESSED -> the commit doesn't do what it says; reject
dos verify --workspace . AUTH AUTH2 --json --no-ci
# shipped: true, source: registry|grep-artifact -> non-forgeable; safe to close
# shipped: true, source: grep-subject|grep -> forgeable subject/body match;
# shipped-per-the-subject only -> corroborate with commit-audit before closing
# shipped: false, source: none -> no evidence; keep the ticket open
- Run dos commit-audit HEAD immediately after every agent commit.
- Treat source: none / CLAIM_UNWITNESSED as "not done", not as a tool error.
- Close only on a non-forgeable source (registry, grep-artifact).
  Treat grep-subject / bare grep as forgeable (an agent can write the subject
  text) and corroborate before closing.
- dos verify reads git history; in a repo with no commits there is nothing to witness (it will honestly report source: none).
- dos CLI) are missing.
- pip install dos-kernel and the read-only
dos verbs (dos commit-audit, dos verify). These verbs never mutate
the repo or push. dos commit-audit only reads git history and the working
tree (no network). dos verify is also git-only unless the workspace has
wired a CI oracle ([verify] non_git_oracle in its dos.toml), in which case
it may shell a network check (e.g. gh api) for the verdict — pass --no-ci
(as the examples above do) to force the git-only path and guarantee no network.
- pip install dos-kernel installs from PyPI. The distribution name is
  dos-kernel (the bare dos on PyPI is an unrelated package; do not install
  it). Pin a version in locked environments.
- The --workspace . argument scopes every verdict to that repo.

Problem: dos verify returns source: none and it looks like a failure.
Solution: That is the honest "no evidence" verdict — it means the phase has
no ship commit, so the claim is unproven. Re-stamp the real commit or keep the
task open.
dos-kernel, not dos.
dos-witness-claim, dos-goal-gate)
in anthony-chaudhary/dos-kernel cover the multi-agent fan-out and
self-stopping-agent variants of this same witness discipline.