From vibeos
Phase boundary audit that runs all gates and all audit agents on the entire codebase, establishes quality baselines, and enforces ratcheting (finding count cannot increase between phases). Use when the user says "phase is done", "wrap up this phase", "milestone check", or when all work orders in a phase are complete.
How this skill is triggered — by the user, by Claude, or both
Slash command
/vibeos:checkpointThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Run all quality gates and all audit agents on the entire codebase at phase boundaries. Establish baselines, enforce quality ratcheting, and check for product/standards drift.
Run all quality gates and all audit agents on the entire codebase at phase boundaries. Establish baselines, enforce quality ratcheting, and check for product/standards drift.
Follow the full USER-COMMUNICATION-CONTRACT.md (docs/USER-COMMUNICATION-CONTRACT.md). Key rules:
Skill-specific addenda:
Before starting, verify these exist:
project-definition.jsondocs/planning/DEVELOPMENT-PLAN.mdIf no source code exists, report "No source code to checkpoint" and stop.
If $ARGUMENTS specifies a phase number, use that. Otherwise:
docs/planning/DEVELOPMENT-PLAN.mdRun the full gate suite on the entire codebase:
bash ".vibeos/scripts/gate-runner.sh" pre_commit --project-dir "${CLAUDE_PROJECT_DIR:-.}"
Collect results: pass/fail per gate, total pass count.
Dispatch all 8 audit agents following the same protocol as skills/audit/SKILL.md:
Scale-aware dispatch: For codebases over 15K lines, use module-targeted dispatch:
src/gateway/, src/orchestrator/, frontend/src/)Check for previous phase baseline:
.vibeos/baselines/phase-[N-1]-baseline.json
If no previous baseline exists (first checkpoint), skip ratchet comparison.
Baseline schema:
{
"phase": N,
"date": "ISO-8601",
"gates": {
"total": N,
"passed": N,
"failed": N,
"gate_results": [{"name": "gate-name", "status": "pass|fail"}]
},
"findings": {
"critical": N,
"high": N,
"medium": N,
"low": N,
"info": N,
"total": N,
"true_positives": N,
"warnings": N
},
"auditors": {
"security": {"status": "complete|failed", "findings": N},
"architecture": {"status": "complete|failed", "findings": N},
"correctness": {"status": "complete|failed", "findings": N},
"test": {"status": "complete|failed", "findings": N},
"evidence": {"status": "complete|failed", "findings": N},
"product-drift": {"status": "complete|failed", "findings": N}
}
}
Compare current results against previous baseline using dual ratchet (count-based + finding-level):
5a. Count-based ratchet (aggregate quality):
5b. Finding-level ratchet (precision tracking, if .vibeos/findings-registry.json exists):
bash ".vibeos/convergence/baseline-check.sh" check \
--mode finding-level \
--baseline-file ".vibeos/baselines/midstream-baseline.json" \
--current-findings-file ".vibeos/findings-registry.json"
This detects:
Both ratchets run in parallel: count-based ensures aggregate quality only improves, finding-level ensures specific findings are tracked and swaps are detected.
Ratchet result:
If ratchet fails, report which categories regressed with consequences:
"Quality regression detected — your codebase has more issues now than at the end of the previous phase:
- Critical findings: [prev] → [current] (+[delta])
- [category]: [prev] → [current] (+[delta])
- New findings detected: [finding IDs] (not in baseline, even though total count unchanged)
Your options:
- Fix the regressions — I'll identify the changes that introduced the new issues and correct them.
- Pros: keeps quality improving and preserves the ratchet
- Cons: adds more work before the phase can close
- Technical note: the current phase baseline stays unchanged until the regressions are cleared
- Review and accept — You inspect each new finding and explicitly choose which ones to keep.
- Pros: useful when a trade-off is intentional and understood
- Cons: those issues become the new normal and will stop being flagged as regressions
- Technical note: accepted findings update the baseline
- Roll back and rebuild — Revert the changes that introduced regressions and try a different implementation path.
- Pros: cleanest recovery path when the current approach is fundamentally flawed
- Cons: loses recent work and takes longer
- Technical note: this restores the codebase to the previous known-good state before rebuilding
I recommend option 1 because regressions are usually unintentional and fixing them now prevents compounding issues in later phases."
Save current results as the baseline for this phase:
mkdir -p .vibeos/baselines
Write to .vibeos/baselines/phase-[N]-baseline.json using the schema above.
Write the report to stdout and save to .vibeos/baselines/phase-[N]-report.md:
## Phase [N] Checkpoint Report
**Date:** [today]
**Phase:** [N] — [phase name]
**WOs completed:** [list]
### Gate Results
| Gate | Status |
|---|---|
| [gate-name] | PASS/FAIL |
**Total:** [passed]/[total] gates passing
### Audit Findings
| Severity | Count | Consensus | Warnings |
|---|---|---|---|
| Critical | [N] | [N] true positives | [N] warnings |
| High | [N] | [N] true positives | [N] warnings |
| Medium | [N] | [N] true positives | [N] warnings |
| Low | [N] | [N] true positives | [N] warnings |
### Baseline Comparison
| Metric | Previous (Phase [N-1]) | Current (Phase [N]) | Delta | Status |
|---|---|---|---|---|
| Gates passing | [N] | [N] | [+/-N] | PASS/FAIL |
| Critical findings | [N] | [N] | [+/-N] | PASS/FAIL |
| High findings | [N] | [N] | [+/-N] | PASS/FAIL |
| Total findings | [N] | [N] | [+/-N] | PASS/FAIL |
### Ratchet Status: [PASS/FAIL]
### Overall Assessment
[1-3 sentence plain English assessment]
**Recommendation:** [proceed to Phase N+1 / fix regressions first]
npx claudepluginhub chieflatif/vibeos-plugin --plugin vibeosValidates phase gates (LOM, ABM, IOC, PRM) by inventorying artifacts, dispatching multi-agent validators, aggregating results, and generating pass/fail reports for phase transitions.
Defines canonical quality check commands for typecheck, test, and lint with four variants (Baseline, Incremental, Full Gate, Per-File). Reference skill consumed by session-start, wave-executor, session-end, and session-reviewer.
Performs comprehensive quality audits verifying planning conformance, DDD validation, security checks, tests, browser verification, and metrics before deployment or PR merge.