From Builder Skills
Audits, reviews, and monitors another agent's work from session IDs, transcripts, PRs, branches, or logs. Reconstructs the original request, inspects evidence, and reports gaps.
How this skill is triggered — by the user, by Claude, or both
Slash command
/builder-skills:agent-watchdogThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Watch another agent's work like a reviewer with a pager: wait for completion
Watch another agent's work like a reviewer with a pager: wait for completion when needed, reconstruct the request, verify the evidence, and close the gap between what was asked and what actually happened.
Infer the mode from the user's wording:
If authority is unclear, default to audit-only and say what you would fix.
Build a compact contract before judging the work:
Treat the user's request as the source of truth, not the other agent's summary.
Inspect evidence, not vibes:
Classify each issue as:
When the user authorized repair:
Lead with the outcome. Keep the report short enough to scan:
Status
- Done, blocked, stale, or still running.
Requested
- What the user asked the watched agent to do.
Observed
- What the watched agent changed, claimed, and verified.
Gaps
- Missing behavior, bugs, weak verification, or scope drift.
Fixes made
- Files changed and validation run. Omit this section for audit-only work.
Remaining risk
- Anything still unverified or waiting on CI/review/deploy/human input.
Name exact files, commands, PRs, or thread IDs when they matter.
npx claudepluginhub builderio/skillsAudits local AI coding-agent sessions with agenttrace for cost, tool failures, latency, anomalies, health scores, diffs, and CI gates. Use when a run was slow, expensive, or unreliable.
Runs a multi-agent code review assessing architecture, simplicity, maintainability, correctness, and more. Launches four parallel agents (reviewer, pre-mortem, adversarial, documentation) and triages findings.
Analyzes current Claude Code session for agent efficiency (tool precision, autonomy) and quality (CLAUDE.md compliance, code patterns), scoring dimensions and surfacing 2-3 actionable improvements.