From rune
Runs post-deploy health checks on web apps: HTTP status, response times, error detection, performance patterns, smoke test report. Auto-triggers after deploy/launch.
How this skill is triggered — by the user, by Claude, or both
Slash command
/rune:watchdogThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Post-deploy monitoring. Receives a deployed URL and list of expected endpoints, runs health checks, measures response times, detects errors, and returns a structured smoke test report.
Post-deploy monitoring. Receives a deployed URL and list of expected endpoints, runs health checks, measures response times, detects errors, and returns a structured smoke test report.
deploy (L2): post-deploy monitoring setuplaunch (L1): monitoring as part of launch pipelineincident (L2): current system state check during incident triageNone — pure L3 utility.
Accept input from calling skill:
base_url — deployed application URL (e.g. https://myapp.com)endpoints — list of paths to check (e.g. ["/", "/health", "/api/status"])If no endpoints provided, default to: ["/", "/health", "/ready"]
For each endpoint, run an HTTP status check using Bash:
curl -s -o /dev/null -w "%{http_code}" https://myapp.com/health
For each endpoint, measure latency using Bash:
curl -s -o /dev/null -w "%{time_total}" https://myapp.com/health
Thresholds:
2000ms → SLOW (flag as alert)
After collecting response times from Step 3, analyze for patterns that indicate root causes:
PERF_WARN — investigate N+1 query or missing DB indexPERF_WARN — connection pool saturation likelyPERF_REGRESSION — correlate with recent git diffIf no previous baseline exists, skip spike detection and note INFO: no baseline — first run.
Output performance signals into a perf_signals list (separate from alerts).
Scan responses for problems:
Collect all flagged issues into an alerts list.
Output the following report structure:
## Watchdog Report: [base_url]
### Smoke Test Results
- [endpoint] — [HTTP status] ([response_time]s) — [HEALTHY|REDIRECT|CLIENT_ERROR|SERVER_ERROR|UNREACHABLE]
### Alert Rules Applied
- Response time > 2s → alert
- Any 4xx on non-auth endpoint → alert
- Any 5xx → critical alert
- Unreachable → critical alert
### Alerts
- [CRITICAL|WARNING] [endpoint] — [reason]
### Performance Signals
- [PERF_WARN|PERF_REGRESSION|INFO] [endpoint] — [diagnosis]
### Summary
- Total endpoints checked: [n]
- Healthy: [n]
- Alerts: [n]
- Perf Signals: [n]
- Overall status: ALL_HEALTHY | DEGRADED | DOWN
If no alerts and no perf signals: output Overall status: ALL_HEALTHY.
## Watchdog Report: [base_url]
### Smoke Test Results
- / — 200 (0.231s) — HEALTHY
- /health — 200 (0.089s) — HEALTHY
- /api/status — 500 (1.203s) — SERVER_ERROR
### Alerts
- CRITICAL /api/status — HTTP 500
### Summary
- Total: 3 | Healthy: 2 | Alerts: 1
- Overall status: DEGRADED
perf skill — watchdog is a detector, not a diagnoserKnown failure modes for this skill. Check these before declaring done.
| Failure Mode | Severity | Mitigation |
|---|---|---|
| curl timeout treated as slow (not unreachable) | HIGH | Non-zero curl exit code = UNREACHABLE, not a response time measurement |
| PERF_REGRESSION reported without baseline | MEDIUM | Only flag regression if a previous run exists — otherwise INFO: first run |
| All endpoints flagged SLOW because test env is slow | MEDIUM | Note environment context — add ENV: non-production detected if URL contains dev/staging/localhost |
| Perf signal without actionable diagnosis | LOW | Every PERF_WARN must include a hypothesis (N+1, pool saturation, etc.) |
alerts list, all SLOW → alerts listperf_signals list (or INFO: first run)~500-1500 tokens input, ~300-800 tokens output. Sonnet for configuration quality.
npx claudepluginhub rune-kit/rune --plugin @rune/analyticsMonitors deployed URLs after releases: checks HTTP status, console errors, static assets, SSE streams, and performance regressions. Use after deploys, risky merges, or dependency upgrades.
Monitors deployed web app URLs for post-deploy regressions including HTTP status, console errors, network failures, performance metrics (LCP/CLS/INP), content changes, and API health with alerts and looping checks.
Verifies production site health post-deploy via canary monitoring: checks HTTP status, response time, and error patterns, compares against baseline to detect regressions, and guides rollback decisions.