From mlforge
This skill should be used after a model deployment — when the user says "canary", "watch the rollout", "monitor the new model", "is the deploy healthy", "post-deploy check", or a release produced a CANARY_PLAN.md that now needs executing. Audits production behavior against the model card's stated expectations.
How this skill is triggered — by the user, by Claude, or both
Slash command
/mlforge:ml-canaryThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Post-deploy watch. The model card made promises; the canary checks them. This is the boomerang — plan vs reality, made explicit and written down.
Post-deploy watch. The model card made promises; the canary checks them. This is the boomerang — plan vs reality, made explicit and written down.
ml/releases/<version>/MODEL_CARD.md ("Expected production behavior" section) and CANARY_PLAN.md. No model card → run against best-known expectations, and note that the release skipped ml-ship (a process leak for the next retro).
ml-production-debug step 3 (don't wait for the online metric to confirm what the score histogram already shows).Each ramp stage (shadow → N% → 100%) advances only on its CANARY_PLAN numeric gates. A failed gate triggers the rollback plan — which the card already wrote, with numeric triggers and an owner. Execute it; don't renegotiate it mid-incident.
Append to ml/releases/<version>/CANARY_REPORT.md per checkpoint:
## Checkpoint [stage, date]
| Check | Card said | Production says | Verdict |
[score dist | nulls | latency | flips | online metric]
**Decision**: advance / hold / rollback — [numeric reason]
On completion (100% + horizon passed): final verdict in the report, canary_passed (or rolled_back) gate appended to ml/gates.json, outcome logged in experiments/journal.md, and any card-vs-reality miss handed to ml-retro's boomerang audit — systematically wrong cards are a calibration problem, not bad luck.
npx claudepluginhub mbburabak/mlforge --plugin mlforgeProvides UI/UX resources: 50+ styles, color palettes, font pairings, guidelines, charts for web/mobile across React, Next.js, Vue, Svelte, Tailwind, React Native, Flutter. Aids planning, building, reviewing interfaces.
Fetches up-to-date documentation from Context7 for libraries and frameworks like React, Next.js, Prisma. Use for setup questions, API references, and code examples.