Assesses ML pipeline stage and applies patterns for data pipelines, model training, serving, MLOps, evaluation, and debugging with validations like schema checks, drift detection, and skew guards.
Slash command: `/claude-code-superpowers:gradient`
**GRADIENT** — *In ML, the gradient is the directional signal that tells you exactly how to improve.*
When invoked, GRADIENT assesses the pipeline stage (data / training / serving / MLOps), loads the relevant pattern file, and applies ML-specific validation: schema checks, drift detection, training-serving skew guards, and latency budgets.
Core principle: ML systems have unique failure modes — data drift, training-serving skew, silent degradation. Test data and models, not just code.
Announce at start: "Running GRADIENT for ML-specific patterns."
STAGE ASSESSMENT:
"What stage are you at?"
A) Data collection / ingestion
B) Feature engineering / preprocessing
C) Model training / experimentation
D) Model evaluation / validation
E) Model serving / inference
F) Production monitoring / MLOps
G) Debugging an ML failure
Stage → Section mapping:
hunter first, then return with evidence
After identifying the stage, ask: "What's the primary constraint: accuracy, latency, cost, or reliability?"
Load patterns: patterns/data-pipeline.md
Key tests to implement:
Rule: Write pipeline tests before writing model code.
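A schema check is one pipeline test that can exist before any model code does. The following is a minimal sketch under stated assumptions: `EXPECTED_SCHEMA`, `validate_rows`, and the column names are hypothetical illustrations, not contents of patterns/data-pipeline.md.

```python
from typing import Any

# Hypothetical schema for illustration: column name -> (expected type, nullable?)
EXPECTED_SCHEMA: dict[str, tuple[type, bool]] = {
    "user_id": (int, False),
    "age": (int, True),
    "country": (str, False),
}


def validate_rows(rows: list[dict[str, Any]]) -> list[str]:
    """Return human-readable schema violations; an empty list means clean."""
    errors: list[str] = []
    for i, row in enumerate(rows):
        # Report columns that are absent or not part of the contract.
        for col in sorted(EXPECTED_SCHEMA.keys() - row.keys()):
            errors.append(f"row {i}: missing column '{col}'")
        for col in sorted(row.keys() - EXPECTED_SCHEMA.keys()):
            errors.append(f"row {i}: unexpected column '{col}'")
        # Check nullability and exact type for every column that is present.
        for col, (expected_type, nullable) in EXPECTED_SCHEMA.items():
            if col not in row:
                continue
            value = row[col]
            if value is None:
                if not nullable:
                    errors.append(f"row {i}: '{col}' must not be null")
            elif type(value) is not expected_type:
                errors.append(
                    f"row {i}: '{col}' expected {expected_type.__name__}, "
                    f"got {type(value).__name__}"
                )
    return errors
```

In a test suite this might be asserted as `validate_rows(sample_batch) == []` against a small fixture batch, so a schema change fails the build before training starts.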
Load patterns: patterns/model-training.md
Before training complex models:
Load patterns: patterns/model-serving.md
Tests required before deployment:
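One pre-deployment test worth sketching is a training-serving skew guard: feed the same raw records through the offline and online feature code and fail if the outputs diverge. The example below is an assumption-laden illustration; `training_features`, `serving_features`, and `find_skew` are hypothetical names, not part of patterns/model-serving.md.

```python
import math
from typing import Callable

Features = list[float]


def training_features(raw: dict) -> Features:
    """Hypothetical offline (batch) feature transform."""
    return [math.log1p(raw["clicks"]), raw["price"] / 100.0]


def serving_features(raw: dict) -> Features:
    """Hypothetical online transform; it should agree with the offline one."""
    return [math.log1p(raw["clicks"]), raw["price"] * 0.01]


def find_skew(
    samples: list[dict],
    offline: Callable[[dict], Features],
    online: Callable[[dict], Features],
    rel_tol: float = 1e-9,
) -> list[tuple[int, int]]:
    """Return (sample_index, feature_index) pairs where the two paths disagree.

    A feature_index of -1 marks a feature-vector length mismatch.
    """
    mismatches: list[tuple[int, int]] = []
    for i, raw in enumerate(samples):
        a, b = offline(raw), online(raw)
        if len(a) != len(b):
            mismatches.append((i, -1))
            continue
        for j, (x, y) in enumerate(zip(a, b)):
            if not math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-12):
                mismatches.append((i, j))
    return mismatches
```

A deployment gate could then assert that `find_skew(samples, training_features, serving_features)` is empty for a representative sample of logged requests.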
Load patterns: patterns/mlops.md
Required components:
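Drift detection, one of the validations this skill names, can be approximated with a Population Stability Index (PSI) between a reference sample and a current sample of one feature. This is a minimal sketch and not the method in patterns/mlops.md: the function name `psi`, the equal-width binning, and the `eps` floor are all assumptions made for illustration.

```python
import math


def psi(reference: list[float], current: list[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has zero range")
    width = (hi - lo) / bins

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift. That threshold is a convention rather than something this skill specifies, so it should be tuned per feature.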
| Task | Metrics |
|---|---|
| Classification | accuracy, precision, recall, F1, AUC-ROC, AUC-PR |
| Regression | MAE, MSE, RMSE, R², MAPE |
| Ranking | NDCG, MAP, MRR |
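To make the table concrete, here is a small dependency-free sketch that computes a few of the listed classification and regression metrics from raw predictions. The function names are illustrative assumptions. In practice a library such as scikit-learn would usually be used instead.

```python
import math


def binary_classification_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Accuracy, precision, recall and F1 for 0/1 labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Define undefined ratios (zero denominator) as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def regression_metrics(y_true: list[float], y_pred: list[float]) -> dict[str, float]:
    """MAE, MSE and RMSE for real-valued targets."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)
    return {
        "mae": sum(abs(e) for e in errors) / len(errors),
        "mse": mse,
        "rmse": math.sqrt(mse),
    }
```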
Cost/latency budgets (set before training, enforce in CI):
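Enforcing a latency budget in CI can be as simple as timing the prediction function and failing the build when the 95th percentile exceeds the agreed number. The sketch below is illustrative only: `p95_latency_ms`, `check_latency_budget`, and any budget value passed to them are hypothetical, since the source does not state concrete budgets.

```python
import statistics
import time
from typing import Any, Callable, Sequence


def p95_latency_ms(predict: Callable[[Any], Any], inputs: Sequence[Any],
                   warmup: int = 5) -> float:
    """Measure per-call latency and return the 95th percentile in milliseconds."""
    # Warm-up calls are executed but not timed (caches, lazy initialisation).
    for x in inputs[:warmup]:
        predict(x)
    timings: list[float] = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(timings, n=20)[18]


def check_latency_budget(predict: Callable[[Any], Any], inputs: Sequence[Any],
                         budget_ms: float) -> tuple[bool, float]:
    """Return (within_budget, observed_p95_ms)."""
    observed = p95_latency_ms(predict, inputs)
    return observed <= budget_ms, observed
```

A CI job might call `check_latency_budget(model.predict, fixture_inputs, budget_ms=...)` and exit non-zero when the first element is `False`. Measurements on shared CI runners are noisy, so some headroom in the budget is advisable.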
Never:
Always:
| Skill | Integration |
|---|---|
| forge | Write data tests before pipeline, model tests before training |
| hunter | Use for training failures, accuracy drops |
| sentinel | Verify metrics before claiming model works |
| chronicle | Store patterns from failed experiments |
npx claudepluginhub gadaalabs/claude-code-on-steroids