From agentic-council
Use when designing end-to-end ML workflows. Covers experiment tracking, feature engineering and storage, model training pipelines, serving and deployment, A/B testing, and drift monitoring. Do not use for data warehouse schema design (use schema-evaluation) or ETL pipeline architecture (use pipeline-design).
How this skill is triggered — by the user, by Claude, or both
Slash command
/agentic-council:alchemist-ml-workflowThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Design end-to-end ML workflows covering experiment tracking, feature engineering and storage, model training pipelines, model serving and deployment, A/B testing for models, and monitoring for data and model drift. Produces a workflow architecture, tool selection rationale, and operational runbook.
Design end-to-end ML workflows covering experiment tracking, feature engineering and storage, model training pipelines, model serving and deployment, A/B testing for models, and monitoring for data and model drift. Produces a workflow architecture, tool selection rationale, and operational runbook.
Reads ML code, configuration files, experiment logs, and infrastructure specs for analysis. Does not train models, execute experiments, or deploy to production.
No user-provided values are used in commands or file paths. All inputs are treated as read-only analysis targets.
Before any tooling decisions, formalize:
Document the problem statement, target variable, evaluation metric, and success threshold.
Map raw data to model-ready features:
Set up reproducible experiment management:
Build a reproducible, automated training workflow:
Plan how predictions reach users:
For real-time serving, specify: latency SLA (p50/p99), throughput (requests/second), scaling strategy (auto-scale triggers), and fallback behavior (what happens if the model is unavailable?).
Plan controlled rollout of model changes:
Plan ongoing model health monitoring:
Define retraining policy: scheduled (weekly/monthly), triggered (drift detected), or continuous (online learning).
Compaction resilience: If context was lost during a long session, re-read the Inputs section to reconstruct what system is being analyzed, check the Progress Checklist for completed steps, then resume from the earliest incomplete step.
# ML Workflow: [Project/Model Name]
## Problem Definition
| Aspect | Detail |
|--------|--------|
| Problem type | ... |
| Target variable | ... |
| Business metric | ... |
| Evaluation metric | ... |
| Baseline performance | ... |
| Success threshold | ... |
## Feature Engineering
| Feature | Source | Transformation | Type | Leakage Risk |
|---------|--------|---------------|------|-------------|
| ... | ... | ... | ... | Low/Med/High |
**Feature store:** [Yes/No — tool choice and rationale]
## Experiment Tracking
| Aspect | Choice | Rationale |
|--------|--------|-----------|
| Tool | ... | ... |
| What's tracked | ... | ... |
| Organization | ... | ... |
## Training Pipeline
[ASCII diagram showing data → features → train → evaluate → register]
| Stage | Tool/Method | Notes |
|-------|------------|-------|
| Data split | ... | ... |
| Training | ... | ... |
| Tuning | ... | ... |
| Validation | ... | ... |
| Registry | ... | ... |
## Model Serving
| Aspect | Detail |
|--------|--------|
| Serving mode | Batch / Real-time / Streaming / Edge |
| Latency SLA | ... |
| Throughput | ... |
| Scaling | ... |
| Fallback | ... |
## A/B Testing
| Aspect | Detail |
|--------|--------|
| Traffic split | ... |
| Primary metric | ... |
| Guardrail metrics | ... |
| Min duration | ... |
| Rollback criteria | ... |
## Monitoring and Drift
| Monitor | Tool | Threshold | Action |
|---------|------|-----------|--------|
| Data drift | ... | ... | ... |
| Model drift | ... | ... | ... |
| Concept drift | ... | ... | ... |
| Operational | ... | ... | ... |
**Retraining policy:** [Scheduled / Triggered / Continuous — details]
npx claudepluginhub dtsong/agentic-council --plugin agentic-councilProvides UI/UX resources: 50+ styles, color palettes, font pairings, guidelines, charts for web/mobile across React, Next.js, Vue, Svelte, Tailwind, React Native, Flutter. Aids planning, building, reviewing interfaces.
Fetches up-to-date documentation from Context7 for libraries and frameworks like React, Next.js, Prisma. Use for setup questions, API references, and code examples.