From dev-skills
Log ML experiments with hyperparameters, metrics, and plots; human interprets results and plans next experiments
Slash command: /dev-skills:experiment-logger
You are executing the experiment-logger skill.
$ARGUMENTS
Track ML experiments systematically. AI logs and visualizes; you interpret and decide next steps.
| Phase | Actor | Action |
|---|---|---|
| 1 | Human | Define experiment goals |
| 2 | Coder | Set up logging (params, metrics) |
| 3 | Coder | Run experiment |
| 4 | Coder | Generate plots (loss curves, comparisons) |
| 5 | Writer | Summarize results |
| 6 | Human | Interpret, decide next experiments |
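Phases 2 and 3 in the table above (set up logging, then run) could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the skill's actual implementation: `ExperimentLogger` and its methods are hypothetical names, the `metrics.json` schema (metric name mapped to a list of step/value records) is an assumption, and the file names are taken from the directory layout in this document.

```python
import json
from pathlib import Path


class ExperimentLogger:
    """Hypothetical helper: collects params and per-step metrics for one run."""

    def __init__(self, exp_dir):
        self.exp_dir = Path(exp_dir)
        self.params = {}
        # Assumed schema: metric name -> list of {"step": int, "value": float}
        self.metrics = {}

    def log_params(self, **params):
        """Record hyperparameters (flat scalar values only in this sketch)."""
        self.params.update(params)

    def log_metric(self, name, value, step):
        """Append one measurement for the named metric."""
        self.metrics.setdefault(name, []).append(
            {"step": step, "value": float(value)}
        )

    def save(self):
        """Write config.yaml and metrics.json, and create an empty plots/ dir."""
        self.exp_dir.mkdir(parents=True, exist_ok=True)
        (self.exp_dir / "plots").mkdir(exist_ok=True)
        # JSON scalars are valid YAML scalars, so json.dumps handles quoting.
        lines = [f"{key}: {json.dumps(value)}" for key, value in self.params.items()]
        (self.exp_dir / "config.yaml").write_text("\n".join(lines) + "\n")
        (self.exp_dir / "metrics.json").write_text(json.dumps(self.metrics, indent=2))
        return self.exp_dir
```

A training loop would call `log_params(learning_rate=0.001, batch_size=32)` once up front, `log_metric("loss", loss, step=epoch)` each epoch, and `save()` at the end. Nested configs would need a real YAML library, which this sketch deliberately avoids.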
experiments/
├── exp_001_baseline/
│   ├── config.yaml
│   ├── metrics.json
│   ├── plots/
│   │   ├── loss.png
│   │   └── accuracy.png
│   └── summary.md
├── exp_002_lr_sweep/
│   └── ...
└── comparison.md
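The top-level `comparison.md` in the layout above could be generated by scanning each experiment's `metrics.json`. The sketch below is one possible approach under stated assumptions: the function names are hypothetical, each `metrics.json` is assumed to map a metric name to a list of `{"step": ..., "value": ...}` records (the source does not specify a schema), and the last record is treated as the final value.

```python
import json
from pathlib import Path


def final_metrics(metrics_path):
    """Return {metric name: last logged value} for one metrics.json file."""
    data = json.loads(Path(metrics_path).read_text())
    return {name: points[-1]["value"] for name, points in data.items() if points}


def build_comparison(root="experiments"):
    """Write <root>/comparison.md as a Markdown table and return its text."""
    root = Path(root)
    rows = {}
    for metrics_path in sorted(root.glob("exp_*/metrics.json")):
        rows[metrics_path.parent.name] = final_metrics(metrics_path)

    # Union of all metric names, so experiments may log different metrics.
    names = sorted({metric for values in rows.values() for metric in values})
    lines = [
        "| " + " | ".join(["Experiment", *names]) + " |",
        "|" + "---|" * (len(names) + 1),
    ]
    for exp_name, values in rows.items():
        cells = [f"{values[m]:.4g}" if m in values else "n/a" for m in names]
        lines.append("| " + " | ".join([exp_name, *cells]) + " |")

    text = "\n".join(lines) + "\n"
    (root / "comparison.md").write_text(text)
    return text
```

Experiments are ordered by directory name, so the numeric prefix (`exp_001`, `exp_002`) doubles as the sort key. A metric missing from a run shows up as `n/a` rather than being silently dropped.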
# Experiment: [name]
## Config
- learning_rate: 0.001
- batch_size: 32
- epochs: 100
## Results
| Metric | Value | vs Baseline |
|--------|-------|-------------|
| Loss | 0.23 | -15% |
| Acc | 94.2% | +2.1% |
## Observations
- Converged faster than baseline
- Some overfitting after epoch 80
## Plots
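The summary template above could be filled in programmatically along these lines. This is a hedged sketch: `render_summary` and `relative_change` are hypothetical names, and the "vs Baseline" column is assumed to be a relative percent change, which the source does not confirm (the sample values could equally be percentage-point differences). Observations are passed in as plain strings, since interpretation stays with the human.

```python
def relative_change(value, baseline):
    """Percent change of value relative to a non-zero baseline."""
    return (value - baseline) / baseline * 100.0


def render_summary(name, config, results, baseline, observations):
    """Return summary.md text following the template's section order."""
    lines = [f"# Experiment: {name}", "## Config"]
    lines += [f"- {key}: {value}" for key, value in config.items()]
    lines += [
        "## Results",
        "| Metric | Value | vs Baseline |",
        "|--------|-------|-------------|",
    ]
    for metric, value in results.items():
        # Assumption: relative change; metrics without a usable baseline get n/a.
        if baseline.get(metric):
            delta = f"{relative_change(value, baseline[metric]):+.1f}%"
        else:
            delta = "n/a"
        lines.append(f"| {metric} | {value} | {delta} |")
    lines.append("## Observations")
    lines += [f"- {note}" for note in observations]
    lines.append("## Plots")
    return "\n".join(lines) + "\n"
```

The returned text can be written to `summary.md` inside the experiment directory. Plot references are left for the caller to append under the final heading.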

npx claudepluginhub hmyuuu/skills --plugin dev-skills