From phd-skills
Designs ML experiments: ablation studies, baseline comparisons, experiment matrices; estimates GPU/API costs; generates config stubs, execution scripts, and analysis plans.
Slash command
/phd-skills:experiment-design
You are helping a researcher design rigorous experiments. Follow this methodology systematically.
Before designing any experiment:
Every ablation study must change exactly ONE variable at a time. For each factor:
Template for each ablation row:
| Run ID | Factor | Value | Fixed Config | Expected Outcome |
|--------|--------|-------|-------------|-----------------|
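
The one-variable-at-a-time rule can be sketched in Python as follows. This is a minimal illustration, not the skill's own implementation: `BASELINE`, `ABLATIONS`, `ablation_rows`, and `to_markdown` are hypothetical names, and the factor values are placeholders. Each generated row differs from the baseline in exactly one factor and is printed in the table format above, with the expected outcome left for the researcher to fill in.

```python
# Hypothetical baseline config and candidate values; replace with your own.
BASELINE = {"lr": 1e-3, "batch_size": 32, "augmentation": "none"}
ABLATIONS = {
    "lr": [1e-4, 1e-2],
    "augmentation": ["flip"],
}


def ablation_rows(baseline, ablations):
    """Return one row per run; each ablation changes exactly one factor."""
    rows = [{"run_id": "A0", "factor": "(baseline)", "value": "-",
             "config": dict(baseline)}]
    for factor, values in ablations.items():
        for value in values:
            if value == baseline[factor]:
                continue  # same as the baseline run, so skip it
            config = {**baseline, factor: value}
            rows.append({"run_id": f"A{len(rows)}", "factor": factor,
                         "value": value, "config": config})
    return rows


def to_markdown(rows):
    """Render rows in the ablation table format used above."""
    lines = [
        "| Run ID | Factor | Value | Fixed Config | Expected Outcome |",
        "|--------|--------|-------|-------------|-----------------|",
    ]
    for row in rows:
        # Everything except the varied factor is held fixed.
        fixed = ", ".join(f"{k}={v}" for k, v in row["config"].items()
                          if k != row["factor"])
        lines.append(f"| {row['run_id']} | {row['factor']} | {row['value']} "
                     f"| {fixed} | TODO |")
    return "\n".join(lines)


rows = ablation_rows(BASELINE, ABLATIONS)
print(to_markdown(rows))
```

Building every ablation config from the same baseline dictionary makes it hard to change a second variable by accident.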
For multi-factor studies, use a structured matrix:
Always calculate total runs before committing:
Total runs = product of the number of levels of each factor (full factorial)
GPU hours = total runs × hours_per_run
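As a rough sketch of the two formulas above, the snippet below counts the runs in a full factorial matrix and converts them to GPU hours. The factor names, their levels, and `HOURS_PER_RUN` are assumptions chosen for illustration; treating the seed as a factor is also a choice of this example, not something the skill prescribes.

```python
from itertools import product
from math import prod

# Hypothetical factors and levels; replace with the real experiment matrix.
FACTORS = {
    "model": ["small", "base"],
    "lr": [1e-4, 1e-3, 1e-2],
    "seed": [0, 1, 2],
}
HOURS_PER_RUN = 1.5  # assumed average wall-clock hours per run

# Total runs = product of the number of levels of each factor.
total_runs = prod(len(levels) for levels in FACTORS.values())
gpu_hours = total_runs * HOURS_PER_RUN

# Enumerate the matrix itself to confirm the count before committing.
matrix = [dict(zip(FACTORS, combo)) for combo in product(*FACTORS.values())]
assert len(matrix) == total_runs

print(f"total runs: {total_runs}, GPU hours: {gpu_hours}")
```

With these placeholder values the matrix has 2 × 3 × 3 = 18 runs, or 27 GPU hours at 1.5 hours per run. Adding one more three-level factor would triple both numbers, which is why the count comes first.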
For each experiment plan, estimate:
Flag if total cost exceeds reasonable bounds and suggest prioritization.
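A cost check along these lines might look like the sketch below. The function name `estimate_cost`, the hourly price, and the budget are all hypothetical; real prices depend on the provider and hardware. The function flags a plan that exceeds the budget and reports how many runs the budget would cover, as a starting point for prioritization.

```python
def estimate_cost(total_runs, hours_per_run, usd_per_gpu_hour, budget_usd):
    """Estimate GPU hours and cost, and flag plans that exceed the budget."""
    gpu_hours = total_runs * hours_per_run
    cost_usd = gpu_hours * usd_per_gpu_hour
    cost_per_run = hours_per_run * usd_per_gpu_hour
    # Number of whole runs the budget covers, capped at the planned total.
    affordable_runs = min(total_runs, int(budget_usd // cost_per_run))
    return {
        "gpu_hours": gpu_hours,
        "cost_usd": cost_usd,
        "over_budget": cost_usd > budget_usd,
        "affordable_runs": affordable_runs,
    }


# Placeholder numbers: 18 runs, 1.5 h each, $2.00 per GPU hour, $40 budget.
estimate = estimate_cost(total_runs=18, hours_per_run=1.5,
                         usd_per_gpu_hour=2.0, budget_usd=40.0)
if estimate["over_budget"]:
    print(f"Plan costs ${estimate['cost_usd']:.2f}; budget covers "
          f"{estimate['affordable_runs']} of 18 runs. Prioritize.")
```

When the plan is over budget, the affordable run count suggests how far to trim: for example, drop the least informative factor levels or reduce the number of seeds for exploratory runs.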
Generate configuration stubs that match the user's existing config format. Read existing configs first to match:
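One way to keep stubs consistent with an existing config is to load it and override only the keys under study. The sketch below assumes the project uses flat JSON configs; other formats (YAML, TOML, Python dataclasses) would need their own loader and writer. `make_stubs` and the file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def make_stubs(base_config_path, overrides_per_run, out_dir):
    """Write one JSON stub per run, reusing the base config's keys and order."""
    base = json.loads(Path(base_config_path).read_text())
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for run_id, overrides in overrides_per_run.items():
        # Reject keys the existing config does not define, to catch typos.
        unknown = set(overrides) - set(base)
        if unknown:
            raise KeyError(f"{run_id}: keys not in base config: {sorted(unknown)}")
        run_config = {**base, **overrides}  # preserves the base key order
        path = out_dir / f"{run_id}.json"
        path.write_text(json.dumps(run_config, indent=2) + "\n")
        written.append(path)
    return written


# Usage with a throwaway base config; a real project would point at its own file.
with tempfile.TemporaryDirectory() as tmp:
    base_path = Path(tmp) / "base.json"
    base_path.write_text(json.dumps({"lr": 0.001, "batch_size": 32}, indent=2))
    paths = make_stubs(base_path, {"A1": {"lr": 0.01}}, Path(tmp) / "stubs")
    stub = json.loads(paths[0].read_text())
    print(stub)
```

Rejecting unknown keys is a deliberate choice here: a misspelled override would otherwise produce a run that silently uses the baseline value.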
Create a concrete execution plan:
Before running, define how results will be analyzed:
Before finalizing the experiment plan:
Always produce:
npx claudepluginhub fcakyon/phd-skills --plugin phd-skills