From claude-scholar
Runs rigorous statistical analysis for ML/AI experiments: validates artifacts, computes descriptive/inferential stats, generates scientific figures, and surfaces missing evidence.
Slash command
/claude-scholar:results-analysis
Run **strict, evidence-first experimental analysis** for ML/AI research.
- USAGE.md
- examples/example-analysis-report.md
- examples/example-figure-catalog.md
- examples/example-stats-appendix.md
- references/analysis-depth.md
- references/common-pitfalls.md
- references/figure-interpretation.md
- references/statistical-methods.md
- references/statistical-reporting.md
- references/visualization-best-practices.md
Use this skill to produce a strict analysis bundle:
- analysis-report.md
- stats-appendix.md
- figure-catalog.md
- figures/

When the user asks for review, audit, no-write, dry-run, or when inputs are incomplete, use read-only audit mode instead of producing files or figures. In that mode, output only valid/invalid statistics, blockers, claim candidates, and what evidence is missing. If invoked by /analyze-results, the command layer may write a blocker summary, but this skill should not create figures, reports, or polished conclusions from incomplete evidence.
Do not use this skill to draft a paper Results section or a full experiment wrap-up report. Those belong to ml-paper-writing or results-report.
- Results prose,
- pubfig / pubtab,

If the user wants the complete post-experiment summary report, hand off to results-report after this bundle is ready. If the user wants publication-grade figures/tables, export parameters, publication QA, or figure/table redesign, hand off to publication-chart-skill.
Start by identifying:
- csv, json, tsv, logs),

Validate:
If the comparison is not statistically valid, say so before continuing. Do not treat repeated subject × task rows, folds, windows, trials, or seeds as independent units unless the design justifies it.
Common blocker: a subject × task summary table is usually a repeated-measure summary, not an independent subject-level sample. If subjects have multiple task rows or missing task cells, state that before any significance or winner claim.
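The unit-of-analysis check above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: the `(subject, task, score)` row layout and the `audit_units` helper are assumptions introduced here. It reports how many rows exist versus how many independent subjects, which subjects contribute repeated rows, and which task cells are missing, so those facts can be stated before any significance or winner claim.

```python
from collections import defaultdict
from statistics import mean


def audit_units(rows, tasks):
    """Audit a subject x task table before any inferential test.

    rows  : iterable of (subject, task, score) tuples (hypothetical layout)
    tasks : the full list of tasks every subject is expected to have
    """
    rows = list(rows)
    by_subject = defaultdict(dict)
    for subject, task, score in rows:
        by_subject[subject][task] = score

    # Subjects with more than one task row are repeated measures,
    # so their rows must not be counted as independent samples.
    repeated = sorted(s for s, cells in by_subject.items() if len(cells) > 1)

    # Missing subject x task cells make the design unbalanced.
    missing = sorted(
        (s, t) for s, cells in by_subject.items() for t in tasks if t not in cells
    )

    # One value per subject, averaged over the tasks actually observed.
    # With missing cells this average is not comparable across subjects,
    # which is why missing cells are reported alongside it.
    subject_means = {s: mean(cells.values()) for s, cells in by_subject.items()}

    return {
        "n_rows": len(rows),
        "n_independent_units": len(by_subject),
        "repeated_subjects": repeated,
        "missing_cells": missing,
        "subject_means": subject_means,
    }
```

If `n_rows` exceeds `n_independent_units`, a test that treats every row as a sample overstates the evidence; either aggregate to one value per subject or use a method that models the repeated structure.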
Before running statistics, define the exact comparison questions:
Do not mix unrelated comparisons into one undifferentiated table.
Always produce:
- mean ± std when appropriate,
- 95% CI or another clearly justified interval,

Default expectation:
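As a rough sketch of the descriptive items listed under "Always produce", the snippet below computes mean, sample standard deviation, and a 95% interval for one condition. The `describe` helper, the percentile bootstrap, and the default of 2000 resamples are assumptions made for illustration; the skill does not prescribe this interval, and with very few units a bootstrap interval is unstable and should be reported as such.

```python
import random
from statistics import mean, stdev


def describe(values, n_boot=2000, seed=0):
    """Return n, mean, sample std, and a percentile-bootstrap 95% CI.

    values must hold one number per independent unit (e.g. one per seed
    or per subject), not one per repeated row.
    """
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least 2 independent units")

    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(values)

    # Resample the units with replacement and record each resample's mean.
    boot_means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))

    lo = boot_means[int(0.025 * n_boot)]
    hi = boot_means[int(0.975 * n_boot) - 1]
    return {"n": n, "mean": mean(values), "std": stdev(values), "ci95": (lo, hi)}


def format_summary(stats):
    """Render the summary in a 'mean ± std (n, 95% CI)' form."""
    lo, hi = stats["ci95"]
    return (
        f"{stats['mean']:.3f} ± {stats['std']:.3f} "
        f"(n={stats['n']}, 95% CI [{lo:.3f}, {hi:.3f}])"
    )
```

Reporting `n` next to the interval keeps the reader from mistaking a five-seed result for a large-sample estimate.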
See:
- references/statistical-methods.md
- references/statistical-reporting.md

Produce actual figures whenever artifacts are available.
Minimum expectation for a non-trivial analysis bundle:
Every main figure must define:
See:
- references/visualization-best-practices.md
- references/figure-interpretation.md

analysis-report.md

Summarize:
Each claim candidate should use this shape:
## Claim Candidates
- Claim:
- Source evidence:
- Allowed wording:
- Forbidden stronger wording:
- Uncertainty:
- Next check:
- Decision: keep | weaken | revise | discard
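One way to keep every claim candidate in this shape is to build it from a small record type and render the bullets from that. The `ClaimCandidate` class below is a hypothetical helper, not part of the skill; it only mirrors the field names in the template and rejects a decision outside the four listed values.

```python
from dataclasses import dataclass

ALLOWED_DECISIONS = ("keep", "weaken", "revise", "discard")


@dataclass
class ClaimCandidate:
    claim: str
    source_evidence: str
    allowed_wording: str
    forbidden_stronger_wording: str
    uncertainty: str
    next_check: str
    decision: str

    def __post_init__(self):
        # The template allows exactly four decisions; anything else is an error.
        if self.decision not in ALLOWED_DECISIONS:
            raise ValueError(
                f"decision must be one of {ALLOWED_DECISIONS}, got {self.decision!r}"
            )

    def to_markdown(self):
        """Render the candidate as the bullet list used in the template."""
        return "\n".join(
            [
                f"- Claim: {self.claim}",
                f"- Source evidence: {self.source_evidence}",
                f"- Allowed wording: {self.allowed_wording}",
                f"- Forbidden stronger wording: {self.forbidden_stronger_wording}",
                f"- Uncertainty: {self.uncertainty}",
                f"- Next check: {self.next_check}",
                f"- Decision: {self.decision}",
            ]
        )
```

Because every field is required, a candidate with no stated uncertainty or next check cannot be constructed silently.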
stats-appendix.md

Record:

figure-catalog.md

For each figure, record:
Do not finish until all are true:
no Results draft is included.

analysis-output/
├── analysis-report.md
├── stats-appendix.md
├── figure-catalog.md
└── figures/
    ├── figure-01-main-comparison.pdf
    ├── figure-02-ablation.pdf
    └── ...
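A read-only completeness check against this layout might look like the sketch below. The `missing_bundle_items` helper is hypothetical, and requiring at least one PDF under `figures/` is an assumption drawn from the file names in the tree, not a rule stated by the skill. The demo at the end builds a deliberately incomplete bundle in a temporary directory.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("analysis-report.md", "stats-appendix.md", "figure-catalog.md")


def missing_bundle_items(root):
    """List bundle items absent under root; never creates or edits files."""
    root = Path(root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]

    figures = root / "figures"
    if not figures.is_dir():
        missing.append("figures/")
    elif not any(figures.glob("*.pdf")):
        # Assumption: figures are exported as PDF, as in the tree above.
        missing.append("figures/*.pdf")
    return missing


# Demo: a bundle with only the report and an empty figures/ directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "analysis-output"
    (demo_root / "figures").mkdir(parents=True)
    (demo_root / "analysis-report.md").write_text("# report\n")
    demo_missing = missing_bundle_items(demo_root)
    # demo_missing == ["stats-appendix.md", "figure-catalog.md", "figures/*.pdf"]
```

An empty return value only says the files exist; it does not say their contents meet the completion checklist.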
For every major figure, answer all three questions:
If a figure cannot answer question 3, it is probably decorative rather than scientific.
Use this mode when:
Return:
Do not create analysis-output/, figures, or reports in this mode.
Quarantine any statistics file whose interpretation contradicts its own p-value, test method, unit of analysis, or comparison family. Do not reuse that file for claim wording until provenance is checked.
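A minimal sketch of such a self-consistency check follows. The record keys (`p_value`, `claims_significant`, `n_used`, `n_independent_units`) and the `quarantine_reasons` helper are assumptions made for this example; real statistics files will need their own parsing. It covers two of the contradictions named above: an interpretation that disagrees with the file's own p-value, and a test run on more samples than the design has independent units.

```python
def quarantine_reasons(record, alpha=0.05):
    """Return the reasons a statistics record should be quarantined.

    record is a dict with hypothetical keys:
      p_value             : reported p-value
      claims_significant  : True/False, what the file's interpretation says
      n_used              : sample count the test was actually run on
      n_independent_units : independent units the design supports
    An empty list means no contradiction was detected by these checks.
    """
    reasons = []

    p = record.get("p_value")
    claimed = record.get("claims_significant")
    if p is None or not (0.0 <= p <= 1.0):
        reasons.append("p-value missing or outside [0, 1]")
    elif claimed is not None and claimed != (p < alpha):
        reasons.append(
            f"interpretation says significant={claimed} but p={p} at alpha={alpha}"
        )

    n_used = record.get("n_used")
    n_units = record.get("n_independent_units")
    if n_used is not None and n_units is not None and n_used > n_units:
        reasons.append(
            f"test used {n_used} samples but design has {n_units} independent units"
        )

    return reasons
```

Any non-empty result means the file stays out of claim wording until its provenance is checked; passing these checks does not by itself validate the comparison family or test method.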
When inputs are incomplete, say so explicitly.
Examples:
Never replace missing evidence with confident prose.
Load only what is needed:
- references/statistical-methods.md - test selection and assumptions
- references/statistical-reporting.md - minimum reporting standard
- references/visualization-best-practices.md - publication-quality figure rules
- references/figure-interpretation.md - how to explain figures with evidence
- references/analysis-depth.md - move from observation to mechanism and decision
- references/common-pitfalls.md - common analysis and reporting failures
- ../research-ideation/references/research-contract.md - shared claim candidate and claim strength contract
- examples/example-analysis-report.md
- examples/example-stats-appendix.md
- examples/example-figure-catalog.md

npx claudepluginhub galaxy-dawn/claude-scholar --plugin claude-scholar