From compound-ml
Profile a dataset and narrate findings in plain language. Use when the user wants to understand their data, explore a CSV/JSON/Parquet file, get a data summary, or says 'explore this data', 'what does this dataset look like', or 'profile my data'.
Slash command
/compound-ml:ml-explore
Automated exploratory data analysis that profiles a dataset and narrates findings in plain language. No ML expertise required — the analysis runs automatically and results are explained without jargon.
This skill does NOT generate embeddings or run ML algorithms. It profiles the data structure, distributions, and quality. For clustering, anomaly detection, or deeper analysis, use ml-cluster, ml-anomalies, or ml-analyze.
The user provides a file path or directory as an argument.
If no argument is provided, scan the working directory for data files and ask which one to profile.
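The fallback scan described above could look roughly like the sketch below. The extension set is an assumption drawn from the formats named in the skill description (CSV, JSON, Parquet), and `find_data_files` is a hypothetical helper name, not part of the skill.

```python
from pathlib import Path

# Assumed set of extensions, based on the formats the description mentions.
DATA_EXTENSIONS = {".csv", ".json", ".parquet"}


def find_data_files(directory="."):
    """Return data files directly inside `directory` (no recursion), sorted."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in DATA_EXTENSIONS
    )


# List the candidates so the user can be asked which one to profile.
for index, path in enumerate(find_data_files(), start=1):
    print(f"{index}. {path.name} ({path.stat().st_size:,} bytes)")
```

Matching on the lowercased suffix means files such as `DATA.CSV` are also picked up, and directories that happen to end in a data extension are skipped.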
Verify Python and pandas are available:
python3 -c "import pandas; print(f'pandas {pandas.__version__}')"
If pandas is not installed, report:
pandas is required for data exploration. Install it with:
uv pip install pandas
See references/setup.md for full setup instructions.
Do not proceed until the environment check passes.
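If the check needs to cover more than one package, the same idea can be written as a small Python helper. This is only a sketch: `missing_modules` and `environment_message` are hypothetical names, and the message wording simply mirrors the report text above.

```python
import importlib.util


def missing_modules(required=("pandas",)):
    """Return the required module names that cannot be found for import."""
    return [name for name in required if importlib.util.find_spec(name) is None]


def environment_message(required=("pandas",)):
    """Build the text to report for the environment check."""
    missing = missing_modules(required)
    if not missing:
        return "Environment check passed."
    names = " ".join(missing)
    return f"{names} is required for data exploration. Install it with:\nuv pip install {names}"


print(environment_message())
```

`importlib.util.find_spec` only locates the module without importing it, so the check stays fast even for heavy packages.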
Run a Python script via Bash(python3 -c "...") that:
Print all output as structured text that the LLM can interpret.
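A minimal sketch of such a profiling script is shown below. It is not the skill's actual script: the choice of statistics, the output layout, and the helper names (`load_sample`, `profile_frame`, `print_profile`) are assumptions, and only the CSV case is handled. The 50,000-row cap mirrors the `nrows=50000` parameter mentioned in this document.

```python
import pandas as pd

SAMPLE_ROWS = 50_000  # assumed sampling cap, matching nrows=50000


def load_sample(path):
    """Read at most SAMPLE_ROWS rows from a CSV file (CSV only in this sketch)."""
    return pd.read_csv(path, nrows=SAMPLE_ROWS)


def profile_frame(df):
    """Collect per-column structure, quality, and distribution facts."""
    columns = []
    for name in df.columns:
        series = df[name]
        info = {
            "name": str(name),
            "dtype": str(series.dtype),
            "missing": int(series.isna().sum()),
            "unique": int(series.nunique()),
        }
        is_number = pd.api.types.is_numeric_dtype(series)
        if is_number and not pd.api.types.is_bool_dtype(series):
            # Basic distribution facts for numeric columns only.
            info["min"] = float(series.min())
            info["max"] = float(series.max())
            info["mean"] = float(series.mean())
        columns.append(info)
    return {"rows": len(df), "columns": columns}


def print_profile(profile):
    """Emit the profile as plain structured text, one column per line."""
    print(f"rows: {profile['rows']}")
    print(f"column_count: {len(profile['columns'])}")
    for col in profile["columns"]:
        print(" | ".join(f"{key}={value}" for key, value in col.items()))
```

A typical call would be `print_profile(profile_frame(load_sample("data.csv")))`, where `data.csv` is a placeholder path.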
Run a Python script that:
.txt, .md, .text files in the directory (non-recursive by default)
Attempt to generate distribution visualizations:
python3 -c "import matplotlib"
Using the profiling output from Phase 2, generate a plain-language narrative summary. The summary must:
ml-cluster for topic discovery or ml-anomalies for outlier detection
ml-cluster
ml-anomalies
ml-rag for building a searchable knowledge base
Present findings as a structured narrative:
## Data Profile: [filename]
**Shape:** [rows] rows x [columns] columns
**Sampled:** [Yes/No — if sampled, note original size]
### Summary
[2-3 sentence plain-language overview of the dataset]
### Column Details
[Table or list of column profiles]
### Data Quality
[Missing values, type issues, notable patterns]
### Suggested Next Steps
[1-3 actionable suggestions based on findings]
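To illustrate how the template above might be filled in, here is a hedged sketch. `render_profile` is a hypothetical helper, and the shape of the `columns` entries (dicts with `name`, `dtype`, and `missing` keys) is an assumption rather than something the skill defines.

```python
def render_profile(filename, rows, columns, summary, quality, next_steps,
                   original_rows=None):
    """Fill the report template; pass `original_rows` only if the data was sampled."""
    if original_rows is None:
        sampled = "No"
    else:
        sampled = f"Yes (original size: {original_rows:,} rows)"
    lines = [
        f"## Data Profile: {filename}",
        f"**Shape:** {rows:,} rows x {len(columns)} columns",
        f"**Sampled:** {sampled}",
        "### Summary",
        summary,
        "### Column Details",
    ]
    lines += [f"- {c['name']} ({c['dtype']}): {c['missing']} missing" for c in columns]
    lines += ["### Data Quality", quality, "### Suggested Next Steps"]
    # The template asks for one to three suggestions, so extras are dropped.
    lines += [f"{i}. {step}" for i, step in enumerate(next_steps[:3], start=1)]
    return "\n".join(lines)
```

The summary and data-quality text are passed in as plain strings because they are the narrated parts of the report; only the fixed structure is generated here.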
nrows=50000 parameter
references/setup.md: Full environment setup guide
references/data-formats.md: Supported formats and column type handling
npx claudepluginhub milasaurus/compound-ml --plugin compound-ml