From synthdata
Extract tabular data from Excel workbooks (.xlsx) to JSON files, one per sheet. Auto-detects whether a sheet has a title-banner row above the headers (synthdata-generate convention) or starts with headers directly. Use this skill when the user wants to convert an Excel file to JSON, extract spreadsheet data, parse an xlsx file, prepare data for downstream analysis tools that don't read Excel natively, or set up a dataset for the other synthdata skills. Also trigger on "extract the data", "parse this spreadsheet", "convert to JSON", or "read this xlsx file".
Slash command: /synthdata:synthdata-extract
Extract every sheet of one or more Excel workbooks to JSON files. Each sheet becomes a JSON array of row objects keyed by header name.
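As a rough illustration of the output shape, the sketch below turns one header row and two data rows into a JSON array of row objects. The helper name `rows_to_records` and the sample data are hypothetical and are not taken from scripts/extract.py.

```python
import json


def rows_to_records(header, rows):
    """Build one dict per data row, keyed by the header names."""
    keys = [str(h) for h in header]
    return [dict(zip(keys, row)) for row in rows]


# Hypothetical sheet contents: one header row, then data rows.
header = ("id", "name", "score")
rows = [(1, "Ada", 9.5), (2, "Grace", 8.0)]

records = rows_to_records(header, rows)
print(json.dumps(records, indent=2))
```

Each element of `records` corresponds to one spreadsheet row, so downstream tools can load the file with any JSON parser.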
pip install openpyxl --break-system-packages
Ask the user which file(s) to extract, or inspect the current directory. Either:
- A single file: ./data.xlsx
- A directory of .xlsx files: ./data/

python scripts/extract.py --input ./data.xlsx --output ./json/
# or multiple files / a directory
python scripts/extract.py --input ./data/ --output ./json/
CLI flags:
| Flag | Default | Description |
|---|---|---|
| --input | . | Input .xlsx file, or directory containing .xlsx files |
| --output | ./json | Output directory for JSON files |
| --title-row | auto | auto (detect), yes (skip row 1), no (row 1 is headers) |
| --flatten | false | If true, output single file per workbook instead of per sheet |
| --indent | 2 | JSON indent (0 for compact) |
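The flags in the table could be declared with `argparse` roughly as follows. This is a sketch, not the actual contents of scripts/extract.py: treating `--flatten` as a boolean switch and mapping `--indent 0` to compact output are assumptions, and `build_parser` and `json_indent` are hypothetical names.

```python
import argparse


def build_parser():
    """Declare the documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(description="Extract Excel sheets to JSON (sketch)")
    parser.add_argument("--input", default=".",
                        help="Input .xlsx file, or directory containing .xlsx files")
    parser.add_argument("--output", default="./json",
                        help="Output directory for JSON files")
    parser.add_argument("--title-row", choices=["auto", "yes", "no"], default="auto",
                        help="auto (detect), yes (skip row 1), no (row 1 is headers)")
    # Assumption: --flatten is a switch that is false unless present.
    parser.add_argument("--flatten", action="store_true",
                        help="Output a single file per workbook instead of per sheet")
    parser.add_argument("--indent", type=int, default=2,
                        help="JSON indent (0 for compact)")
    return parser


def json_indent(indent):
    """Assumption: 0 means compact, so pass None to json.dumps instead of 0."""
    return indent if indent > 0 else None


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--input", "./data/", "--title-row", "no"])
print(args.input, args.output, args.title_row, args.flatten, json_indent(args.indent))
```

Note that `json.dumps(..., indent=0)` still inserts newlines, which is why the sketch converts 0 to `None` to get single-line output.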
The script writes each sheet to <workbook>_<sheet>.json (or <sheet>.json if there is only one workbook).
Print row counts per sheet and the output directory path.
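The file-naming rule above might look like the following minimal sketch. The function name `output_path` and the example workbook and sheet names are hypothetical; the real script may also sanitize sheet names, which this sketch does not attempt.

```python
from pathlib import Path


def output_path(output_dir, workbook_path, sheet_name, single_workbook):
    """Return <sheet>.json for a single workbook, else <workbook>_<sheet>.json."""
    stem = Path(workbook_path).stem  # "sales" for "./data/sales.xlsx"
    if single_workbook:
        name = f"{sheet_name}.json"
    else:
        name = f"{stem}_{sheet_name}.json"
    return Path(output_dir) / name


# Hypothetical inputs: one sheet named "Q1" in ./data/sales.xlsx.
print(output_path("./json", "./data/sales.xlsx", "Q1", single_workbook=False))
print(output_path("./json", "./data/sales.xlsx", "Q1", single_workbook=True))
```

Prefixing with the workbook name only when several workbooks are processed keeps file names short in the common single-file case while avoiding collisions between same-named sheets.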
This skill reads Excel files produced by synthdata-generate by default (row 1 = title banner,
row 2 = column headers, row 3+ = data). It also handles plain sheets (row 1 = headers, row 2+ = data)
when --title-row no is set or when auto-detection determines no title row exists.
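The two layouts can be told apart with a simple heuristic. The sketch below is one possible approach, not the detection logic the script actually uses: it assumes a title banner is a first row with exactly one non-empty cell followed by a row with several. The names `has_title_row` and `split_sheet` are hypothetical, and the rows are plain tuples such as those produced by openpyxl's `ws.iter_rows(values_only=True)`.

```python
def non_empty(row):
    """Return the cells in a row that hold a real value."""
    return [c for c in row if c is not None and str(c).strip() != ""]


def has_title_row(rows):
    """Assumed heuristic: a banner row has one filled cell, a header row has several."""
    if len(rows) < 2:
        return False
    return len(non_empty(rows[0])) == 1 and len(non_empty(rows[1])) > 1


def split_sheet(rows, title_row="auto"):
    """Split rows into (header, data) following the --title-row setting."""
    if title_row == "auto":
        skip = has_title_row(rows)
    else:
        skip = title_row == "yes"
    start = 1 if skip else 0
    if len(rows) <= start:
        return (), []  # nothing left to use as a header
    return rows[start], rows[start + 1:]


# Hypothetical sheets: one with a title banner, one starting with headers.
banner_sheet = [("Customers", None, None), ("id", "name", "city"), (1, "Ada", "London")]
plain_sheet = [("id", "name", "city"), (1, "Ada", "London")]

print(split_sheet(banner_sheet))
print(split_sheet(plain_sheet))
```

Both calls yield the same header and data rows, which is the point of auto-detection: downstream code does not need to know which layout the workbook used.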
npx claudepluginhub rappdw/synthdata