From sdrf-skills
Guides users through contributing an annotated SDRF file for a ProteomeXchange dataset to the bigbio/sdrf-annotated-datasets community repository via a pull request.
Slash command
/sdrf-skills:sdrf-contribute [PXD accession and SDRF file path]
You are helping the user contribute an annotated SDRF file back to the community repository
(bigbio/sdrf-annotated-datasets). This is the final step after annotation, validation,
and review — closing the loop from "I annotated a dataset" to "the community can reuse it."
If the SDRF was produced with /sdrf:annotate, the content is in the conversation.
Check if the PXD already exists in the community repository
(bigbio/sdrf-annotated-datasets):
Look for: datasets/{PXD}/{PXD}.sdrf.tsv
You can check via the GitHub API without cloning:
gh api repos/bigbio/sdrf-annotated-datasets/contents/datasets/{PXD} \
--silent && echo "exists" || echo "new"
Before contributing, the SDRF must pass validation:
Suggest programmatic validation:
pip install sdrf-pipelines
parse_sdrf validate-sdrf --sdrf_file {PXD}.sdrf.tsv
Optionally run /sdrf:validate for a thorough check including ontology verification
Check file structure:
- The file name ends in .sdrf.tsv
Do NOT proceed to contribution if there are validation errors. Warnings are acceptable: mention them but allow the user to proceed.
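The file-structure check can be run as a quick pre-flight before the validator. The sketch below is a hypothetical helper, not part of sdrf-pipelines: `check_sdrf_structure` is an illustrative name, and the rule that every row must have as many tab-separated fields as the header is an assumption about what a well-formed SDRF table looks like.

```shell
# Hypothetical pre-flight helper (not part of sdrf-pipelines).
# $1 = path to the SDRF file.
check_sdrf_structure() {
  file="$1"
  # The file name must end in .sdrf.tsv.
  case "$file" in
    *.sdrf.tsv) ;;
    *) echo "error: file name must end in .sdrf.tsv" >&2; return 1 ;;
  esac
  # The file must exist and be non-empty.
  [ -s "$file" ] || { echo "error: file is missing or empty" >&2; return 1; }
  # Assumption: every row has as many tab-separated fields as the header.
  awk -F'\t' '
    BEGIN { bad = 0 }
    NR == 1 { n = NF; next }
    NF != n {
      printf "error: line %d has %d columns, expected %d\n", NR, NF, n > "/dev/stderr"
      bad = 1
    }
    END { exit bad }
  ' "$file"
}
```

A non-zero exit status means the file should be fixed before running `parse_sdrf validate-sdrf`; a zero status says nothing about ontology terms or required columns, which remain the validator's job.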
The community repository (bigbio/sdrf-annotated-datasets) uses this structure:
datasets/
└── {PXD}/
└── {PXD}.sdrf.tsv
For datasets with multiple sub-experiments:
datasets/
└── {PXD}/
├── {PXD}-celllines.sdrf.tsv
└── {PXD}-tissues.sdrf.tsv
Save the SDRF content to the correct path:
datasets/{PXD}/{PXD}.sdrf.tsv
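The target path can also be derived mechanically from the accession. The following is a minimal sketch: `sdrf_target_path` is a hypothetical helper, the path is relative to the repository root and follows the `datasets/` layout shown above, and the accession pattern (`PXD` followed by six digits) is an assumption.

```shell
# Hypothetical helper: build the repository-relative path for an SDRF file.
# $1 = PXD accession, $2 = optional sub-experiment suffix (e.g. "celllines").
sdrf_target_path() {
  pxd="$1"
  suffix="${2:-}"
  # Assumption: accessions look like "PXD" followed by six digits.
  case "$pxd" in
    PXD[0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) echo "error: unexpected accession: $pxd" >&2; return 1 ;;
  esac
  if [ -n "$suffix" ]; then
    # Multiple sub-experiments: {PXD}-{suffix}.sdrf.tsv
    printf 'datasets/%s/%s-%s.sdrf.tsv\n' "$pxd" "$pxd" "$suffix"
  else
    # Single file: {PXD}.sdrf.tsv
    printf 'datasets/%s/%s.sdrf.tsv\n' "$pxd" "$pxd"
  fi
}
```

For example, `sdrf_target_path PXD012345 tissues` prints `datasets/PXD012345/PXD012345-tissues.sdrf.tsv`.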
Ensure the file:
Ask the user which mode they prefer: Mode A (when the gh CLI is available) or Mode B (guided commands).
In Mode A, execute the full contribution flow:
# 1. Fork the repository (if not already forked)
gh repo fork bigbio/sdrf-annotated-datasets --clone=false
# 2. Clone the user's fork
gh repo clone {username}/sdrf-annotated-datasets /tmp/sdrf-annotated-datasets
cd /tmp/sdrf-annotated-datasets
# 3. Create a branch
git checkout -b annotation/{PXD}
# 4. Create the directory and copy the file
mkdir -p datasets/{PXD}
cp {source_path}/{PXD}.sdrf.tsv datasets/{PXD}/
# 5. Commit
git add datasets/{PXD}/
git commit -m "Add SDRF annotation for {PXD}"
# 6. Push
git push -u origin annotation/{PXD}
# 7. Create the PR
gh pr create \
--repo bigbio/sdrf-annotated-datasets \
--title "Add SDRF annotation for {PXD}" \
--body "$(cat <<'EOF'
## Add SDRF annotation for {PXD}
**Dataset**: [{PXD}](https://www.ebi.ac.uk/pride/archive/projects/{PXD})
**Organism**: {organism}
**Templates**: {template_list}
**Rows**: {row_count} | **Columns**: {col_count}
**Factor values**: {factor_description}
### Validation
- [x] Validated with `sdrf-pipelines validate-sdrf`
- [x] Ontology terms verified via OLS
### Annotation source
Annotated using [sdrf-skills](https://github.com/bigbio/sdrf-skills).
EOF
)"
Before executing, present the full plan to the user and ask for confirmation. Fill in the template variables from the SDRF content:
- {organism}: from characteristics[organism] unique values
- {template_list}: from comment[sdrf template] columns
- {row_count}: number of data rows (excluding header)
- {col_count}: number of columns
- {factor_description}: from factor value[...] column names and unique values
In Mode B, present the same sequence of commands as Mode A, but as a copyable code block with all variables already filled in. The user copies and executes them.
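Several of the template variables can be read directly from the SDRF file rather than filled in by hand. The sketch below is a hypothetical helper (`sdrf_summary` is an illustrative name) that assumes a tab-separated file with a single header row. It covers only {row_count}, {col_count}, and {organism}; {template_list} and {factor_description} would need similar handling of their own columns.

```shell
# Hypothetical helper: summarise an SDRF file for the PR template.
# Assumes a tab-separated file with one header row. $1 = path to the file.
sdrf_summary() {
  awk -F'\t' '
    NR == 1 {
      cols = NF
      # Locate the organism column by its header (case-insensitive).
      for (i = 1; i <= NF; i++) {
        if (tolower($i) == "characteristics[organism]") org_col = i
      }
      next
    }
    {
      rows++
      # Collect unique organism values in order of first appearance.
      if (org_col && !seen[$org_col]++) {
        organisms = organisms (organisms == "" ? "" : ", ") $org_col
      }
    }
    END {
      printf "row_count=%d\n", rows
      printf "col_count=%d\n", cols
      printf "organism=%s\n", organisms
    }
  ' "$1"
}
```

The output is one `name=value` pair per line, so the values can be pasted into the PR body or checked by the user before confirmation.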
Preface with:
Here are the commands to contribute your SDRF annotation to the community repository.
Copy and run them in your terminal:
If the PXD already exists, adjust:
- Commit message: Update SDRF annotation for {PXD}
- PR title: Update SDRF annotation for {PXD}
- Branch: annotation/{PXD}-update
After the PR is created:
- [...] sdrf-pipelines validate-sdrf on the PR
- [...] /sdrf:fix and /sdrf:improve are for)
- If the gh CLI is not installed, always fall back to Mode B (guided commands).
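The naming for new versus existing datasets and the Mode B fallback can each be decided in one place. The sketch below uses two hypothetical helpers, `contribution_names` and `choose_mode`; neither is part of sdrf-skills or the gh CLI. The "exists"/"new" argument is assumed to come from the `gh api` check shown earlier, and `choose_mode` only reports whether Mode A is possible, so the user's stated preference still applies.

```shell
# Hypothetical helper: print branch and title for a new or existing PXD.
# $1 = PXD accession, $2 = "exists" or "new" (as echoed by the gh api check).
contribution_names() {
  pxd="$1"
  state="$2"
  if [ "$state" = "exists" ]; then
    printf 'branch=annotation/%s-update\n' "$pxd"
    printf 'title=Update SDRF annotation for %s\n' "$pxd"
  else
    printf 'branch=annotation/%s\n' "$pxd"
    printf 'title=Add SDRF annotation for %s\n' "$pxd"
  fi
}

# Hypothetical helper: report Mode A only when the given CLI is on PATH.
# $1 = command to look for (defaults to gh).
choose_mode() {
  if command -v "${1:-gh}" >/dev/null 2>&1; then
    echo "A"
  else
    echo "B"
  fi
}
```

The `title` value serves as both the commit message and the PR title, since the two use the same text in each case.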