Skill

curate-genome-assembly

From curation-skills

Process genome assembly datasets for VEuPathDB resources

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/curation-skills:curate-genome-assembly

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

This skill guides processing of genome assembly datasets for VEuPathDB resources.

Supporting Files

TODO.md
resources/curator-branching.md
resources/editing-large-xml.md
resources/step-1-fetch-ncbi.md
resources/step-2-fetch-bioproject.md
resources/step-3-fetch-pubmed.md
resources/step-4-curate-contacts.md
resources/step-5-update-presenter.md
resources/valid-projects.json
scripts/check-repos.sh
scripts/fetch-bioproject.js
scripts/fetch-pubmed.js
scripts/generate-presenter-xml.js

SKILL.md

145 lines · ~1.3k tokens

Stats

Language: JavaScript
Stars: 0
Maintenance: Excellent
Last Commit: Apr 30, 2026

Genome Assembly Dataset Curation

This skill guides processing of genome assembly datasets for VEuPathDB resources.

Prerequisites Check

This workflow requires the following repositories in veupathdb-repos/:

ApiCommonPresenters
EbrcModelCommon

First, run the repository status check to verify repositories are present:

Note: this script is located in the skill directory

bash scripts/check-repos.sh ApiCommonPresenters EbrcModelCommon

If repositories are missing, the script will provide clone instructions.

Branch Confirmation: After verifying that the repositories exist, check each repository's current branch and status using git -C <path>, then confirm with the user before proceeding. Users typically create dataset-specific branches (see the curator branching guidelines in resources/curator-branching.md).

Example:

git -C veupathdb-repos/ApiCommonPresenters branch --show-current
git -C veupathdb-repos/ApiCommonPresenters status -sb

Working Directory (Curation Workspace Directory)

IMPORTANT: All commands in this workflow must be run from your curation workspace directory (the directory that contains veupathdb-repos/ as a subdirectory).

For Claude Code:

DO NOT use cd commands to change into veupathdb-repos/ subdirectories
Use git -C <path> for git operations in subdirectories
Use absolute paths or relative paths from the curation workspace directory
Example: git -C veupathdb-repos/ApiCommonPresenters status instead of cd veupathdb-repos/ApiCommonPresenters && git status
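
The no-cd rule can be enforced in code by always building the git argument list with -C. This is a minimal sketch, not part of the skill's scripts; gitArgs and REPOS_DIR are hypothetical names, and the repository-name check is an added assumption to keep paths inside veupathdb-repos/.

```javascript
// Hypothetical helper: builds argv for `git -C <repo path> ...` so that no
// `cd` into veupathdb-repos/ is ever needed.
const REPOS_DIR = "veupathdb-repos";

function gitArgs(repoName, ...gitCommand) {
  // Assumption: repository names are single path segments.
  if (!repoName || repoName.includes("/") || repoName.includes("..")) {
    throw new Error(`Unexpected repository name: ${repoName}`);
  }
  return ["-C", `${REPOS_DIR}/${repoName}`, ...gitCommand];
}

// Show the equivalent command line for a status check.
console.log(["git", ...gitArgs("ApiCommonPresenters", "status", "-sb")].join(" "));
```

The returned array is suitable for child_process.execFileSync("git", args), run from the curation workspace directory.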

The workflow will create a tmp/ subdirectory in the curation workspace directory for intermediate files.

Required Information

Gather the following before starting:

VEuPathDB project (valid projects are listed in resources/valid-projects.json)
GenBank assembly accession, including the version suffix (e.g., GCA_000988875.2)
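
Both inputs can be checked before any network call is made. The sketch below assumes that GenBank assembly accessions follow the usual GCA_ prefix, nine digits, and a dot-separated version, and that the project list has already been loaded from resources/valid-projects.json as a plain array of names. The file's real structure is not shown here, so that shape is an assumption, and validateInputs is a hypothetical helper rather than one of the skill's scripts.

```javascript
// Assumed pattern for a versioned GenBank assembly accession, e.g. GCA_000988875.2.
const GENBANK_ASSEMBLY = /^GCA_\d{9}\.\d+$/;

// Hypothetical helper: returns a list of problems (empty when both inputs look fine).
// Assumes validProjects is an array of project name strings.
function validateInputs(project, accession, validProjects) {
  const errors = [];
  if (!validProjects.includes(project)) {
    errors.push(`Unknown VEuPathDB project: ${project}`);
  }
  if (!GENBANK_ASSEMBLY.test(accession)) {
    errors.push(`Accession must be a versioned GenBank accession: ${accession}`);
  }
  return errors;
}
```

Returning a list instead of throwing lets the curator see every problem at once.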

Workflow Overview

Step 1: Fetch Assembly Metadata from NCBI

Fetch assembly metadata from NCBI using the GenBank accession.

Command:

curl -X GET "https://api.ncbi.nlm.nih.gov/datasets/v2/genome/accession/<ASSEMBLY_ACCESSION>/dataset_report" \
  -H "Accept: application/json" > tmp/<ASSEMBLY_ACCESSION>_dataset_report.json
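
For callers that prefer Node over curl, the same request can be described as data. This sketch only builds the URL, header, and output path used by the curl command above; datasetReportRequest is a hypothetical helper name.

```javascript
// Base URL taken from the curl command above.
const DATASETS_BASE = "https://api.ncbi.nlm.nih.gov/datasets/v2/genome/accession";

// Hypothetical helper: describes the dataset_report request for one accession
// and where the response should be written under tmp/.
function datasetReportRequest(accession) {
  return {
    url: `${DATASETS_BASE}/${encodeURIComponent(accession)}/dataset_report`,
    headers: { Accept: "application/json" },
    outputPath: `tmp/${accession}_dataset_report.json`,
  };
}
```

The url and headers fields can be passed to the global fetch() in Node 18 or later, with the response body written to outputPath.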

Detailed instructions: Step 1 - Fetch NCBI Metadata (resources/step-1-fetch-ncbi.md)

Step 2: Fetch BioProject Metadata

Extract the BioProject accession from the assembly report and fetch additional details.

Command:

node scripts/fetch-bioproject.js <BIOPROJECT_ACCESSION>

This retrieves the BioProject title and description and saves them to tmp/<BIOPROJECT_ACCESSION>_bioproject.json.
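
The extraction at the start of this step can be sketched as follows. It assumes the Step 1 report carries the accession at reports[0].assembly_info.bioproject_accession. That field path is an assumption about the NCBI Datasets response and should be checked against the saved JSON. extractBioProject is a hypothetical helper.

```javascript
// Hypothetical helper: pulls the BioProject accession out of a parsed
// dataset_report object. The field path below is an assumption.
function extractBioProject(report) {
  const first = report?.reports?.[0];
  const accession = first?.assembly_info?.bioproject_accession;
  if (!accession) {
    throw new Error("No BioProject accession found in dataset report");
  }
  return accession;
}
```

Failing loudly when the field is missing keeps an empty argument from reaching fetch-bioproject.js.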

Detailed instructions: Step 2 - Fetch BioProject (resources/step-2-fetch-bioproject.md)

Step 3: Fetch PubMed Data

Find and fetch publications for the genome assembly.

Command:

node scripts/fetch-pubmed.js <ASSEMBLY_ACCESSION>

Results are saved to tmp/<ASSEMBLY_ACCESSION>_pubmed.json.
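
The Scripts section describes fetch-pubmed.js as using elink + esummary. As a rough illustration of what those two NCBI E-utilities requests look like, the sketch below builds the URLs only. It is not the script's actual code. The function names are hypothetical, and it assumes a numeric BioProject UID is already known, since elink takes a UID rather than an accession.

```javascript
const EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils";

// Builds an elink URL asking for PubMed records linked to a BioProject UID.
function elinkUrl(bioprojectUid) {
  const params = new URLSearchParams({
    dbfrom: "bioproject",
    db: "pubmed",
    id: String(bioprojectUid),
    retmode: "json",
  });
  return `${EUTILS_BASE}/elink.fcgi?${params}`;
}

// Builds an esummary URL for a list of PubMed IDs.
function esummaryUrl(pmids) {
  const params = new URLSearchParams({
    db: "pubmed",
    id: pmids.join(","),
    retmode: "json",
  });
  return `${EUTILS_BASE}/esummary.fcgi?${params}`;
}
```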

Detailed instructions: Step 3 - Fetch PubMed (resources/step-3-fetch-pubmed.md)

Step 4: Curate Contacts

Identify and curate contact entries for the genome submission.

Contact identification priority:

Named submitter from assembly metadata
Senior/last author from PubMed publications (if available)
Curator judgment for additional contacts
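
The priority order above can be sketched as a ranking function. The candidate shape ({ name, source }) and the source labels are hypothetical, chosen only to mirror the three origins in the list. Deduplication by lowercased name is an added assumption.

```javascript
// Source labels in priority order, mirroring the list above (labels are hypothetical).
const SOURCE_PRIORITY = ["submitter", "senior_author", "curator"];

// Hypothetical helper: orders candidates by source priority and drops
// repeated names, keeping the highest-priority occurrence.
function rankContacts(candidates) {
  const seen = new Set();
  return candidates
    .filter((c) => SOURCE_PRIORITY.includes(c.source))
    .sort((a, b) => SOURCE_PRIORITY.indexOf(a.source) - SOURCE_PRIORITY.indexOf(b.source))
    .filter((c) => {
      const key = c.name.trim().toLowerCase();
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    });
}
```

The ranked list is only a starting point. The curator still reviews the choices, as described under Actions.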

Actions:

Search existing contacts in veupathdb-repos/EbrcModelCommon/Model/lib/xml/datasetPresenters/contacts/allContacts.xml
Create new contact entries if needed
Present choices to curator for review

Detailed instructions: Step 4 - Curate Contacts (resources/step-4-curate-contacts.md)

Step 5: Generate and Insert Presenter XML

Generate the datasetPresenter XML and insert it into the appropriate presenter file.

Command:

node scripts/generate-presenter-xml.js <ASSEMBLY_ACCESSION> <PROJECT> <PRIMARY_CONTACT_ID> [ADDITIONAL_CONTACT_IDS...]

Target file: veupathdb-repos/ApiCommonPresenters/Model/lib/xml/datasetPresenters/<PROJECT>.xml
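
The argument order and target path can be captured in one place so they stay consistent. This is a minimal sketch. presenterInvocation is a hypothetical helper, and the contact IDs used with it are placeholders because the ID format is not specified here.

```javascript
// Hypothetical helper: assembles the generate-presenter-xml.js invocation and
// the presenter file that the generated XML is destined for.
function presenterInvocation(accession, project, primaryContactId, additionalContactIds = []) {
  return {
    command: "node",
    args: [
      "scripts/generate-presenter-xml.js",
      accession,
      project,
      primaryContactId,
      ...additionalContactIds,
    ],
    targetFile: `veupathdb-repos/ApiCommonPresenters/Model/lib/xml/datasetPresenters/${project}.xml`,
  };
}
```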

Detailed instructions: Step 5 - Update Presenter Files (resources/step-5-update-presenter.md)

Next Steps

After completing this workflow:

Review generated XML for TODO fields that require curator input
Commit changes to dataset branch (curator handles git operations)
Create pull request for review (curator handles PR creation)
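
The first review item can be supported by a quick scan of the generated XML. The sketch assumes unfinished fields are marked with the literal text TODO. The exact marker the generator emits is not shown here, and findTodoLines is a hypothetical helper.

```javascript
// Hypothetical helper: lists 1-based line numbers and text for every line
// containing the literal marker "TODO" (assumed marker).
function findTodoLines(xmlText) {
  return xmlText
    .split(/\r?\n/)
    .map((text, index) => ({ line: index + 1, text: text.trim() }))
    .filter((entry) => entry.text.includes("TODO"));
}
```

An empty result does not replace curator review. It only confirms that no literal TODO markers remain.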

Resources

Scripts

scripts/fetch-bioproject.js - Fetches BioProject metadata from NCBI (esearch + esummary)
scripts/fetch-pubmed.js - Fetches PubMed records linked to a BioProject (elink + esummary)
scripts/generate-presenter-xml.js - Generates datasetPresenter XML from fetched metadata
scripts/check-repos.sh - Validates veupathdb-repos/ repository setup (synced from shared/)
