Skill

metabolomics-curation

Comprehensive toolkit for curating untargeted metabolomics features from LC-MS data. Use when analyzing metabolomics feature tables from FGCZ or other platforms, reducing misannotation rates through automated QC metrics (CV%, blank ratios), duplicate resolution, and semi-automated flagging. Supports Excel/CSV inputs with ~100-300 features and generates interactive HTML reports following FGCZ standards.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/metabolomics-data-analysis:metabolomics-curation

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Assist with curating metabolomics features from untargeted LC-MS data analysis. Untargeted metabolomics typically produces 100-300 putative metabolite features, of which 60-70% are low-quality (poor reproducibility, background noise, or duplicate annotations). This skill provides a semi-automated workflow to:

Supporting Files

PYTHON_HTML_REPORT.md
assets/fgcz_curation_report_template.Rmd
references/curation_workflow.md
references/quality_thresholds.md
scripts/calculate_qc_metrics.py
scripts/generate_html_report.py
scripts/generate_html_report_v2.py
scripts/generate_report_with_check.py
scripts/parse_metabolomics_table.py
scripts/resolve_duplicates.py

SKILL.md

682 lines · ~6.1k tokens (exceeds 5k compaction limit)

Stats

Language: Python

Parent stars: 0

Maintenance: Fair

Last Commit: May 18, 2026

Metabolomics Feature Curation

Overview

The semi-automated workflow has four stages:

Parse feature tables - Extract sample metadata from column headers (blanks, QC, biological groups)
Calculate QC metrics - Compute CV% and biological/blank ratios, and flag features for review
Resolve duplicates - Rank duplicate annotations by quality score and recommend the best representatives
Generate reports - Create interactive HTML QC reports with visualizations

The workflow reduces manual curation time from hours to minutes while improving consistency and reproducibility.

When to Use This Skill

Invoke this skill when users request:

"Curate metabolomics features"
"QC metabolomics data"
"Remove low-quality metabolites"
"Resolve duplicate metabolite annotations"
"Filter metabolomics features by CV% or blank ratios"
Also invoke it when users provide Excel/CSV files with metabolomics features from Compound Discoverer, XCMS, MS-DIAL, or similar platforms

Typical file structure:

100-300 rows (features/metabolites)
100-150 columns (identification, annotations, sample measurements)
Column headers like "Area [SampleName]" or "Norm. Area [SampleName]"
Columns for technical QC (Blank, Pooled QC) and biological samples

Workflow Decision Tree

When a user requests metabolomics curation, follow this decision tree:

Is the input file a metabolomics feature table?
│
├─ YES → Does it have Area/Intensity columns for samples?
│   │
│   ├─ YES → Proceed with Step 1: Parse Metadata
│   │
│   └─ NO → Ask user to provide raw feature table (not processed/filtered)
│
└─ NO → Ask user for clarification or correct file type
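
The second branch of the tree can be checked programmatically before any script is run. The following is a minimal sketch, assuming sample columns follow the "Area [SampleName]" / "Norm. Area [SampleName]" header convention described above. The function name `find_area_columns` is hypothetical and is not part of the bundled scripts.

```python
import re

# Matches "Area [Sample]" and "Norm. Area [Sample]" headers (assumed convention).
AREA_HEADER = re.compile(r"^(Norm\. )?Area \[(?P<sample>.+)\]$")


def find_area_columns(columns):
    """Return the headers that look like per-sample area columns."""
    return [c for c in columns if AREA_HEADER.match(str(c).strip())]


headers = [
    "Name",
    "RT [min]",
    "Area [o38905__Untreated_2]",
    "Norm. Area [o38905__Untreated_2]",
    "Group Area: Untreated",  # group statistic, not a sample column
]
area_cols = find_area_columns(headers)

if area_cols:
    print(f"Proceed with Step 1 ({len(area_cols)} sample columns found)")
else:
    print("Ask the user for the raw feature table")
```

Tables exported with "Intensity" rather than "Area" headers would need the pattern extended accordingly.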

Workflow Steps:

Parse Metadata - Extract sample types from column headers
Calculate QC Metrics - Compute CV%, ratios, flag features
Resolve Duplicates - Rank and recommend best representatives
Generate Report - Create interactive HTML with flagged features
Manual Review - User reviews flagged features and duplicates
Export Curated Table - Final filtered feature list

Step 1: Parse Feature Table and Extract Metadata

Goal: Load metabolomics feature table and auto-detect sample types.

Script: scripts/parse_metabolomics_table.py

What it does:

Loads Excel/CSV feature table
Parses technical information from filename and columns:
- Ionization mode (ESI+/ESI- from filename suffix)
- MS instrument type (Orbitrap, QToF from annotation columns)
- Chromatography method (HILIC, C18 from column references)
- Dataset IDs (oXXXXX patterns from filename)
- MS/MS acquisition mode (MS1, MS2 from filename)
- Acquisition software (Compound Discoverer, UNIFI, MassHunter)
- Why this matters: Technical configuration constrains metabolite search space (e.g., ESI-/HILIC favors polar compounds)
Detects column structure (Area, Norm. Area, Group statistics, annotations)
Extracts sample names from column headers (e.g., "Area [o38905__Untreated_2]" → "Untreated_2")
Auto-classifies sample types:
- Blank - Extraction/run blanks (pattern: "blank")
- Pooled QC - Technical QC replicates (pattern: "pooled qc", not "qcdil")
- Pooled QC Dilution - Serial QC dilutions (pattern: "pooled qc dil")
- Standards - Technical mixtures (pattern: "108mix", "150mix")
- Biological - All other samples (user groups)
Outputs metadata JSON for downstream steps
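
The header parsing and sample-type rules listed above can be sketched as follows. This is an illustration only, not the code in scripts/parse_metabolomics_table.py: the function names, the regular expressions, and the way separators are normalized before matching are all assumptions.

```python
import re

HEADER = re.compile(r"^(?:Norm\. )?Area \[(?P<raw>.+)\]$")
DATASET_PREFIX = re.compile(r"^o\d+_+")  # assumed prefix form, e.g. "o38905__"


def sample_name(header):
    """'Area [o38905__Untreated_2]' -> 'Untreated_2'; None if not an area column."""
    match = HEADER.match(header.strip())
    if match is None:
        return None
    return DATASET_PREFIX.sub("", match.group("raw"))


def classify(name):
    """Assign a sample type; the more specific dilution pattern is tested first."""
    key = re.sub(r"[\s_\-]+", " ", name.lower())  # treat "_", "-" and spaces alike
    if "blank" in key:
        return "Blank"
    if "pooled qc dil" in key or "qcdil" in key.replace(" ", ""):
        return "Pooled QC Dilution"
    if "pooled qc" in key:
        return "Pooled QC"
    if "108mix" in key or "150mix" in key:
        return "Standards"
    return "Biological"


for header in ["Area [o38905__Untreated_2]",
               "Area [o38905__Blank_1]",
               "Area [o38905__Pooled_QC_dil_1]"]:
    name = sample_name(header)
    print(f"{name} -> {classify(name)}")
```

Testing the dilution pattern before the plain pooled QC pattern keeps dilution series out of the CV% calculation, which is why the order of the checks matters.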

Usage:

# Basic usage (auto-detect all samples)
python scripts/parse_metabolomics_table.py input_features.xlsx --output metadata.json

# Interactive mode (review and adjust classifications)
python scripts/parse_metabolomics_table.py input_features.xlsx --output metadata.json --interactive

Expected output:

✅ Loaded 261 features × 141 columns

✅ Technical information detected:
   Ionization: ESI-
   Instrument: Orbitrap (Thermo Fisher)
   Chromatography: HILIC (Hydrophilic Interaction)
   MS Level: MS1, MS2
   Dataset IDs: o38905, o39174
   Fragmentation: Stepped NCE (HCD)

✅ Detected column structure:
   identification: 10 columns
   annotation: 15 columns
   group_statistics: 16 columns
   differential_analysis: 28 columns
   area_columns: 33 columns
   normalized_area_columns: 33 columns

✅ Sample classification:
   Blank samples: 7
   QC samples: 9
   Standard samples: 5
   Biological groups: 3
     Groups: Starvation, Untreated, Washout

✅ Metadata saved to metadata.json

User interaction (if needed):

If biological group names are unclear (e.g., "Sample_1", "Sample_2"), ask user to specify groupings
Verify auto-detected blanks and QC samples are correct
User can manually edit metadata.json if classifications are wrong

Tips:

Always use --interactive mode for first-time analysis of a new dataset
Check that blanks and QC samples are correctly identified (critical for QC metrics)
If sample names don't match patterns, user may need to rename columns or edit metadata manually

Step 2: Calculate QC Metrics and Flag Features

Goal: Compute quality metrics and flag features that fail QC thresholds.

Script: scripts/calculate_qc_metrics.py

What it does:

Calculates Pooled QC CV% for each feature (technical reproducibility)
Calculates Biological/Blank ratios for each biological group (signal-to-noise)
Flags features based on FGCZ thresholds:
- flag_high_cv: CV% > 30% (poor reproducibility)
- flag_low_signal: Biological/Blank ratio < 5× (low signal)
- flag_for_review: Fails any criterion
Adds flag reasons as text descriptions
Outputs full feature table with QC metrics appended

FGCZ Quality Thresholds:

CV% cutoff: 30% (adjustable via --cv-threshold)
Blank ratio minimum: 5× (adjustable via --blank-ratio)
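
The metrics and flags can be sketched with pandas as below. This is not the implementation in scripts/calculate_qc_metrics.py. It assumes CV% is the sample standard deviation divided by the mean across pooled QC injections, and that each blank ratio is the group mean divided by the blank mean (the script may use medians or another estimator). The function name `qc_metrics` and the toy column names are hypothetical.

```python
import numpy as np
import pandas as pd


def qc_metrics(areas, qc_cols, blank_cols, groups, cv_threshold=30.0, blank_ratio=5.0):
    """Append CV%, per-group blank ratios and flags to a copy of `areas`.

    `groups` maps a biological group name to its list of sample columns.
    """
    out = areas.copy()
    qc = areas[qc_cols]
    # CV% = 100 * sample standard deviation / mean across pooled QC injections
    out["Pooled_QC_CV_percent"] = 100 * qc.std(axis=1, ddof=1) / qc.mean(axis=1)

    # A zero blank mean is treated as missing; any ratio that cannot be computed
    # is set to +inf so that it never triggers the low-signal flag.
    blank_mean = areas[blank_cols].mean(axis=1).replace(0, np.nan)
    ratio_cols = []
    for name, cols in groups.items():
        col = f"{name}_to_Blank_ratio"
        out[col] = (areas[cols].mean(axis=1) / blank_mean).fillna(np.inf)
        ratio_cols.append(col)

    out["min_blank_ratio"] = out[ratio_cols].min(axis=1)
    out["flag_high_cv"] = out["Pooled_QC_CV_percent"] > cv_threshold
    out["flag_low_signal"] = out["min_blank_ratio"] < blank_ratio
    out["flag_for_review"] = out["flag_high_cv"] | out["flag_low_signal"]
    return out


# Two toy features: one reproducible with strong signal, one noisy near blank level
areas = pd.DataFrame({
    "QC_1": [100.0, 100.0], "QC_2": [110.0, 200.0], "QC_3": [90.0, 30.0],
    "Blank_1": [10.0, 50.0],
    "Untreated_1": [200.0, 100.0], "Untreated_2": [220.0, 120.0],
})
result = qc_metrics(areas, ["QC_1", "QC_2", "QC_3"], ["Blank_1"],
                    {"Untreated": ["Untreated_1", "Untreated_2"]})
print(result[["Pooled_QC_CV_percent", "min_blank_ratio", "flag_for_review"]])
```

Flagging on the minimum ratio across groups means a feature is marked low-signal as soon as any one biological group falls below the blank threshold.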

Usage:

# Standard FGCZ thresholds
python scripts/calculate_qc_metrics.py input_features.xlsx metadata.json --output qc_results.xlsx

# Save only flagged features (for quick review)
python scripts/calculate_qc_metrics.py input_features.xlsx metadata.json --output flagged_features.xlsx --flagged-only

# Custom thresholds (e.g., exploratory study)
python scripts/calculate_qc_metrics.py input_features.xlsx metadata.json --output qc_results.xlsx --cv-threshold 40 --blank-ratio 3

Expected output:

✅ Loaded 261 features
✅ Loaded metadata with 9 QC samples
Calculating CV% across 9 QC samples...
✅ Calculated 3 biological/blank ratios

============================================================
QC METRICS SUMMARY
============================================================
Total features: 261
Flagged for review: 172 (65.9%)
  - High CV% (>30%): 44
  - Low signal (<5× blank): 165
Passing QC: 89 (34.1%)

✅ Saved 261 features with QC metrics to qc_results.xlsx

Output file structure:

New columns added to original data:

Pooled_QC_CV_percent - CV% across pooled QC samples
[Group]_to_Blank_ratio - Ratio for each biological group (e.g., Untreated_to_Blank_ratio)
min_blank_ratio - Minimum ratio across all biological groups
flag_high_cv - Boolean: CV% > threshold
flag_low_signal - Boolean: Any ratio < threshold
flag_for_review - Boolean: Fails any criterion
flag_reason - Text description of why flagged

Tips:

Typical result: ~60-70% of features flagged (this is expected for untargeted data)
Review flagged features to understand data quality issues
Adjust thresholds if too many/few features flagged (see references/quality_thresholds.md)
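
Before re-running the script with new thresholds, it can help to see how the flagged fraction responds to them. The sketch below assumes a table that already contains the Pooled_QC_CV_percent and min_blank_ratio columns described above. The helper name `flagged_fraction` and the toy values are hypothetical.

```python
import pandas as pd


def flagged_fraction(qc, cv_threshold, blank_ratio):
    """Fraction of features failing either criterion at the given thresholds."""
    high_cv = qc["Pooled_QC_CV_percent"] > cv_threshold
    low_signal = qc["min_blank_ratio"] < blank_ratio
    return float((high_cv | low_signal).mean())


# Toy values standing in for the QC columns of qc_results.xlsx
qc = pd.DataFrame({
    "Pooled_QC_CV_percent": [8.5, 12.3, 33.0, 45.2, 38.9],
    "min_blank_ratio": [21.0, 4.2, 9.0, 1.5, 6.0],
})

for cv, ratio in [(20, 10), (30, 5), (40, 3)]:
    share = flagged_fraction(qc, cv, ratio)
    print(f"CV > {cv}%, ratio < {ratio}x: {share:.0%} flagged")
```

With real data, `qc` would be loaded with `pd.read_excel("qc_results.xlsx")` instead of being built by hand.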

Step 3: Resolve Duplicate Annotations

Goal: Identify features with duplicate compound names and recommend which to keep.

Script: scripts/resolve_duplicates.py

What it does:

Identifies features with duplicate compound names (same Name, different RT/m/z)
Ranks duplicates within each group using composite quality score:
- RT consistency (45% weight): Features closer to median RT in group
- Low CV% (45% weight): Lower CV% = better reproducibility
- MS2 match quality (10% weight): Better spectral match (if available)
- NOT used: Peak area (does not distinguish isobaric compounds)
Recommends actions:
- KEEP (best): Highest quality score
- ALTERNATIVE (close score): Within 10% of best (manual review)
- REMOVE: All others
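
The weighting and recommendation logic can be sketched as below. The exact composite score formula is documented in references/curation_workflow.md and is not reproduced here, so the scaling of each component to 0-1 is an assumption. "Within 10% of best" is also interpreted, as an assumption, as a score of at least 90% of the group's best score. The column names RT, CV and MS2_match are hypothetical.

```python
import pandas as pd

WEIGHTS = {"rt": 0.45, "cv": 0.45, "ms2": 0.10}  # weights as listed above


def rank_duplicates(df, close_margin=0.10):
    """Score and rank features that share a compound Name.

    Assumed component scores (each 0-1, higher = better):
      rt  : 1 - |RT - group median RT| / largest deviation in the group
      cv  : 1 - CV% / 100, clipped to [0, 1]
      ms2 : match score / 100, or 0 when no MS2 match is available
    """
    out = df.copy()
    dev = (out["RT"] - out.groupby("Name")["RT"].transform("median")).abs()
    max_dev = dev.groupby(out["Name"]).transform("max")
    out["rt_score"] = 1 - (dev / max_dev).fillna(0)  # 0/0 when all RTs are equal
    out["cv_score"] = (1 - out["CV"] / 100).clip(0, 1)
    out["ms2_score"] = (out["MS2_match"] / 100).fillna(0).clip(0, 1)
    out["composite_score"] = (WEIGHTS["rt"] * out["rt_score"]
                              + WEIGHTS["cv"] * out["cv_score"]
                              + WEIGHTS["ms2"] * out["ms2_score"])
    out["rank"] = (out.groupby("Name")["composite_score"]
                   .rank(method="first", ascending=False).astype(int))
    best = out.groupby("Name")["composite_score"].transform("max")
    out["recommendation"] = "REMOVE"
    close = out["composite_score"] >= (1 - close_margin) * best
    out.loc[close, "recommendation"] = "ALTERNATIVE (close score)"
    out.loc[out["rank"] == 1, "recommendation"] = "KEEP (best)"
    return out.sort_values(["Name", "rank"])


features = pd.DataFrame({
    "Name": ["D-(+)-Glucose"] * 4,
    "RT": [5.73, 5.89, 4.21, 6.12],
    "CV": [8.5, 12.3, 45.2, 38.9],
    "MS2_match": [90, 85, None, 40],
})
ranked = rank_duplicates(features)
print(ranked[["RT", "composite_score", "rank", "recommendation"]])
```

Because the component scaling here is assumed, the scores it produces will not match those printed by resolve_duplicates.py; only the structure of the ranking is illustrated.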

Usage:

# Standard duplicate resolution
python scripts/resolve_duplicates.py qc_results.xlsx metadata.json --output duplicates_resolved.xlsx

Expected output:

✅ Loaded 261 features with QC metrics

✅ Found 38 compounds with duplicates
   Total duplicate features: 179

✅ Ranked 179 duplicate features

============================================================
DUPLICATE RESOLUTION SUMMARY
============================================================
Compounds with duplicates: 38
Total duplicate features: 179

Recommendations:
  KEEP (best): 38
  ALTERNATIVE (close score): 12
  REMOVE: 129

Expected reduction: 129 features removed

Example resolutions (top 5 compounds):

D-(+)-Glucose: 9 entries
  RT=5.73, CV=8.5%, Score=0.842 → KEEP (best)
  RT=5.89, CV=12.3%, Score=0.798 → ALTERNATIVE (close score)
  RT=4.21, CV=45.2%, Score=0.321 → REMOVE
  RT=6.12, CV=38.9%, Score=0.298 → REMOVE
  ...

✅ Saved duplicate resolution to duplicates_resolved.xlsx

Output file structure:

All duplicate features with added columns:

composite_score - Weighted quality score (0-1, higher = better)
rank - Rank within duplicate group (1 = best)
recommendation - Action to take ("KEEP (best)", "ALTERNATIVE (close score)", or "REMOVE")

Tips:

Review "ALTERNATIVE" recommendations manually (close quality scores)
Use biological knowledge to override recommendations if needed
Typical result: ~70% of duplicate features removed (keep 1-2 best per compound)

Detailed documentation: See references/curation_workflow.md for composite score formula and rationale

Step 4: Generate Interactive QC Report

Goal: Create HTML report with QC visualizations and interactive tables.

Template: assets/fgcz_curation_report_template.Rmd

What it does:

Loads QC results and duplicate resolution tables
Generates interactive HTML report with:
- QC metrics summary: CV% distribution, blank ratio boxplots, feature flag counts
- Duplicate summary: Recommendation breakdown, quality score plots
- Flagged features table: Interactive, filterable table of features for review
- Duplicate details table: Ranked duplicates with scores and recommendations
- Curation recommendations: Suggested next steps
Uses FGCZ templates for consistent styling (ezRun package)
All plots saved as 300 DPI PNG for publication quality

Usage:

# Option 1: Copy template to working directory and customize
cp assets/fgcz_curation_report_template.Rmd ./curation_report.Rmd

# Edit paths in curation_report.Rmd:
# qc_results_path <- "qc_results.xlsx"
# duplicates_path <- "duplicates_resolved.xlsx"

# Render report
module load Dev/R/4.5.0
Rscript -e "rmarkdown::render('curation_report.Rmd')"

# Output: curation_report.html

Expected output:

✅ Loaded 261 features with QC metrics
✅ Loaded 179 duplicate features

Rendering curation_report.Rmd...
Output created: curation_report.html

Report sections:

Overview - Dataset summary, analysis timestamp
QC Metrics Summary - CV% distribution, blank ratios, flags
Duplicate Resolution - Recommendation counts, quality scores
Flagged Features for Manual Review - Interactive table
Recommendations - Suggested curation workflow and export code

Tips:

Use interactive tables (DataTables) to sort/filter flagged features efficiently
Check CV% distribution for instrument performance issues
Verify blank ratios make sense for sample type (plasma, tissue, etc.)
Save HTML report with results for reproducibility

Customization:

Adjust threshold lines in plots if using custom thresholds
Add dataset-specific sections (e.g., pathway enrichment, batch effects)
Modify YAML header to change report title/author

Step 5: Manual Review and Curation Decisions

Goal: User reviews flagged features and duplicate recommendations, makes final curation decisions.

Process (semi-automated):

Review flagged features in interactive HTML table:
- Sort by CV%, blank ratio, or compound name
- Filter by flag reason (high CV, low signal)
- Decide which flagged features to keep vs. remove:
  - Keep if: Biologically important (key pathway metabolite), low abundance but real signal
  - Remove if: Background noise, contaminant, poor annotation confidence
Review duplicate recommendations:
- Verify "KEEP (best)" selections make sense (check RT, CV%, biological relevance)
- Manually inspect "ALTERNATIVE (close score)" cases:
  - Compare RT, MS2 spectra (if available), biological context
  - Choose best representative or keep both if truly different isomers
- Confirm "REMOVE" recommendations (usually safe to remove automatically)
Document decisions:
- Note which flagged features were manually kept (and why)
- Record any duplicate recommendations overridden
- Save curation notes for reproducibility

User workflow:

# In R console or RStudio (after rendering report)

# Load QC results
library(readxl)
library(tidyverse)

qc_results <- read_excel("qc_results.xlsx")
duplicates_ranked <- read_excel("duplicates_resolved.xlsx")

# Example: Review flagged features
flagged <- qc_results |> filter(flag_for_review)
View(flagged)

# Example: Check which duplicates to remove
to_remove <- duplicates_ranked |> filter(recommendation == "REMOVE")
View(to_remove)

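# Example: Generate final curated table
# Duplicates share a Name, so join on a feature-level key; joining on "Name"
# alone would also drop the KEEP (best) representative of each compound.
# Assumed key columns: adjust to the RT and m/z column names in your table.
feature_key <- c("Name", "RT [min]", "m/z")

curated_features <- qc_results |>
  # Remove flagged features (or apply custom filter)
  filter(!flag_for_review) |>
  # Remove duplicate features (keep only KEEP and ALTERNATIVE)
  anti_join(to_remove, by = feature_key)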

# Save curated table
write_csv(curated_features, "curated_features.csv")

cat(sprintf("✅ Saved %d curated features (%.1f%% reduction)\n",
            nrow(curated_features),
            100 * (1 - nrow(curated_features) / nrow(qc_results))))

Expected reduction:

Typical: 60-70% of features removed
Example: 261 → 82 features (68.6% reduction)

Quality improvement metrics:

Signal/Noise (biological/blank): ~3× improvement
CV% reproducibility: ~25% improvement
Duplicate features: ~90% reduction

Step 6: Export Final Curated Table

Goal: Generate final curated feature table for downstream analysis.

User workflow:

# Duplicates to drop, matched on a feature-level key. Joining on "Name" alone
# would also drop the KEEP (best) representative of each compound.
# Assumed key columns: adjust to the RT and m/z column names in your table.
feature_key <- c("Name", "RT [min]", "m/z")
to_remove <- duplicates_ranked |> filter(recommendation == "REMOVE")

# Option 1: Fully automated (remove all flagged and duplicate features)
curated_features <- qc_results |>
  filter(!flag_for_review) |>
  anti_join(to_remove, by = feature_key)

write_csv(curated_features, "curated_features.csv")

# Option 2: Semi-automated (keep specific flagged features)
# Manually create list of features to keep despite flags
keep_despite_flags <- c("Glutathione", "Taurine", "Hypotaurine")

curated_features <- qc_results |>
  filter(!flag_for_review | Name %in% keep_despite_flags) |>
  anti_join(to_remove, by = feature_key)

write_csv(curated_features, "curated_features.csv")

# Option 3: Export separate tables for review
# Passing QC features
passing_qc <- qc_results |> filter(!flag_for_review)
write_csv(passing_qc, "features_passing_qc.csv")

# Flagged for manual review
flagged_features <- qc_results |> filter(flag_for_review)
write_csv(flagged_features, "features_flagged.csv")

# Best duplicate representatives
best_duplicates <- duplicates_ranked |> filter(recommendation == "KEEP (best)")
write_csv(best_duplicates, "best_duplicate_representatives.csv")
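
For users who prefer to stay in Python, the semi-automated export can be sketched with pandas. This is an illustration under assumptions: the function name `curate` is hypothetical, and the pair (Name, RT) is assumed to identify one feature. A dedicated feature ID column, if the table has one, would be a safer key than a floating-point retention time.

```python
import pandas as pd

FEATURE_KEY = ["Name", "RT"]  # assumed columns that together identify one feature


def curate(qc, duplicates, keep_despite_flags=()):
    """Drop flagged features and REMOVE-ranked duplicates; keep listed exceptions."""
    keep = ~qc["flag_for_review"] | qc["Name"].isin(list(keep_despite_flags))
    passing = qc[keep]
    to_remove = duplicates.loc[duplicates["recommendation"] == "REMOVE", FEATURE_KEY]
    # Anti-join: left merge with an indicator, keeping rows absent from to_remove
    merged = passing.merge(to_remove.drop_duplicates(), on=FEATURE_KEY,
                           how="left", indicator=True)
    return merged[merged["_merge"] == "left_only"].drop(columns="_merge")


qc = pd.DataFrame({
    "Name": ["Glucose", "Glucose", "Taurine", "Alanine"],
    "RT": [5.73, 4.21, 3.10, 2.50],
    "flag_for_review": [False, False, True, True],
})
duplicates = pd.DataFrame({
    "Name": ["Glucose", "Glucose"],
    "RT": [5.73, 4.21],
    "recommendation": ["KEEP (best)", "REMOVE"],
})

curated = curate(qc, duplicates, keep_despite_flags=["Taurine"])
print(curated)
# curated.to_csv("curated_features.csv", index=False)  # uncomment to write the file
```

Matching on the feature-level key rather than on Name alone is what lets the best Glucose entry survive while its lower-ranked duplicate is removed.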

Final deliverables:

curated_features.csv - Final filtered feature table
curation_report.html - Interactive QC report with visualizations
metadata.json - Sample classifications
qc_results.xlsx - Full feature table with QC metrics
duplicates_resolved.xlsx - Ranked duplicates with recommendations

Share results:

Upload to gstore: g-req copynow curated_features.csv /srv/gstore/projects/pXXXXX/Analyses_Metabolomics/
Include curation report for reproducibility
Document parameters used (CV% threshold, blank ratio, etc.)

Adjusting Quality Thresholds

Default FGCZ thresholds work for most untargeted metabolomics experiments, but can be adjusted for specific use cases.

When to adjust:

Experiment Type	CV% Threshold	Blank Ratio	Rationale
Standard untargeted	30%	5×	Default FGCZ settings
Targeted metabolomics	20%	10×	Stricter (uses standards)
Exploratory/discovery	40%	3×	More lenient (hypothesis generation)
Plasma/serum	30%	5×	Standard
Urine	35%	4×	Higher variability
Tissue	25%	7×	More homogeneous
Plant extracts	35%	3×	Complex matrix

How to adjust:

# Example: Exploratory study with relaxed thresholds
python scripts/calculate_qc_metrics.py input_features.xlsx metadata.json \
  --output qc_results.xlsx \
  --cv-threshold 40 \
  --blank-ratio 3

# Example: Targeted metabolomics with strict thresholds
python scripts/calculate_qc_metrics.py input_features.xlsx metadata.json \
  --output qc_results.xlsx \
  --cv-threshold 20 \
  --blank-ratio 10
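
When several datasets of different types are processed, the threshold table can be kept in one place and turned into command lines. The sketch below is a convenience wrapper under assumptions: the preset keys and the function name `qc_command` are hypothetical, while the script path and the --output, --cv-threshold and --blank-ratio flags are the ones shown above.

```python
import shlex

# (CV% threshold, blank ratio) per experiment type, copied from the table above
PRESETS = {
    "standard": (30, 5),
    "targeted": (20, 10),
    "exploratory": (40, 3),
    "plasma": (30, 5),
    "urine": (35, 4),
    "tissue": (25, 7),
    "plant": (35, 3),
}


def qc_command(features, metadata, output, experiment="standard"):
    """Build the calculate_qc_metrics.py command line for an experiment type."""
    cv, ratio = PRESETS[experiment]
    args = ["python", "scripts/calculate_qc_metrics.py", features, metadata,
            "--output", output,
            "--cv-threshold", str(cv),
            "--blank-ratio", str(ratio)]
    return shlex.join(args)  # quotes arguments that contain spaces


command = qc_command("input_features.xlsx", "metadata.json", "qc_results.xlsx", "urine")
print(command)
```

The returned string can be pasted into a shell or a SLURM script. Recording it alongside the results also documents which thresholds were used.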

Detailed threshold guidelines: See references/quality_thresholds.md for:

Threshold rationale and calculations
Sample type considerations
Historical FGCZ performance data
When to override automated recommendations

Common Issues and Troubleshooting

Issue: Too many features flagged (>80%)

Possible causes:

Thresholds too stringent for experiment type
Poor overall data quality (instrument issues)
Incorrect sample classifications (blanks/QC misidentified)

Solutions:

Check sample classifications in metadata.json (are blanks correctly identified?)
Relax CV% threshold (try 40% instead of 30%)
Relax blank ratio (try 3× instead of 5×)
Review instrument QC metrics (sensitivity, mass accuracy)

Issue: No duplicates detected despite obvious duplicates

Possible causes:

Compound names differ slightly (e.g., "Glycine" vs "L-Glycine")
Extra characters in names (spaces, special characters)

Solutions:

Normalize compound names before running duplicate resolution
Manually group obvious duplicates in metadata
Use fuzzy string matching for similar names (edit script)
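
Name normalization and fuzzy matching can be sketched with the Python standard library. This is not part of resolve_duplicates.py: the function names and the 0.85 similarity cutoff are assumptions. Stereochemical prefixes such as "L-" or "D-" are deliberately left in place because they can denote different compounds, so similar pairs are only suggested for manual review.

```python
import difflib
import re
import unicodedata


def normalize_name(name):
    """Unify Unicode forms, collapse whitespace and case-fold; chemistry is untouched."""
    text = unicodedata.normalize("NFKC", str(name))
    return re.sub(r"\s+", " ", text).strip().casefold()


def near_duplicates(names, cutoff=0.85):
    """Return pairs of distinct normalized names that look alike."""
    unique = sorted({normalize_name(n) for n in names})
    pairs = []
    for i, name in enumerate(unique):
        candidates = unique[i + 1:]  # compare each pair only once
        for match in difflib.get_close_matches(name, candidates, n=5, cutoff=cutoff):
            pairs.append((name, match))
    return pairs


names = ["Glycine", "L-Glycine", "glycine ", "Taurine"]
for first, second in near_duplicates(names):
    print(f"Review: '{first}' vs '{second}'")
```

Exact duplicates that differ only in case or spacing collapse during normalization, so only genuinely different spellings are reported.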

Issue: Duplicate resolution picks wrong feature

Possible causes:

RT or CV% data missing for some features
Composite scoring weights don't match biological priorities

Solutions:

Check RT and CV% columns are populated
Adjust scoring weights in resolve_duplicates.py:
- Increase RT weight if RT is very reliable
- Increase CV% weight if reproducibility is critical
Manually override specific compound recommendations

Issue: R Markdown report won't render

Possible causes:

Missing ezRun package (FGCZ templates)
File paths incorrect in the Rmd (qc_results_path, duplicates_path)
Missing required R packages

Solutions:

Install ezRun: devtools::install_github("uzh/ezRun")
Verify file paths in Rmd (qc_results_path, duplicates_path)
Install required packages: tidyverse, readxl, ggplot2, DT, patchwork
Use alternative template without ezRun (remove FGCZ header/CSS)

Resources

This skill includes bundled resources to support the curation workflow:

scripts/

Python scripts for automated QC calculations and duplicate resolution:

parse_metabolomics_table.py - Load feature table, extract sample metadata, auto-detect sample types
calculate_qc_metrics.py - Compute CV%, blank ratios, flag features for review
resolve_duplicates.py - Rank duplicate annotations by quality scores

All scripts can be executed standalone (no Claude context needed), either from the command line or as SLURM batch jobs.

Dependencies:

Python 3.8+
pandas, numpy, openpyxl (Excel support)
Install: pip install pandas numpy openpyxl

references/

Detailed documentation for manual review and curation decisions:

curation_workflow.md - Complete step-by-step workflow, sample types, QC formulas, decision trees
quality_thresholds.md - FGCZ quality criteria, threshold rationale, adjustment guidelines, historical performance

Load these references when:

User asks for detailed workflow explanations
Need to understand QC metric calculations
Adjusting thresholds for specific experiment types
Troubleshooting curation decisions

assets/

R Markdown template for generating interactive QC reports:

fgcz_curation_report_template.Rmd - HTML report template with FGCZ styling, interactive tables, publication-quality plots

Copy this template to working directory and customize for specific datasets. Follows FGCZ reporting standards (ezRun integration, tabset structure, 300 DPI figures).

Best Practices

Always use interactive mode for new datasets
- Verify sample classifications before calculating QC metrics
- Biological group names vary by experiment (not auto-detectable)
Review flagged features in context
- Don't blindly remove all flagged features
- Use biological knowledge to override automated recommendations
- Document manual curation decisions for reproducibility
Validate duplicate resolution
- Check "KEEP (best)" recommendations make sense (RT, CV%, biological context)
- Manually review "ALTERNATIVE (close score)" cases
- Use MS2 spectra to distinguish true isomers from artifacts
Preserve original data
- Keep uncurated feature table and metadata
- Save curation report and parameters used
- Document thresholds if non-standard
Iterate and refine
- First pass: use default thresholds, review results
- Second pass: adjust thresholds based on data quality
- Final pass: manual review of edge cases

FGCZ Infrastructure Integration

This skill follows FGCZ standards and integrates with existing infrastructure:

Storage:

Working directory: /srv/GT/analysis/pXXXXX/Metabolomics_Curation/
Final deliverables: /srv/gstore/projects/pXXXXX/Analyses_Metabolomics/ (use g-req)

Reporting:

HTML reports use ezRun templates (FGCZ header, CSS)
Follows FGCZ R Markdown standards (tabsets, code folding, 300 DPI figures)

Workflow:

Receive untargeted metabolomics data from FGCZ platform or user upload
Run curation workflow (scripts can be submitted as SLURM jobs if needed)
Generate HTML QC report for user review
Deliver curated feature table and report to gstore

Example batch job:

#!/bin/bash
#SBATCH --job-name=metabolomics_curation
#SBATCH --output=curation_%j.log
#SBATCH --error=curation_%j.err
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=8G
#SBATCH --cpus-per-task=4
#SBATCH --partition=employee

module load Dev/R/4.5.0

# Run curation workflow
python scripts/parse_metabolomics_table.py input.xlsx --output metadata.json
python scripts/calculate_qc_metrics.py input.xlsx metadata.json --output qc_results.xlsx
python scripts/resolve_duplicates.py qc_results.xlsx metadata.json --output duplicates_resolved.xlsx

# Generate report
Rscript -e "rmarkdown::render('curation_report.Rmd')"

echo "Curation complete at $(date)"

Last Updated: 2025-01-10
