From tooluniverse
Predicts protein 3D structures from sequence using ESMFold de novo, AlphaFold database retrieval, RCSB experimental structures, ProtVar variant impact, and ProtParam properties.
Slash command
/tooluniverse:tooluniverse-protein-structure-prediction
End-to-end workflow for protein structure prediction starting from a sequence or UniProt accession. Combines ESMFold de novo prediction, AlphaFold database retrieval, experimental structure benchmarking from RCSB, ProtVar variant impact assessment, and ProtParam sequence property calculation.
KEY PRINCIPLES:
qualifier parameter (UniProt accession)
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Apply when users ask:
Not for (use tooluniverse-protein-structure-retrieval instead): retrieval-only tasks where user provides a PDB ID or wants to browse experimental structures without prediction.
| Parameter | Required | Description | Example |
|---|---|---|---|
| sequence | Yes (for ESMFold) | Amino acid sequence (single-letter FASTA) | MVLSPADKTNVK... |
| uniprot_id | Yes (for AlphaFold) | UniProt accession | P04637, P69905 |
| variant | No | Variant notation for structural impact | P04637 R175H, TP53 R175H |
| max_length | No | ESMFold limit: ~800 residues recommended | — |
Phase 0: Input preparation (sequence retrieval if needed)
|
Phase 1: Sequence properties (ProtParam_calculate)
|
Phase 2: De novo prediction (ESMFold_predict_structure)
|
Phase 3: AlphaFold reference (alphafold_get_prediction + alphafold_get_summary)
|
Phase 4: Experimental structure comparison (RCSBAdvSearch_search_structures, RCSBData_get_entry)
|
Phase 5: Variant structural impact (ProtVar_map_variant + ProtVar_get_function) [if variant provided]
|
Phase 6: Quality synthesis and interpretation
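The phase list above can be sketched as a small planning step that decides which phases apply to a given request. This is only an illustration: `plan_phases` and `PHASES` are hypothetical names, not part of ToolUniverse, and the rule that Phase 3 is skipped without a UniProt accession is taken from the fallback table in this document. Tool names are copied from the phase list; no tool is actually called.

```python
from typing import List, Optional, Tuple

# Phase and tool names are copied from the workflow above.
# plan_phases is a hypothetical helper, not part of ToolUniverse.
PHASES: List[Tuple[str, List[str]]] = [
    ("Phase 0: Input preparation", ["UniProt_get_entry_by_accession"]),
    ("Phase 1: Sequence properties", ["ProtParam_calculate"]),
    ("Phase 2: De novo prediction", ["ESMFold_predict_structure"]),
    ("Phase 3: AlphaFold reference",
     ["alphafold_get_prediction", "alphafold_get_summary"]),
    ("Phase 4: Experimental structure comparison",
     ["RCSBAdvSearch_search_structures", "RCSBData_get_entry"]),
    ("Phase 5: Variant structural impact",
     ["ProtVar_map_variant", "ProtVar_get_function"]),
    ("Phase 6: Quality synthesis and interpretation", []),
]


def plan_phases(
    sequence: Optional[str] = None,
    uniprot_id: Optional[str] = None,
    variant: Optional[str] = None,
) -> List[Tuple[str, List[str]]]:
    """Return the phases to run, in order, for the inputs that were supplied."""
    if not sequence and not uniprot_id:
        raise ValueError("need a sequence or a UniProt accession")
    plan = []
    for name, tools in PHASES:
        if name.startswith("Phase 0") and sequence:
            tools = []  # sequence supplied: nothing to retrieve, only a length check
        if name.startswith("Phase 3") and not uniprot_id:
            continue  # AlphaFold lookup needs an accession
        if name.startswith("Phase 5") and not variant:
            continue  # variant phase only runs when a variant is provided
        plan.append((name, tools))
    return plan


for name, tools in plan_phases(uniprot_id="P04637", variant="P04637 R175H"):
    print(name, "->", ", ".join(tools) or "(no tool call)")
```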
Objective: Obtain or verify the protein sequence needed for ESMFold prediction.
If a sequence is provided, use it directly for ESMFold_predict_structure. Check length:
More than 800 residues: ESMFold may fail or produce lower-quality models; recommend using AlphaFold instead
If a UniProt accession is provided, retrieve the sequence from UniProt_get_entry_by_accession:
accession: UniProt accession
Read the sequence from the sequence.value field of the response
Note: If only a name is given (not an accession), first resolve it with UniProt_search or MyGene_query_genes to get the UniProt accession, then fetch the sequence.
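A minimal sketch of the input check described above, assuming the sequence may arrive with a FASTA header or line breaks. `prepare_sequence` and `esmfold_suitable` are hypothetical helper names, and restricting input to the 20 standard one-letter codes is an assumption of this sketch rather than a documented ESMFold requirement. The 800-residue threshold is the recommended limit stated in this workflow.

```python
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: standard residues only
ESMFOLD_MAX_LEN = 800  # recommended limit stated in this workflow


def prepare_sequence(raw: str) -> str:
    """Strip FASTA header lines and whitespace, uppercase, and validate residues."""
    lines = [
        ln for ln in raw.splitlines()
        if ln.strip() and not ln.lstrip().startswith(">")
    ]
    seq = "".join("".join(ln.split()) for ln in lines).upper()
    if not seq:
        raise ValueError("empty sequence")
    bad = sorted(set(seq) - VALID_AA)
    if bad:
        raise ValueError(f"non-standard residue codes: {bad}")
    return seq


def esmfold_suitable(seq: str) -> bool:
    """True when the sequence is within the recommended ESMFold length."""
    return len(seq) <= ESMFOLD_MAX_LEN


seq = prepare_sequence(">example header\nmvlspadk\nTNVK\n")
print(seq, len(seq), esmfold_suitable(seq))
```

The cleaned string has no spaces and no header, which matches the input format the tool reference table asks for.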
Objective: Calculate physicochemical properties before prediction to contextualize results.
ProtParam_calculate:
sequence: amino acid sequence string (single-letter code)
Objective: Predict 3D structure from sequence using Meta's ESM-2 language model.
ESMFold_predict_structure:
sequence: amino acid sequence string
Call ESMFold_predict_structure with the sequence
| pLDDT Range | Interpretation | Reliability |
|---|---|---|
| >90 | Very high confidence | Backbone and side chains usually accurate; often comparable to experimental quality |
| 70-90 | High confidence | Backbone reliable, side chains approximate |
| 50-70 | Low confidence | Potentially disordered or flexible region |
| <50 | Very low confidence | Likely intrinsically disordered; do not interpret |
| pTM Score | Fold Confidence |
|---|---|
| >0.8 | High confidence global fold |
| 0.5-0.8 | Moderate; some domains may be uncertain |
| <0.5 | Low global fold confidence |
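The two tables above can be applied mechanically to a prediction. The sketch below assumes per-residue pLDDT values on a 0-100 scale supplied as a plain list; how they are extracted from a tool response is not shown. All function names are hypothetical. The tables leave the boundary values 70 and 50 (and 0.5 for pTM) ambiguous, so this sketch assigns them to the higher band, which is a choice made here and not something the document states.

```python
from typing import List, Tuple


def plddt_band(score: float) -> str:
    """Map one pLDDT value (0-100 scale assumed) to the table's interpretation."""
    if score > 90:
        return "Very high confidence"
    if score >= 70:  # boundary assigned to the higher band (assumption)
        return "High confidence"
    if score >= 50:
        return "Low confidence"
    return "Very low confidence"


def ptm_band(ptm: float) -> str:
    """Map a pTM score to the fold-confidence label from the table."""
    if ptm > 0.8:
        return "High confidence global fold"
    if ptm >= 0.5:
        return "Moderate; some domains may be uncertain"
    return "Low global fold confidence"


def low_confidence_regions(
    plddt: List[float], cutoff: float = 70.0
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where pLDDT < cutoff."""
    regions: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(plddt, start=1):
        if score < cutoff:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:  # run extends to the last residue
        regions.append((start, len(plddt)))
    return regions


scores = [95, 92, 65, 40, 45, 88, 91, 55]  # made-up values for illustration
print(sum(scores) / len(scores), low_confidence_regions(scores))
```

The region list is a convenient starting point for the confidence map in the final report, since each run can be labelled with its band.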
Objective: Retrieve precomputed AlphaFold2 model for comparison and higher-accuracy reference.
alphafold_get_prediction:
qualifier (or alias uniprot_id / uniprot_accession): UniProt accession (e.g., "P04637")
alphafold_get_summary:
qualifier (or alias uniprot_id / uniprot_accession): UniProt accession
alphafold_get_annotations (optional):
qualifier: UniProt accession
Call alphafold_get_prediction and alphafold_get_summary
Objective: Check whether experimental structures exist in PDB and how predictions compare.
RCSBAdvSearch_search_structures (search by protein/gene name):
query: protein name or gene symbol
limit: number of results (default 10)
RCSBData_get_entry (details for a specific PDB ID):
pdb_id: 4-character PDB identifier
Objective: Assess how a specific amino acid substitution affects the predicted structure.
ProtVar_map_variant:
variant: string notation like "P04637 R175H" or HGVS notation
ProtVar_get_function:
accession: UniProt accession
position: integer residue position
variant_aa: mutant amino acid (single letter)
Call ProtVar_map_variant to resolve the variant and confirm the position
Call ProtVar_get_function with the wild-type position to get domain context
| Tier | Evidence |
|---|---|
| T1 | Clinical/functional data for this exact variant (from ProtVar) |
| T2 | Variant at experimentally characterized active site or binding interface |
| T3 | Computational pathogenicity prediction (PolyPhen, SIFT from ProtVar) |
| T4 | Position in predicted structured region only |
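Before calling the ProtVar tools it can help to split the variant string into its parts and confirm that the stated wild-type residue matches the sequence. The sketch below handles only the simple "<identifier> <AA><pos><AA>" form used in the examples in this document; HGVS notation is not handled. `Variant`, `parse_variant`, and `matches_sequence` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    identifier: str  # UniProt accession or gene symbol, as written by the user
    ref_aa: str      # wild-type residue (single letter)
    position: int    # 1-based residue position
    alt_aa: str      # mutant residue (single letter)


_AA = "ACDEFGHIKLMNPQRSTVWY"
_PATTERN = re.compile(rf"^(\S+)\s+([{_AA}])(\d+)([{_AA}])$")


def parse_variant(text: str) -> Variant:
    """Parse strings such as 'P04637 R175H' into their components."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized variant notation: {text!r}")
    identifier, ref_aa, position, alt_aa = match.groups()
    return Variant(identifier, ref_aa, int(position), alt_aa)


def matches_sequence(seq: str, variant: Variant) -> bool:
    """Check that the wild-type residue is present at the stated position."""
    in_range = 1 <= variant.position <= len(seq)
    return in_range and seq[variant.position - 1] == variant.ref_aa


v = parse_variant("P04637 R175H")
# Argument names below follow the ProtVar_get_function parameters listed above;
# this is only valid when the identifier is a UniProt accession.
args = {"accession": v.identifier, "position": v.position, "variant_aa": v.alt_aa}
print(args)
```

A mismatch between the stated wild-type residue and the sequence usually points to a different isoform or numbering scheme, which is worth resolving before interpreting any structural impact.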
Protein summary — name, length, pI, stability index (from ProtParam)
Structure prediction summary table:
| Method | Mean pLDDT | pTM/Global Score | Coverage | Notes |
|---|---|---|---|---|
| ESMFold | X.X | X.X | 100% (full seq) | — |
| AlphaFold | X.X | — | 100% | version vN |
| Experimental (best) | N/A | N/A | XX% | PDB: XXXX, X-ray, X.X Å |
Confidence map — regions of high vs low confidence; highlight disordered regions
Experimental structure comparison — does PDB have coverage? How does prediction align?
Variant impact (if applicable) — domain context, pathogenicity, structural consequence
Recommendations:
| Tool | Key Parameter | Notes |
|---|---|---|
| ESMFold_predict_structure | sequence | Raw amino acid string, no spaces, no FASTA header |
| alphafold_get_prediction | qualifier or uniprot_id | UniProt accession (e.g., "P04637") |
| alphafold_get_summary | qualifier or uniprot_id | Same UniProt accession |
| ProtParam_calculate | sequence | Same sequence string |
| ProtVar_map_variant | variant | Format: "<UniProt_ID> <AA><pos><AA>" e.g., "P04637 R175H" |
| ProtVar_get_function | position | Integer residue number |
| Situation | Fallback |
|---|---|
| ESMFold fails (sequence too long > 800 aa) | Use AlphaFold model only; note length limitation |
| AlphaFold no entry for UniProt ID | Use ESMFold prediction only |
| RCSB search returns no results | Note no experimental structure; proceed with predictions |
| No UniProt accession available | Use ESMFold from raw sequence; skip AlphaFold |
| ProtVar variant not found | Manually assess position from domain annotation in Phase 4 |
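The predictor-related rows of the fallback table reduce to a small decision function. This is a sketch under stated assumptions: `choose_predictors` is a hypothetical helper, and `alphafold_has_entry` stands in for the outcome of an AlphaFold lookup that is not performed here.

```python
from typing import List, Optional

ESMFOLD_MAX_LEN = 800  # length limit used in the fallback table


def choose_predictors(
    seq_len: Optional[int],
    uniprot_id: Optional[str],
    alphafold_has_entry: bool = True,
) -> List[str]:
    """Pick prediction sources according to the fallback rules."""
    methods: List[str] = []
    # ESMFold needs a sequence that fits the recommended length.
    if seq_len is not None and seq_len <= ESMFOLD_MAX_LEN:
        methods.append("ESMFold")
    # AlphaFold needs an accession that has an entry in the database.
    if uniprot_id and alphafold_has_entry:
        methods.append("AlphaFold")
    return methods


print(choose_predictors(1200, "P04637"))  # too long for ESMFold
print(choose_predictors(150, None))       # raw sequence, no accession
```

An empty result means neither predictor applies, in which case the report should say so explicitly instead of presenting a structure.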
| Database | Coverage | What it provides |
|---|---|---|
| ESMFold | Any protein sequence (up to ~800 aa) | De novo structure prediction from sequence alone |
| AlphaFold DB | UniProt proteins, reviewed and unreviewed (>200M entries) | Precomputed predictions with per-residue pLDDT |
| RCSB PDB | ~220,000 experimental structures | Ground-truth experimental coordinates for comparison |
| ProtVar | All UniProt proteins | Variant impact, domain context, clinical annotations |
| ProtParam | Any sequence | Physicochemical sequence properties |
npx claudepluginhub mims-harvard/tooluniverse --plugin tooluniverse