From tooluniverse
Designs therapeutic proteins using RFdiffusion backbone generation, ProteinMPNN sequence optimization, and structure validation with ESMFold/AlphaFold2. Useful for protein binders, scaffolds, enzyme variants, and miniprotein design.
Slash command
/tooluniverse:tooluniverse-protein-therapeutic-design
Summary
AI-guided de novo protein design using RFdiffusion backbone generation, ProteinMPNN sequence optimization, and structure validation for therapeutic protein development.
KEY PRINCIPLES:
Therapeutic protein design starts with the target interaction. What binding surface do you need to cover? A small pocket = nanobody or peptide. A large flat surface = designed protein. Stability, immunogenicity, and manufacturability constrain the design space.
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Apply when user asks to:
Phase 1: Target Characterization
Get structure (PDB, EMDB cryo-EM, AlphaFold), identify binding epitope
Phase 2: Backbone Generation (RFdiffusion)
Define constraints, generate >= 5 backbones, filter by geometry
Phase 3: Sequence Design (ProteinMPNN)
Design >= 8 sequences per backbone, sample with temperature control
Phase 4: Structure Validation (ESMFold/AlphaFold2)
Predict structure, compare to backbone, assess pLDDT/pTM
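The pLDDT assessment in Phase 4 can be sketched as follows. This is a minimal illustration, assuming the predicted structure is available as PDB-format text with per-residue pLDDT stored in the B-factor column (the usual convention for AlphaFold and ESMFold PDB output). Whether the tools listed here return exactly that format, and on a 0-100 or 0-1 scale, should be confirmed against an actual response. The function name `mean_plddt_from_pdb` is hypothetical.

```python
def mean_plddt_from_pdb(pdb_text: str) -> float:
    """Mean pLDDT over CA atoms, read from the PDB B-factor column.

    Assumes fixed-width PDB ATOM records: atom name in columns 13-16
    and B-factor (here: pLDDT) in columns 61-66.
    """
    scores = []
    for line in pdb_text.splitlines():
        # Keep one value per residue by looking only at alpha carbons.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no CA atoms found in PDB text")
    return sum(scores) / len(scores)
```

Using one value per residue (the CA atom) keeps long side chains from weighting the average.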
Phase 5: Developability Assessment
Aggregation, pI, expression prediction
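Two of the Phase 5 quantities can be estimated directly from sequence. The sketch below computes an approximate isoelectric point by bisection on the Henderson-Hasselbalch net charge, and the GRAVY score (mean Kyte-Doolittle hydropathy) as a crude hydrophobicity indicator. The pKa values are one commonly used set and are an assumption here; other scales shift the result slightly. GRAVY is not an aggregation predictor, and all function names are hypothetical.

```python
# Assumed side-chain and terminal pKa values (one common set; others exist).
PKA_NTERM, PKA_CTERM = 8.6, 3.6
PKA_POS = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}

# Kyte-Doolittle hydropathy scale.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def net_charge(seq: str, ph: float) -> float:
    """Approximate net charge of a linear peptide at a given pH."""
    seq = seq.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_NTERM))
    charge -= 1.0 / (1.0 + 10 ** (PKA_CTERM - ph))
    for aa, pka in PKA_POS.items():
        charge += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    for aa, pka in PKA_NEG.items():
        charge -= seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return charge

def isoelectric_point(seq: str) -> float:
    """pH at which the net charge crosses zero, found by bisection."""
    lo, hi = 0.0, 14.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        # Net charge decreases monotonically with pH.
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy; rejects non-standard residues."""
    seq = seq.upper()
    unknown = set(seq) - set(KD)
    if not seq or unknown:
        raise ValueError(f"empty sequence or unknown residues: {sorted(unknown)}")
    return sum(KD[aa] for aa in seq) / len(seq)
```

These estimates ignore structure, modifications, and local environment, so they are best used for ranking and flagging candidates and are no substitute for experimental measurement.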
Phase 6: Report Synthesis
Ranked candidates, FASTA, experimental recommendations
Output files: [TARGET]_protein_design_report.md first, with section headers, then [TARGET]_designed_sequences.fasta and [TARGET]_top_candidates.csv.
Every design MUST include: Sequence, Length, Target, Method, and Quality Metrics (pLDDT, pTM, MPNN score, binding prediction).
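One way to produce the FASTA and CSV contents with the required per-design fields is sketched below, using only the Python standard library. The dictionary keys (`id`, `plddt`, `mpnn_score`, and so on) and the function names are hypothetical, since the skill does not fix a schema.

```python
import csv
import io

# Hypothetical column names covering the required per-design information.
CSV_FIELDS = ["id", "sequence", "length", "target", "method",
              "plddt", "ptm", "mpnn_score", "binding_prediction"]

def to_fasta(designs, width=60):
    """Render designs as FASTA text, wrapping sequences at `width` residues."""
    lines = []
    for d in designs:
        lines.append(f">{d['id']} target={d['target']} method={d['method']} "
                     f"pLDDT={d['plddt']} pTM={d['ptm']}")
        seq = d["sequence"]
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

def to_csv(designs):
    """Render designs as CSV text; length is derived from the sequence."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=CSV_FIELDS, lineterminator="\n")
    writer.writeheader()
    for d in designs:
        row = dict(d, length=len(d["sequence"]))
        missing = [k for k in CSV_FIELDS if k not in row]
        if missing:
            raise ValueError(f"design {d.get('id')} is missing fields: {missing}")
        writer.writerow(row)
    return buf.getvalue()
```

Failing loudly on a missing field helps enforce the rule that every design carries its full set of quality metrics.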
| Tool | Purpose | Key Parameter |
|---|---|---|
| NvidiaNIM_rfdiffusion | Backbone generation | diffusion_steps (NOT num_steps) |
| NvidiaNIM_proteinmpnn | Sequence design | pdb_string (NOT pdb) |
| ESMFold_predict_structure | Fast validation | sequence (NOT seq) |
| NvidiaNIM_alphafold2 | High-accuracy validation | sequence, algorithm |
| NvidiaNIM_esm2_650m | Sequence embeddings | sequences, format |
| Tool | Wrong | Correct |
|---|---|---|
| NvidiaNIM_rfdiffusion | num_steps=50 | diffusion_steps=50 |
| NvidiaNIM_proteinmpnn | pdb=content | pdb_string=content |
| ESMFold_predict_structure | seq="MVLS..." | sequence="MVLS..." |
| NvidiaNIM_alphafold2 | seq="MVLS..." | sequence="MVLS..." |
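The parameter-name corrections above can be turned into a guard that runs before a tool call is made. The following is only a sketch: `check_params` is a hypothetical helper that inspects a plain dictionary of arguments, and it makes no assumption about how the tools themselves are invoked.

```python
# Wrong -> correct parameter names, taken from the table above.
KNOWN_WRONG_NAMES = {
    "NvidiaNIM_rfdiffusion": {"num_steps": "diffusion_steps"},
    "NvidiaNIM_proteinmpnn": {"pdb": "pdb_string"},
    "ESMFold_predict_structure": {"seq": "sequence"},
    "NvidiaNIM_alphafold2": {"seq": "sequence"},
}

def check_params(tool: str, params: dict) -> dict:
    """Return a copy of params, or raise if a known-wrong name is used."""
    wrong = KNOWN_WRONG_NAMES.get(tool, {})
    bad = {name: wrong[name] for name in params if name in wrong}
    if bad:
        hints = ", ".join(f"{w} -> {c}" for w, c in bad.items())
        raise ValueError(f"{tool}: rename parameters ({hints})")
    return dict(params)
```

Raising an error instead of silently renaming keeps the mistake visible at the call site.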
NVIDIA_API_KEY environment variable required

| Tool | Purpose | Key Parameters |
|---|---|---|
| PDBe_get_uniprot_mappings | Find PDB structures | uniprot_id |
| RCSBData_get_entry | Download PDB file | pdb_id |
| alphafold_get_prediction | Get AlphaFold DB structure | accession |
| emdb_search | Search cryo-EM maps | query |
| emdb_get_entry | Get entry details | entry_id |
| UniProt_get_entry_by_accession | Get target sequence | accession |
| InterPro_get_protein_domains | Get domains | accession |
| Tier | Criteria |
|---|---|
| T1 (best) | pLDDT >85, pTM >0.8, low aggregation, neutral pI |
| T2 | pLDDT >75, pTM >0.7, acceptable developability |
| T3 | pLDDT >70, pTM >0.65, developability concerns |
| T4 | Failed validation or major developability issues |
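The tier thresholds above can be applied mechanically once a developability judgment has been made. In this sketch the pLDDT and pTM cutoffs come from the table, while the four developability labels (`good`, `acceptable`, `concerns`, `major_issues`) are an assumed encoding of the qualitative criteria, and `assign_tier` is a hypothetical name.

```python
DEVELOPABILITY_LABELS = ("good", "acceptable", "concerns", "major_issues")

def assign_tier(plddt: float, ptm: float, developability: str) -> str:
    """Map validation metrics and a developability label to T1-T4."""
    if developability not in DEVELOPABILITY_LABELS:
        raise ValueError(f"unknown developability label: {developability!r}")
    # Major developability issues override any structural score.
    if developability == "major_issues":
        return "T4"
    # "good" stands in for low aggregation and neutral pI (assumed encoding).
    if plddt > 85 and ptm > 0.8 and developability == "good":
        return "T1"
    if plddt > 75 and ptm > 0.7 and developability in ("good", "acceptable"):
        return "T2"
    if plddt > 70 and ptm > 0.65:
        return "T3"
    # Anything below the T3 thresholds counts as failed validation.
    return "T4"
```

Checking tiers from best to worst means a design receives the highest tier whose criteria it fully meets.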
npx claudepluginhub mims-harvard/tooluniverse --plugin tooluniverse