Skill

foundry

Foundry protein design toolkit reference. Auto-invoked when the user discusses protein design, structure prediction, inverse folding, RFdiffusion3, RosettaFold3, ProteinMPNN, LigandMPNN, SolubleMPNN, Enhanced MPNN, or running Foundry tools on GPUs.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/foundry:foundry

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Bash, Read, Write, Glob, Grep

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Foundry provides core protein design and prediction tools. All tools require GPU compute and are designed to run on Vast.ai instances using a pre-built Docker image that includes all tools, checkpoints, and dependencies.

SKILL.md

361 lines · ~3.4k tokens

Stats

Stars: 1

Maintenance: Good

Last Commit: Feb 7, 2026

Foundry Protein Design Toolkit Reference

Tools Overview

Tool	CLI Command	Purpose	GPU VRAM
RFdiffusion3 (RFD3)	`rfd3 design`	All-atom generative protein structure design	24-48GB+
RosettaFold3 (RF3)	`rf3 fold`	Biomolecular structure prediction	24-48GB+
ProteinMPNN/LigandMPNN	`mpnn`	Fixed-backbone inverse folding sequence design	16-24GB
Enhanced MPNN	`mpnn`	Designability-optimized inverse folding (fine-tuned LigandMPNN)	16-24GB

Docker Image

A pre-built Docker image based on vastai/base-image:cuda-12.4.1-auto is available with everything pre-installed:

Python 3.12, PyTorch with CUDA 12.4
Foundry with all model extras (rc-foundry[all])
All checkpoints pre-downloaded to /root/.foundry/checkpoints
Enhanced MPNN weights included

When launching Vast.ai instances, use this image with:

vastai create instance <OFFER_ID> --image <FOUNDRY_DOCKER_IMAGE> --disk 64 --ssh

No --onstart-cmd is needed — the image is ready to use immediately. The working directory is /workspace.

Available Checkpoints

All checkpoints are pre-installed in the Docker image at /root/.foundry/checkpoints:

Name	File	Tool
`rfd3`	`rfd3_latest.ckpt`	RFdiffusion3
`rf3`	`rf3_foundry_01_24_latest_remapped.ckpt`	RosettaFold3 (latest, recommended)
`rf3_preprint_921`	`rf3_foundry_09_21_preprint_remapped.ckpt`	RF3 (benchmark, 09/21 cutoff)
`rf3_preprint_124`	`rf3_foundry_01_24_preprint_remapped.ckpt`	RF3 (preprint)
`proteinmpnn`	`proteinmpnn_v_48_020.pt`	ProteinMPNN
`ligandmpnn`	`ligandmpnn_v_32_010_25.pt`	LigandMPNN
`solublempnn`	`solublempnn_v_48_020.pt`	SolubleMPNN
`enhanced_mpnn`	`enhanced_mpnn_step_80000.pt`	Enhanced MPNN

Checkpoint directory environment variable: FOUNDRY_CHECKPOINT_DIRS=/root/.foundry/checkpoints
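To confirm that an instance has the expected weights before starting a job, a small check along these lines may help. It is a sketch, not part of Foundry: the helper name `missing_checkpoints` is hypothetical, and it assumes that `FOUNDRY_CHECKPOINT_DIRS` may hold several directories separated by the OS path separator (the plural name suggests this, but the reference does not confirm it).

```python
import os
from pathlib import Path

# File names copied from the checkpoint table above.
EXPECTED_CHECKPOINTS = {
    "rfd3": "rfd3_latest.ckpt",
    "rf3": "rf3_foundry_01_24_latest_remapped.ckpt",
    "rf3_preprint_921": "rf3_foundry_09_21_preprint_remapped.ckpt",
    "rf3_preprint_124": "rf3_foundry_01_24_preprint_remapped.ckpt",
    "proteinmpnn": "proteinmpnn_v_48_020.pt",
    "ligandmpnn": "ligandmpnn_v_32_010_25.pt",
    "solublempnn": "solublempnn_v_48_020.pt",
    "enhanced_mpnn": "enhanced_mpnn_step_80000.pt",
}

DEFAULT_DIR = "/root/.foundry/checkpoints"


def missing_checkpoints(env=None):
    """Return registry names whose files are not found in any checkpoint dir."""
    env = os.environ if env is None else env
    raw = env.get("FOUNDRY_CHECKPOINT_DIRS", DEFAULT_DIR)
    # Assumption: multiple directories are separated by os.pathsep.
    dirs = [Path(d) for d in raw.split(os.pathsep) if d]
    return sorted(
        name
        for name, filename in EXPECTED_CHECKPOINTS.items()
        if not any((d / filename).is_file() for d in dirs)
    )


if __name__ == "__main__":
    missing = missing_checkpoints()
    print("all checkpoints present" if not missing else f"missing: {missing}")
```

Running the script on a freshly launched instance should report that all checkpoints are present if the image matches the table.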

Tool 1: RFdiffusion3 (RFD3)

All-atom generative model for designing protein structures under complex constraints: protein binders, enzyme active sites, nucleic acid binders, small molecule binders, symmetric assemblies.

CLI

rfd3 design out_dir=<OUTPUT_DIR> inputs=<INPUT_JSON> [OPTIONS]

Key Parameters

Parameter	Default	Description
`out_dir`	required	Output directory
`inputs`	required	Input JSON/YAML file
`ckpt_path`	`rfd3`	Checkpoint (auto-resolved from registry)
`skip_existing`	`True`	Skip existing outputs
`diffusion_batch_size`	`8`	Designs per batch
`n_batches`	`1`	Number of batches
`dump_trajectories`	`False`	Save denoising trajectory (large files)
`prevalidate_inputs`	`False`	Validate inputs before loading model
`low_memory_mode`	`False`	Memory-efficient tokenization

Sampler Parameters

Parameter	Default	Description
`inference_sampler.num_timesteps`	`200`	Diffusion steps
`inference_sampler.step_scale`	`1.5`	Diversity vs designability tradeoff
`inference_sampler.noise_scale`	`1.003`	Noise scaling
`inference_sampler.use_classifier_free_guidance`	`False`	Enable CFG
`inference_sampler.cfg_scale`	`1.5`	CFG scale (if enabled)
`inference_sampler.kind`	`default`	`default` or `symmetry`
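Because every option is passed as a key=value override, a command line can be assembled programmatically. The sketch below is illustrative only: `rfd3_command` is a hypothetical helper, the file names are placeholders, and the double-underscore convention for nested sampler options is an assumption of this example, not something Foundry defines.

```python
import shlex


def rfd3_command(out_dir, inputs, **overrides):
    """Build an `rfd3 design` argv list from key=value overrides."""
    args = ["rfd3", "design", f"out_dir={out_dir}", f"inputs={inputs}"]
    for key, value in overrides.items():
        # Python keyword names cannot contain dots, so nested options are
        # written with a double underscore: inference_sampler__step_scale.
        args.append(f"{key.replace('__', '.')}={value}")
    return args


cmd = rfd3_command(
    "designs/",
    "binder.json",
    n_batches=4,
    diffusion_batch_size=8,
    inference_sampler__num_timesteps=200,
)
print(shlex.join(cmd))
```

If each batch yields `diffusion_batch_size` designs, as the parameter descriptions suggest, this run would produce 4 × 8 = 32 designs per input entry. The returned list can be passed to `subprocess.run(cmd, check=True)` on a GPU instance.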

Input JSON Format

The input is a JSON file mapping example names to design specifications:

{
    "design_name": {
        "input": "./path/to/target.pdb",
        "contig": "40-120,/0,A1-100",
        "length": "140-160",
        "ligand": "NAI,ACT",
        "unindex": "A108,A139",
        "select_fixed_atoms": {
            "A108": "ND2,CG",
            "A139": "OG,CB,CA"
        },
        "select_hotspots": {
            "E64": "CD2,CZ"
        },
        "is_non_loopy": true,
        "infer_ori_strategy": "hotspots",
        "dialect": 2
    }
}

Key input fields:

input: Path to target structure (PDB/CIF)
contig: Contig specification — defines which residues to keep/design and chain breaks
length: Length range for designed protein (e.g., "140-160")
ligand: Comma-separated ligand residue names to include
unindex: Residues to unindex (their coordinates are kept, but their position in the designed sequence is left for the model to choose)
select_fixed_atoms: Per-residue atom selections to fix
select_hotspots: Target hotspot residues for interface design
is_non_loopy: Disable loop-only design mode
infer_ori_strategy: How to determine orientation (e.g., "hotspots")
dialect: Input dialect version (use 2 for latest)
partial_t: Partial diffusion timestep (for refinement)
ori_token: Orientation token indices
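Input files like the one above can also be generated from a script, which makes it easier to catch misspelled field names before a GPU job starts. The following is a minimal sketch under stated assumptions: `write_rfd3_inputs` and `KNOWN_FIELDS` are hypothetical names, the field set is simply the list above rather than an official schema, and the design name and target path are placeholders.

```python
import json
import tempfile
from pathlib import Path

# Field names taken from the list above; this is not an official schema.
KNOWN_FIELDS = {
    "input", "contig", "length", "ligand", "unindex",
    "select_fixed_atoms", "select_hotspots", "is_non_loopy",
    "infer_ori_strategy", "dialect", "partial_t", "ori_token",
}


def write_rfd3_inputs(path, designs):
    """Write {design_name: spec} as JSON, rejecting undocumented fields."""
    for name, spec in designs.items():
        unknown = set(spec) - KNOWN_FIELDS
        if unknown:
            raise ValueError(f"{name}: unknown fields {sorted(unknown)}")
    path = Path(path)
    path.write_text(json.dumps(designs, indent=4))
    return path


designs = {
    "binder_example": {  # hypothetical design name
        "input": "./path/to/target.pdb",  # placeholder path
        "contig": "40-120,/0,A1-100",
        "length": "140-160",
        "dialect": 2,
    }
}
out_path = write_rfd3_inputs(Path(tempfile.mkdtemp()) / "binder.json", designs)
print(out_path.read_text())
```

The resulting file would then be passed as `inputs=<path>` to `rfd3 design`.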

Output

{name}.cif — Designed all-atom structures
{name}.json — Full design metadata
Trajectory files (if dump_trajectories=True)

Tool 2: RosettaFold3 (RF3)

All-atom biomolecular structure prediction for proteins, nucleic acids, ligands, and complexes.

CLI

rf3 fold inputs=<INPUT_FILE_OR_DIR> [OPTIONS]

Key Parameters

Parameter	Default	Description
`inputs`	required	JSON, CIF, PDB file, list, or directory
`out_dir`	`./`	Output directory
`ckpt_path`	auto	Checkpoint path
`n_recycles`	`10`	Number of recycling iterations
`diffusion_batch_size`	`5`	Number of output structures
`num_steps`	`200`	Diffusion sampling steps (50 is faster, similar quality)
`early_stopping_plddt_threshold`	`0.5`	Skip low-confidence predictions
`seed`	`null`	Random seed
`dump_trajectories`	`False`	Save denoising trajectories
`skip_existing`	`False`	Skip existing predictions
`one_model_per_file`	`False`	Separate files per model
`annotate_b_factor_with_plddt`	`False`	pLDDT as B-factors
`template_noise_scale`	`1e-5`	Template noise

Structural Control

template_selection — AtomSelection syntax for template regions (e.g., "[A, B/*/1-42]")
ground_truth_conformer_selection — Fix ligand conformations (e.g., "[C, D]")
cyclic_chains — List of chain IDs to cyclize

Input JSON Format

[
    {
        "name": "example_prediction",
        "components": [
            {
                "seq": "MTSENPLLALREK...",
                "chain_id": "A",
                "msa_path": "path/to/protein.a3m"
            },
            {
                "ccd_code": "MG"
            },
            {
                "smiles": "[nH]1cc[nH+]c1"
            },
            {
                "path": "path/to/ligand.sdf"
            }
        ],
        "template_selection": ["A"],
        "ground_truth_conformer_selection": ["C"]
    }
]

Component types:

seq + optional msa_path — Protein/nucleic acid sequence (supports non-canonical residues, written as (PTM))
ccd_code — CCD compound code (e.g., MG, NAG)
smiles — Small molecule SMILES string
path — Structure file (CIF, PDB, SDF)
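The component types above can be assembled in code when many predictions share the same layout. This is a sketch, not Foundry's own API: `rf3_example` is a hypothetical helper, the rule that each component sets exactly one of the four source keys is an assumption drawn from the example JSON, and the sequence shown is an arbitrary placeholder peptide.

```python
import json

SOURCE_KEYS = {"seq", "ccd_code", "smiles", "path"}


def rf3_example(name, components, **selections):
    """Assemble one RF3 input entry from a list of component dicts."""
    for comp in components:
        present = SOURCE_KEYS & set(comp)
        # Assumption: a component is defined by exactly one source key.
        if len(present) != 1:
            raise ValueError(
                f"component must set exactly one of {sorted(SOURCE_KEYS)}: {comp}"
            )
    return {"name": name, "components": components, **selections}


entry = rf3_example(
    "example_prediction",
    [
        {"seq": "MKTAYIAKQR", "chain_id": "A"},  # placeholder sequence
        {"ccd_code": "MG"},
    ],
    template_selection=["A"],
)
print(json.dumps([entry], indent=4))  # RF3 input files hold a list of entries
```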

AtomSelection syntax: CHAIN/RES_NAME/RES_ID/ATOM_NAME

A — all atoms in chain A
A/*/5-10 — residues 5-10 in chain A
B/*/1-42, B/*/49-63 — multiple regions (CDR framework)
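To make the selection syntax concrete, the sketch below parses selection strings into their four fields. It is an illustrative reading of the examples above, not Foundry's parser: the names `Selection`, `parse_selection`, and `parse_selection_list` are hypothetical, omitted trailing fields are assumed to act as wildcards, residue ranges are assumed inclusive, and negative residue numbers are not handled.

```python
from typing import NamedTuple, Optional


class Selection(NamedTuple):
    chain: str
    res_name: Optional[str]   # None means "any residue name"
    res_ids: Optional[range]  # None means "any residue number"
    atom_name: Optional[str]  # None means "any atom"


def parse_selection(text):
    """Parse CHAIN/RES_NAME/RES_ID/ATOM_NAME; missing or '*' fields match all."""
    parts = [p.strip() for p in text.split("/")]
    if not 1 <= len(parts) <= 4 or not parts[0]:
        raise ValueError(f"bad selection: {text!r}")
    parts += ["*"] * (4 - len(parts))
    chain, res_name, res_id, atom = parts

    def wild(value):
        return None if value == "*" else value

    res_ids = None
    if res_id != "*":
        start, _, end = res_id.partition("-")
        # Assumption: "5-10" includes both endpoints.
        res_ids = range(int(start), int(end or start) + 1)
    return Selection(chain, wild(res_name), res_ids, wild(atom))


def parse_selection_list(text):
    """Split a bracketed, comma-separated list such as "[A, B/*/1-42]"."""
    inner = text.strip().removeprefix("[").removesuffix("]")
    return [parse_selection(item) for item in inner.split(",") if item.strip()]


for sel in parse_selection_list("[A, B/*/1-42]"):
    print(sel)
```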

Output

{name}_metrics.csv — Overall confidence metrics (pTM, pLDDT, ipTM)
{name}.score — Granular per-atom metrics
{name}_model_0.cif.gz ... {name}_model_N.cif.gz — Predicted structures

Tool 3: ProteinMPNN / LigandMPNN / SolubleMPNN / Enhanced MPNN

Lightweight inverse-folding models for fixed-backbone protein sequence design.

CLI

mpnn --model_type <MODEL_TYPE> --structure_path <STRUCTURE> [OPTIONS]

Or from JSON config:

mpnn --config_json config.json

Model Variants

Variant	model_type	Checkpoint	Use Case
ProteinMPNN	`protein_mpnn`	`proteinmpnn_v_48_020.pt`	Standard protein sequence design
LigandMPNN	`ligand_mpnn`	`ligandmpnn_v_32_010_25.pt`	Design around small molecules, DNA, ions
SolubleMPNN	`protein_mpnn`	`solublempnn_v_48_020.pt`	Solubility-optimized design
Enhanced MPNN	`ligand_mpnn`	`enhanced_mpnn_step_80000.pt`	Designability-optimized (highest success rate)

Enhanced MPNN

Enhanced MPNN is a fine-tuned LigandMPNN trained with ResiDPO (Residue-level Designability Preference Optimization). Instead of optimizing for native sequence recovery like standard MPNN, it directly optimizes for designability — whether designed sequences fold into the target structure with high confidence (as measured by AlphaFold2 pLDDT scores).

Performance improvements over standard LigandMPNN:

Enzyme design: ~2.7x higher success rate (17.6% vs 6.6%)
Binder design: ~2.3x higher success rate (16.1% vs 7.1%)
Makes previously "undesignable" backbones designable — doubles the fraction of viable backbone scaffolds

How it works:

ResiDPO decouples preference learning from KL regularization at the residue level
Residue-level Preference Learning (RPL) improves positions where pLDDT is low
Residue-level Constraint Learning (RCL) preserves knowledge at positions already working well
Trained on 19k PDB structures with AF2 pLDDT as the reward signal

When to use Enhanced MPNN (recommended for most de novo design):

Binder or enzyme design pipelines (RFD3 → MPNN → RF3 validation)
Any scenario where standard MPNN yields low designability/validation rates
Maximizing the fraction of designs that pass structure validation

When to prefer standard MPNN/LigandMPNN:

When sequence recovery (similarity to native) matters more than designability
Benchmarking against published results using original models

Usage — same CLI as LigandMPNN, just swap the checkpoint:

mpnn --model_type ligand_mpnn \
  --checkpoint_path /root/.foundry/checkpoints/enhanced_mpnn_step_80000.pt \
  --structure_path backbone.cif \
  --temperature 0.1 \
  --number_of_batches 8

Key Parameters

Global:

--model_type — protein_mpnn or ligand_mpnn
--checkpoint_path — Path to model weights
--is_legacy_weights — Set True for original repository weights (not needed for Enhanced MPNN)
--out_directory — Output directory
--write_fasta — Write FASTA output (default: True)
--write_structures — Write designed structures (default: True)

Per-input:

--structure_path — Input structure (CIF/PDB)
--batch_size — Sequences per batch (default: 1)
--number_of_batches — Number of batches (default: 1)
--temperature — Sampling temperature, controls diversity (default: 0.1)
--seed — Random seed
--designed_chains — Chains to redesign
--fixed_chains — Chains to keep fixed
--designed_residues — Specific residues to design
--fixed_residues — Specific residues to fix

Advanced:

--omit — Amino acids to exclude (default: ["UNK"])
--bias — Per-residue logit bias
--structure_noise — Noise level (default: 0.0)
--symmetry_residues — Residues for symmetric design
--homo_oligomer_chains — Homo-oligomer chains

JSON Config Format

{
    "checkpoint_path": "enhanced_mpnn_step_80000.pt",
    "model_type": "ligand_mpnn",
    "out_directory": "./outputs/",
    "inputs": [
        {
            "structure_path": "complex.pdb",
            "name": "example",
            "seed": 42,
            "batch_size": 1,
            "number_of_batches": 5,
            "temperature": 0.1,
            "fixed_chains": ["A"],
            "designed_chains": ["B"]
        }
    ]
}
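When many backbones need sequences, the config can be generated with one `inputs` entry per structure. The sketch below uses only the keys shown in the example config; `mpnn_config` is a hypothetical helper, the backbone file names are placeholders, and the default values mirror the example rather than any documented recommendation.

```python
import json
from pathlib import Path


def mpnn_config(backbones, checkpoint_path, out_directory="./outputs/",
                model_type="ligand_mpnn", batch_size=1, number_of_batches=5,
                temperature=0.1, seed=42):
    """Build a config dict with one `inputs` entry per backbone file."""
    return {
        "checkpoint_path": checkpoint_path,
        "model_type": model_type,
        "out_directory": out_directory,
        "inputs": [
            {
                "structure_path": str(path),
                "name": Path(path).stem,  # file name without extension
                "seed": seed,
                "batch_size": batch_size,
                "number_of_batches": number_of_batches,
                "temperature": temperature,
            }
            for path in backbones
        ],
    }


config = mpnn_config(
    ["designs/binder_0.cif", "designs/binder_1.cif"],  # placeholder files
    checkpoint_path="/root/.foundry/checkpoints/enhanced_mpnn_step_80000.pt",
)
print(json.dumps(config, indent=4))
```

Saved as `config.json`, the output would be used with `mpnn --config_json config.json`. Assuming each batch yields `batch_size` sequences, every backbone here would receive 1 × 5 = 5 sequences.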

Output

{name}_sequences_*.fasta — Designed sequences
{name}_*.cif — Designed structures (if write_structures=True)
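The designed sequences usually need to be read back before the validation step. The sketch below collects them with a plain FASTA reader. It assumes only the standard FASTA layout (a `>` header line followed by sequence lines) and the file name pattern listed above; the function names are hypothetical, and the content of the header lines is not documented here, so headers are returned unparsed.

```python
from pathlib import Path


def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records, header, chunks = [], None, []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def collect_sequences(out_directory, name):
    """Gather records from every {name}_sequences_*.fasta in out_directory."""
    records = []
    pattern = f"{name}_sequences_*.fasta"
    for fasta in sorted(Path(out_directory).glob(pattern)):
        records.extend(read_fasta(fasta))
    return records
```

For the example config above, `collect_sequences("./outputs/", "example")` would return every designed sequence for that input.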

Common Design Workflows

Protein Binder Design Pipeline

1. RFD3: Generate binder backbones targeting a protein → outputs CIF structures
2. Enhanced MPNN: Design sequences for the generated backbones (~2.3x higher success rate than standard LigandMPNN) → outputs FASTA sequences
3. RF3: Validate that the designed sequences fold correctly → outputs confidence metrics
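The three stages above can be laid out as a dry run that only prints the commands. This is a sketch under stated assumptions: it uses only flags listed in this reference, `binder_pipeline_commands` and the `/workspace/run` directory are hypothetical, all file names are placeholders, and the step that turns MPNN FASTA output into an RF3 input JSON is left out.

```python
import shlex

ENHANCED_MPNN = "/root/.foundry/checkpoints/enhanced_mpnn_step_80000.pt"


def binder_pipeline_commands(rfd3_inputs, backbone, rf3_inputs,
                             work_dir="/workspace/run"):
    """Return the three stage commands as argv lists; nothing is executed."""
    return [
        # Stage 1: generate binder backbones.
        ["rfd3", "design", f"out_dir={work_dir}/rfd3", f"inputs={rfd3_inputs}"],
        # Stage 2: design sequences for one generated backbone.
        ["mpnn", "--model_type", "ligand_mpnn",
         "--checkpoint_path", ENHANCED_MPNN,
         "--structure_path", backbone,
         "--out_directory", f"{work_dir}/mpnn"],
        # Stage 3: refold the designed sequences for validation.
        ["rf3", "fold", f"inputs={rf3_inputs}", f"out_dir={work_dir}/rf3"],
    ]


commands = binder_pipeline_commands(
    "binder.json",                         # placeholder RFD3 input file
    "/workspace/run/rfd3/binder_0.cif",    # placeholder backbone name
    "/workspace/run/rf3_inputs.json",      # placeholder RF3 input file
)
for cmd in commands:
    print(shlex.join(cmd))
```

On a GPU instance each list could be passed to `subprocess.run(cmd, check=True)` in order, with stage 2 repeated for every backbone that stage 1 produces.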

Enzyme Design Pipeline

1. RFD3: Design enzyme scaffolds around a ligand/substrate
2. Enhanced MPNN: Design ligand-aware sequences (~2.7x higher success rate than standard LigandMPNN) → outputs FASTA sequences
3. RF3: Validate that the predicted structure matches the design

Sequence Optimization

1. Start with an existing structure (PDB/CIF)
2. ProteinMPNN/SolubleMPNN: Redesign sequences for stability/solubility
3. RF3: Predict structure of redesigned sequences

Vast.ai GPU Recommendations

Tool	Min VRAM	Recommended GPU	Notes
RFD3	24 GB	A100 40GB, RTX 4090	Large designs need 48GB+
RF3	24 GB	A100 40GB, RTX 4090	Multi-chain complexes need more
MPNN (all variants)	8 GB	RTX 4090, RTX 3090	Very lightweight

Use the pre-built Foundry Docker image — no setup commands needed. Instance is ready to run immediately after launch.
