From claudemol-skills
Use when validating protein designs with AlphaFold2/AlphaFold3/ESMFold predictions, coloring by pLDDT or pAE, computing self-consistency RMSD, or screening design candidates through PyMOL.
Slash command: /claudemol-skills:alphafold-validation
Workflows for validating designed proteins using structure prediction confidence metrics in PyMOL. Composes with @proteinmpnn-viz (upstream sequence design) and @design-comparison (batch screening).
Send all cmd.* code via: ~/.pymol-agent-bridge/bin/pymol-agent-bridge exec "..." (or heredoc for multi-line). See @pymol-fundamentals for details.
AlphaFold and ESMFold store per-residue confidence (pLDDT) in the B-factor column of output PDBs. Higher = more confident.
cmd.load("af2_prediction.pdb", "af2")
cmd.show("cartoon", "af2")
# Continuous spectrum: blue=high confidence, red=low
cmd.spectrum("b", "red_yellow_green_cyan_blue", "af2", minimum=50, maximum=90)
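Because pLDDT sits in the B-factor field, it can also be read without PyMOL, which is handy for quick scripting. The following is a minimal sketch, assuming standard fixed-width PDB ATOM records (B-factor in columns 61-66). The helper names `make_atom_line` and `ca_plddt_from_pdb` are hypothetical, and the coordinates and scores are made up for illustration.

```python
def make_atom_line(serial, atom_name, resn, chain, resi, xyz, bfactor):
    """Build one fixed-width PDB ATOM record (illustrative test data only)."""
    x, y, z = xyz
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, atom_name, resn, chain, resi, x, y, z, 1.00, bfactor)


def ca_plddt_from_pdb(pdb_text):
    """Return the B-factor (pLDDT) of every CA atom, in file order."""
    values = []
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name, columns 61-66 the B-factor
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    return values


# Made-up records standing in for a real prediction file
pdb_text = "\n".join([
    make_atom_line(1, " N", "MET", "A", 1, (-8.901, 4.127, -0.555), 92.15),
    make_atom_line(2, " CA", "MET", "A", 1, (-8.608, 3.135, -1.618), 92.15),
    make_atom_line(3, " CA", "LYS", "A", 2, (-5.100, 2.200, -2.300), 67.40),
])
plddts = ca_plddt_from_pdb(pdb_text)
print("Per-residue pLDDT:", plddts)
print("Mean pLDDT: %.1f" % (sum(plddts) / len(plddts)))
```

For a real file, the text would come from reading the prediction PDB from disk instead of the generated records.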
The official AF color scheme uses discrete confidence bands:
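# AlphaFold Database band colors, defined explicitly from their hex values
cmd.set_color("af_very_high", [0, 83, 214])     # 0053D6
cmd.set_color("af_confident", [101, 203, 243])  # 65CBF3
cmd.set_color("af_low", [255, 219, 19])         # FFDB13
cmd.set_color("af_very_low", [255, 125, 69])    # FF7D45
cmd.color("af_very_high", "af2 and b > 90") # Very high confidence
cmd.color("af_confident", "af2 and b > 70 and b <= 90") # High confidence
cmd.color("af_low", "af2 and b > 50 and b <= 70") # Low confidence
cmd.color("af_very_low", "af2 and b <= 50") # Very low confidence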
| pLDDT Range | Color | Interpretation |
|---|---|---|
| > 90 | Blue | Very high confidence — well-folded |
| 70-90 | Cornflower blue | Confident — expect correct fold |
| 50-70 | Yellow | Low confidence — may be disordered or flexible |
| < 50 | Orange | Very low — likely disordered, unstructured |
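The bands in the table can also be applied outside PyMOL, for example when summarizing per-residue scores in a report. This is a minimal sketch, assuming pLDDT values on the 0-100 scale. The function names `plddt_band` and `band_fractions` are hypothetical and not part of PyMOL or AlphaFold.

```python
from collections import Counter

BANDS = ("very high", "confident", "low", "very low")


def plddt_band(plddt):
    """Map a pLDDT value (0-100 scale) to the confidence band from the table."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def band_fractions(plddts):
    """Fraction of residues falling in each band (all zeros for empty input)."""
    total = len(plddts)
    if total == 0:
        return {band: 0.0 for band in BANDS}
    counts = Counter(plddt_band(p) for p in plddts)
    return {band: counts[band] / total for band in BANDS}


# Made-up per-residue values for illustration
example = [95.2, 88.0, 71.5, 64.3, 42.0]
print(band_fractions(example))
```

A large "low" or "very low" fraction concentrated in one region usually deserves a closer look in the structure viewer before a design is discarded.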
The gold-standard design validation: does the designed sequence fold back into the intended structure?
cmd.load("rfdiffusion_output.pdb", "design")
cmd.load("af2_prediction.pdb", "af2_pred")
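# Align on CA atoms; cycles=0 turns off outlier rejection so the RMSD covers every aligned residue
# If the design file is a poly-glycine backbone without the designed sequence, cmd.super (sequence-independent) is the safer choice
rms = cmd.align("af2_pred and name CA", "design and name CA", cycles=0)
print("Self-consistency CA-RMSD: %.2f A over %d atoms" % (rms[0], rms[1]))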
# Visualize both
cmd.color("cyan", "design")
cmd.color("green", "af2_pred")
cmd.show("cartoon")
# After alignment, color the AF2 prediction by deviation from the design
cmd.align("af2_pred and name CA", "design and name CA", cycles=0)
# Quick approximation of per-residue agreement: residues whose CA lies within 1.5 A of any design CA
cmd.select("good_match", "byres ((af2_pred and name CA) within 1.5 of (design and name CA))")
cmd.select("poor_match", "af2_pred and not good_match")
cmd.color("green", "good_match")
cmd.color("red", "poor_match")
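The distance-based selection is an approximation, since it does not pair each residue with its own counterpart. An exact per-residue deviation can be sketched in plain Python once CA coordinates are available. This is a minimal sketch, assuming both structures are already superposed and share residue numbering. The function names and the coordinates are hypothetical.

```python
import math


def per_residue_deviation(design_ca, pred_ca):
    """Distance in angstroms between matched CA atoms.

    Both inputs map residue number -> (x, y, z). Only residues present
    in both structures are compared.
    """
    shared = sorted(set(design_ca) & set(pred_ca))
    return {resi: math.dist(design_ca[resi], pred_ca[resi]) for resi in shared}


def split_by_cutoff(deviations, cutoff=1.5):
    """Split residue numbers into (good, poor) lists around a distance cutoff."""
    good = [r for r, d in deviations.items() if d <= cutoff]
    poor = [r for r, d in deviations.items() if d > cutoff]
    return good, poor


# Made-up coordinates standing in for superposed structures
design_ca = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
pred_ca = {1: (0.0, 0.0, 0.0), 2: (3.8, 1.0, 0.0), 3: (7.6, 0.0, 3.0)}

deviations = per_residue_deviation(design_ca, pred_ca)
good, poor = split_by_cutoff(deviations)
print("good:", good, "poor:", poor)
```

Inside PyMOL, the coordinate dictionaries could be filled with cmd.iterate_state on the CA atoms of each object after alignment, and the resulting residue lists turned into resi selections for coloring.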
| Metric | Pass | Marginal | Fail |
|---|---|---|---|
| CA-RMSD (full) | < 1.5 A | 1.5-2.5 A | > 2.5 A |
| CA-RMSD (interface) | < 2.0 A | 2.0-3.0 A | > 3.0 A |
| Mean pLDDT | > 80 | 60-80 | < 60 |
| pLDDT at interface | > 70 | 50-70 | < 50 |
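The thresholds in the table translate directly into a small grading helper. This is a minimal sketch covering the full-chain CA-RMSD and mean pLDDT rows. Taking the worse of the two grades as the overall verdict is an assumption made here for illustration, and all function names are hypothetical.

```python
ORDER = ["pass", "marginal", "fail"]  # from best to worst


def grade_rmsd(rmsd, pass_below=1.5, fail_above=2.5):
    """Grade a full-chain CA-RMSD in angstroms (lower is better)."""
    if rmsd < pass_below:
        return "pass"
    if rmsd <= fail_above:
        return "marginal"
    return "fail"


def grade_plddt(mean_plddt, pass_above=80.0, fail_below=60.0):
    """Grade a mean pLDDT on the 0-100 scale (higher is better)."""
    if mean_plddt > pass_above:
        return "pass"
    if mean_plddt >= fail_below:
        return "marginal"
    return "fail"


def overall_grade(rmsd, mean_plddt):
    """Overall verdict: the worse of the two individual grades (an assumption)."""
    grades = [grade_rmsd(rmsd), grade_plddt(mean_plddt)]
    return max(grades, key=ORDER.index)


print(overall_grade(1.2, 85.0))
print(overall_grade(1.2, 70.0))
print(overall_grade(3.0, 95.0))
```

The interface rows of the table could be handled the same way by passing their limits as the keyword arguments.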
pAE (predicted aligned error) estimates the position error, in angstroms, of residue j when the predicted and true structures are aligned on residue i. AlphaFold outputs pAE as a JSON matrix. Low pAE between two domains or chains indicates the model is confident about their relative arrangement.
# pAE is not directly loadable into PyMOL — it's a 2D matrix
# Useful checks from pAE:
# - Low intra-chain pAE: chain folds well
# - Low inter-chain pAE: confident about binding mode
# - High inter-chain pAE: AF2 is unsure about the complex — design may not bind
# For visualization, focus on pLDDT and RMSD in PyMOL
# pAE is best viewed as a heatmap (matplotlib, ChimeraX, or AF2 notebook)
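The intra-chain and inter-chain checks listed above can be computed from the matrix with a few lines of plain Python. This is a minimal sketch for a two-chain complex, assuming chain A occupies the first `len_a` rows and columns. The JSON key name varies between pipelines, so the "pae" key used here is an assumption, and the matrix values are made up.

```python
import json


def mean_block(pae, rows, cols):
    """Mean of the pAE sub-matrix defined by the given row and column indices."""
    values = [pae[i][j] for i in rows for j in cols]
    return sum(values) / len(values)


def chain_pae_summary(pae, len_a):
    """Mean intra-chain and inter-chain pAE for a two-chain complex."""
    a = range(0, len_a)
    b = range(len_a, len(pae))
    return {
        "intra_A": mean_block(pae, a, a),
        "intra_B": mean_block(pae, b, b),
        # pAE is not symmetric, so average both off-diagonal blocks
        "inter": (mean_block(pae, a, b) + mean_block(pae, b, a)) / 2,
    }


# Toy 4-residue complex (2 residues per chain); the key name "pae" is an assumption
raw = json.loads("""
{"pae": [[0.5, 1.0, 12.0, 14.0],
         [1.0, 0.5, 13.0, 15.0],
         [11.0, 12.0, 0.5, 1.5],
         [13.0, 14.0, 1.5, 0.5]]}
""")
summary = chain_pae_summary(raw["pae"], len_a=2)
print(summary)
```

In this toy case both chains have low intra-chain pAE but the inter-chain mean is high, which is the pattern described above for a complex whose binding mode the model is unsure about.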
For designed binders, validate the complex with AF2-Multimer.
cmd.load("af2_multimer_prediction.pdb", "af2_complex")
cmd.show("cartoon")
chains = cmd.get_chains("af2_complex")
print("Chains:", chains)
# Typically: target=chain A, binder=chain B; adjust the selections below if the chain IDs differ
cmd.color("gray70", "chain A") # target
cmd.color("marine", "chain B") # designed binder
# Check confidence specifically at the binding interface
cmd.select("interface_A", "chain A and byres (chain A within 5 of chain B)")
cmd.select("interface_B", "chain B and byres (chain B within 5 of chain A)")
# Color interface by pLDDT
cmd.spectrum("b", "red_yellow_green_cyan_blue", "interface_A", minimum=50, maximum=90)
cmd.spectrum("b", "red_yellow_green_cyan_blue", "interface_B", minimum=50, maximum=90)
cmd.show("sticks", "interface_A or interface_B")
# Count binder residues at the interface (CA atoms of residues within 4 A of the target)
n_contacts = cmd.count_atoms("chain B and name CA and byres (chain B within 4 of chain A)")
print("Interface residues (binder): %d" % n_contacts)
# Polar contacts across the interface (mode=2)
cmd.distance("interface_hbonds", "chain A", "chain B", mode=2)
# A distance object contains no atoms, so count candidate H-bond pairs with find_pairs instead
hb_pairs = cmd.find_pairs("chain A and elem N+O", "chain B and elem N+O", mode=1, cutoff=3.5, angle=45)
print("Interface H-bonds: ~%d" % len(hb_pairs))
Screen many designs at once using pLDDT and RMSD.
import glob, os
from pymol import cmd, stored
pred_dir = "/path/to/af2_predictions"
design_dir = "/path/to/rfdiffusion_outputs"
results = []
for f in sorted(glob.glob(os.path.join(pred_dir, "*.pdb"))):
    name = os.path.splitext(os.path.basename(f))[0]
    cmd.load(f, name)
    # Mean pLDDT over CA atoms (one value per residue)
    stored.bfactors = []
    cmd.iterate("%s and name CA" % name, "stored.bfactors.append(b)")
    mean_plddt = sum(stored.bfactors) / len(stored.bfactors) if stored.bfactors else 0
    # Self-consistency RMSD (if design file exists)
    design_file = os.path.join(design_dir, name.replace("_pred", "") + ".pdb")
    if os.path.exists(design_file):
        cmd.load(design_file, "tmp_design")
        # cycles=0 keeps all aligned CA atoms in the RMSD (no outlier rejection)
        rms = cmd.align("%s and name CA" % name, "tmp_design and name CA", cycles=0)
        rmsd = rms[0]
        cmd.delete("tmp_design")
    else:
        rmsd = -1  # sentinel: no matching design file
    results.append((name, mean_plddt, rmsd))
    print("%s: pLDDT=%.1f, RMSD=%.2f" % (name, mean_plddt, rmsd))
# Summary
passing = [(n, p, r) for n, p, r in results if p > 80 and 0 < r < 1.5]
print("\nPassing designs: %d / %d" % (len(passing), len(results)))
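Once the screening loop has produced its list of (name, mean pLDDT, RMSD) tuples, it is often useful to rank the passing designs and save them for later. This is a minimal sketch using only the standard library. The function names are hypothetical, the example tuples are made up, and sorting by RMSD first and pLDDT second is an assumption rather than a rule from this workflow.

```python
import csv
import io


def rank_designs(results, min_plddt=80.0, max_rmsd=1.5):
    """Keep passing designs and sort by RMSD (ascending), then pLDDT (descending).

    results: list of (name, mean_plddt, rmsd); rmsd <= 0 means no RMSD was computed.
    """
    passing = [r for r in results if r[1] > min_plddt and 0 < r[2] < max_rmsd]
    return sorted(passing, key=lambda r: (r[2], -r[1]))


def to_csv(rows):
    """Render ranked rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["name", "mean_plddt", "ca_rmsd"])
    for name, plddt, rmsd in rows:
        writer.writerow([name, "%.1f" % plddt, "%.2f" % rmsd])
    return buf.getvalue()


# Made-up screening results for illustration
example_results = [
    ("design_01_pred", 88.4, 0.92),
    ("design_02_pred", 91.0, 2.10),
    ("design_03_pred", 72.3, 1.10),
    ("design_04_pred", 85.0, 0.75),
    ("design_05_pred", 90.0, -1),
]
ranked = rank_designs(example_results)
print(to_csv(ranked))
```

Writing the returned text to a file gives a table that can be sorted or filtered again without reloading any structures.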
# After batch loading, apply pLDDT coloring to all and tile
for obj in cmd.get_object_list():
    cmd.spectrum("b", "red_yellow_green_cyan_blue", obj, minimum=50, maximum=90)
cmd.set("grid_mode", 1)
cmd.show("cartoon")
ESMFold is much faster than AF2 (its authors report up to ~60x on short sequences), which makes it well suited to rapid first-pass screening of hundreds of designs.
# ESMFold outputs are identical in format (pLDDT in B-factor)
# Same coloring commands apply:
cmd.load("esmfold_prediction.pdb", "esm")
cmd.spectrum("b", "red_yellow_green_cyan_blue", "esm", minimum=50, maximum=90)
cmd.show("cartoon")
cmd.load("esmfold_pred.pdb", "esm_pred")
cmd.load("af2_pred.pdb", "af2_pred")
cmd.align("esm_pred and name CA", "af2_pred and name CA")
rms = cmd.rms_cur("esm_pred and name CA", "af2_pred and name CA")
print("ESMFold vs AF2 RMSD: %.2f A" % rms)
cmd.color("salmon", "esm_pred")
cmd.color("marine", "af2_pred")
spectrum b is your primary tool: pLDDT is stored in the B-factor column for AlphaFold and ESMFold outputs alike.
Install: npx claudepluginhub anaka/claudemol --plugin claudemol-skills