ProteinMPNN — sequence design from backbone
Upload a backbone PDB and get N candidate sequences, each with an MPNN score and a sequence-recovery value. A run takes roughly 30-60 s.
What it is for
Pick ProteinMPNN when you already have a backbone and need candidate sequences. For de novo backbone generation, use RFantibody, BindCraft, or BoltzGen first and feed the output PDB here.
ProteinMPNN (Dauparas et al., Science 2022) is a message-passing graph neural network that scores the 20 canonical residues at every backbone position, conditioned on the backbone coordinates. Sampling from those scores at sampling_temp produces candidate sequences that the model considers compatible with the input geometry.
When it fits:
- You already have a backbone and need candidate sequences for it.
- You want to redesign a binder produced by RFdiffusion, RFantibody, BindCraft, or BoltzGen.
- You want to thread alternative sequences through a curated PDB before ordering.
Inputs
You will need:
- Backbone PDB / mmCIF: only the backbone atoms (N, Cα, C, O) are used; side chains are ignored.
- Chain ID(s) of the region(s) to redesign. Other chains stay fixed as context.
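To make the two inputs above concrete, here is a minimal sketch of how a PDB file could be split into backbone atoms of the chains to redesign and backbone atoms of the fixed context chains. It is an illustration only, not the tool's own preprocessing: the function name backbone_lines is hypothetical, and the parsing relies on the fixed-column PDB layout (atom name in columns 13-16, chain ID in column 22).

```python
BACKBONE_ATOMS = {"N", "CA", "C", "O"}

def backbone_lines(pdb_text, design_chains):
    """Split backbone ATOM records into (designed, context) lists.

    Hypothetical helper for illustration. Assumes fixed-column PDB
    records: atom name in columns 13-16, chain ID in column 22.
    """
    designed, context = [], []
    for line in pdb_text.splitlines():
        # Skip HETATM, header, and truncated records.
        if not line.startswith("ATOM") or len(line) < 22:
            continue
        atom_name = line[12:16].strip()
        chain_id = line[21]
        if atom_name not in BACKBONE_ATOMS:
            continue  # side-chain atoms are not needed
        if chain_id in design_chains:
            designed.append(line)
        else:
            context.append(line)
    return designed, context
```

For a two-chain complex, calling backbone_lines(text, {"A"}) would return the backbone records of chain A as the region to redesign and the backbone records of every other chain as context.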
Each run uses a preset that sets the scale and scope:
- Standalone — your backbone
- Upload a backbone PDB, pick chain(s) to redesign, get up to 200 candidate sequences. ~30-60 s on A10G-24GB.
Parameters you set on the form:
- Chains to design
- Which chains in the PDB MPNN should redesign (e.g. "A", "A B", or "H L"). Other chains are held fixed as context.
- Number of sequences
- How many samples to draw (1 to 200). Samples are drawn independently; rank them by score and ProteinMPNN recovery rate.
- Sampling temperature
- Lower = more conservative (closer to argmax); higher = more diverse. Defaults to 0.1 per the upstream README.
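The effect of the sampling temperature can be sketched for a single position as follows. This is a simplified illustration, assuming the model has already produced one logit per canonical residue; the function names are hypothetical, and the real model decodes positions autoregressively, so each position's logits depend on the residues already chosen.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues

def temperature_probs(logits, temperature):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def sample_residue(logits, temperature, rng=random):
    """Draw one residue letter from the temperature-scaled distribution."""
    probs = temperature_probs(logits, temperature)
    return rng.choices(AMINO_ACIDS, weights=probs, k=1)[0]
```

Dividing the logits by a small temperature such as 0.1 widens the gaps between them, so almost all probability lands on the top-scoring residue. At a temperature of 1.0 the distribution is flatter and lower-ranked residues are sampled more often.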
Typical runtime:
- standalone
- ~1 min
How to read the results
The output is a ranked list of candidate sequences, each with per-position scores and an overall ProteinMPNN recovery value, downloadable as FASTA. Run the top candidates through AlphaFold2 / ColabFold to check that they are predicted to fold into the input backbone.
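As a rough guide to what the recovery value and the ranking mean, here is a minimal sketch. It rests on two assumptions: recovery is taken over the full aligned length (upstream ProteinMPNN restricts it to the designed positions), and a lower score is better, as in upstream ProteinMPNN where the score is an average negative log-probability. Check how this tool reports its score before relying on the sort direction. Both function names are hypothetical.

```python
def sequence_recovery(native, designed):
    """Fraction of positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)

def rank_candidates(candidates):
    """Sort (sequence, score) pairs best-first.

    Assumes lower score is better (negative log-probability convention).
    """
    return sorted(candidates, key=lambda pair: pair[1])
```

A design that differs from a four-residue native sequence at one position has a recovery of 0.75. High recovery on a native backbone suggests the model has picked up the fold's sequence preferences, but it is not a substitute for the structure-prediction check.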
Where a tool reports them, the scores mean:
- ipTM
- Predicted confidence in the binder-target interface. Higher is better. Aim above roughly 0.7 on a tractable target.
- pLDDT
- Per-residue confidence in the predicted fold. Higher means the model is more sure of that part of the structure.
- i_pAE and pAE
- Predicted alignment error, at the interface (i_pAE) or across the whole structure (pAE). Lower is better.
Try these examples
One-click sample inputs that load straight into the run form. Edit any field before submitting.
- Ubiquitin (1ubq)
- 76-aa monomer benchmark. Fastest MPNN run, classic sequence-recovery test.
- Hen egg-white lysozyme (4lzt)
- 129-aa monomeric enzyme. Tests MPNN recovery on a functional active-site fold.
References
Dauparas et al., Science 2022