ProteinMPNN

Sequence design from a backbone. Upload a backbone PDB and get N candidate sequences, each with an MPNN score and a per-sequence recovery. Roughly 30 to 60 s per run.

What it is for

Pick ProteinMPNN when you already have a backbone and need candidate sequences. For de novo backbone generation, use RFantibody, BindCraft, or BoltzGen first and feed the output PDB here.

ProteinMPNN (Dauparas et al., Science 2022) is a message-passing graph neural network that scores the 20 canonical residues at every backbone position, conditioned on the backbone coordinates. Sampling from those scores at the temperature set by sampling_temp produces candidate sequences that are intended to fold into the input geometry.
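
To make the role of sampling_temp concrete, here is a minimal NumPy sketch of temperature-scaled sampling at a single position. It is not ProteinMPNN's code: the logits are random stand-ins for model output, the name sample_residue is hypothetical, and the real model decodes positions one after another, conditioning on the residues already chosen.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues, one-letter codes


def sample_residue(logits, temperature, rng):
    """Sample one residue index from per-position logits at a given temperature.

    Returns the sampled index and the probability vector it was drawn from.
    """
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()          # subtract the max for numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()            # softmax over the 20 residues
    return int(rng.choice(len(probs), p=probs)), probs


rng = np.random.default_rng(0)
logits = rng.normal(size=len(ALPHABET))  # stand-in for model output (assumption)

idx, cold = sample_residue(logits, 0.1, rng)   # low temperature: near argmax
_, hot = sample_residue(logits, 1.0, rng)      # higher temperature: more diverse
print(f"sampled {ALPHABET[idx]} at T=0.1")
print(f"top-residue probability: T=0.1 {cold.max():.2f}, T=1.0 {hot.max():.2f}")
```

Dividing the logits by a small temperature concentrates probability on the top-scoring residue, which is why low values give conservative, near-argmax sequences.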

When it fits:

You already have a backbone and need candidate sequences for it.
You want to redesign a binder produced by RFdiffusion, RFantibody, BindCraft, or BoltzGen.
You want to thread alternative sequences through a curated PDB before ordering.

Inputs

You will need:

Backbone PDB or mmCIF (only the backbone atoms N, Cα, C, and O are used; side chains are ignored).
Chain ID(s) of the region(s) to redesign. Other chains stay fixed as context.
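
Before uploading, it can help to list the chain IDs in a file and check that each residue has a complete backbone. The sketch below is an illustration under stated assumptions: it reads only fixed-column PDB ATOM records (not mmCIF), and backbone_summary is a hypothetical helper rather than part of ProteinMPNN.

```python
from collections import defaultdict

BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def backbone_summary(pdb_text):
    """Return {chain_id: (n_residues, n_residues_with_full_backbone)}.

    Uses the fixed PDB columns: atom name in columns 13-16, chain ID in
    column 22, residue number plus insertion code in columns 23-27.
    """
    residues = defaultdict(set)  # (chain, residue key) -> backbone atoms seen
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, headers, and everything else
        atom_name = line[12:16].strip()
        res_key = (line[21], line[22:27])
        seen = residues[res_key]  # registers the residue even without backbone atoms
        if atom_name in BACKBONE_ATOMS:
            seen.add(atom_name)

    counts = defaultdict(lambda: [0, 0])
    for (chain_id, _), atoms in residues.items():
        counts[chain_id][0] += 1
        counts[chain_id][1] += int(atoms == BACKBONE_ATOMS)
    return {chain: tuple(pair) for chain, pair in counts.items()}
```

Calling backbone_summary(open("design.pdb").read()) on a local file (the file name is a placeholder) returns one entry per chain, so a chain with fewer complete residues than total residues is worth inspecting before a run.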

Each run uses a preset that sets the scale and scope:

Standalone with your backbone: Upload a backbone PDB, pick chain(s) to redesign, get up to 1000 candidate sequences. ~30 to 60 s on A10G-24GB.

Parameters you set on the form:

Chains to design: Which chains in the PDB MPNN should redesign (for example "A", "A B", or "H L"). Other chains are held fixed as context.
Number of sequences: How many sequences to sample (1 to 1000). Samples are drawn independently of one another; rank them by score and ProteinMPNN recovery rate.
Sampling temperature: Lower means more conservative (closer to argmax); higher means more diverse. Defaults to 0.1 per the upstream README.
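
For readers who script runs instead of using the form, the three parameters above can be captured and validated as in the sketch below. DesignRequest, parse_chains, and upstream_command are hypothetical names, and how the hosted form invokes the model is not documented here. The flags mirror the example scripts in the upstream ProteinMPNN repository; check them against the version you have installed. Requiring a positive temperature is an assumption of this sketch.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignRequest:
    pdb_path: str
    chains: tuple
    num_sequences: int
    sampling_temp: float = 0.1  # default stated in the guide

    def __post_init__(self):
        if not self.chains:
            raise ValueError("pick at least one chain to design")
        if not 1 <= self.num_sequences <= 1000:
            raise ValueError("number of sequences must be between 1 and 1000")
        if self.sampling_temp <= 0:  # assumption: temperature must be positive
            raise ValueError("sampling temperature must be positive")


def parse_chains(text):
    """Turn form input such as 'H L' or 'A,B' into a tuple of chain IDs."""
    return tuple(text.replace(",", " ").split())


def upstream_command(req, out_folder):
    """Build an argument list for the upstream protein_mpnn_run.py script."""
    return [
        "python", "protein_mpnn_run.py",
        "--pdb_path", req.pdb_path,
        "--pdb_path_chains", " ".join(req.chains),
        "--out_folder", out_folder,
        "--num_seq_per_target", str(req.num_sequences),
        "--sampling_temp", str(req.sampling_temp),
    ]


request = DesignRequest("binder.pdb", parse_chains("H L"), num_sequences=100)
print(shlex.join(upstream_command(request, "mpnn_out")))
```

Validating up front means an out-of-range sequence count or an empty chain list fails before any GPU time is spent.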

Typical runtime:

standalone: ~1 min

How to read the results

The output is a ranked list of candidate sequences, each with per-position scores and an overall ProteinMPNN recovery, downloadable as FASTA. Follow up with AlphaFold2 or ColabFold to check that each candidate is predicted to fold back into the input backbone.
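
The downloaded FASTA can be re-ranked offline with a few lines of Python. This sketch assumes headers carry comma-separated key=value pairs such as score and seq_recovery, as the upstream ProteinMPNN output does; the form's exact header layout may differ. It also assumes the upstream convention that score is an average negative log-probability, so lower is better. The sequences and numbers in the example are made up for illustration, and sequence_recovery is a simplified version that compares every position rather than only the designed ones.

```python
def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def parse_header(header):
    """Collect the numeric key=value fields from a header line."""
    fields = {}
    for part in header.split(","):
        key, sep, value = part.strip().partition("=")
        if not sep:
            continue
        try:
            fields[key] = float(value)
        except ValueError:
            pass  # non-numeric metadata such as chain lists or model names
    return fields


def rank_designs(fasta_text):
    """Return sampled designs sorted by score, lowest (best) first."""
    designs = []
    for header, seq in read_fasta(fasta_text):
        fields = parse_header(header)
        if "sample" in fields and "score" in fields:  # skips a reference record
            designs.append({"sequence": seq, **fields})
    return sorted(designs, key=lambda d: d["score"])


def sequence_recovery(native, designed):
    """Fraction of positions where the design keeps the native residue."""
    if not native or len(native) != len(designed):
        raise ValueError("sequences must be non-empty and the same length")
    return sum(a == b for a, b in zip(native, designed)) / len(native)


# Illustrative input only: these sequences and scores are invented.
example = """>T=0.1, sample=1, score=0.91, seq_recovery=0.48
MKTAYIAKQR
>T=0.1, sample=2, score=0.74, seq_recovery=0.55
MKSAYLAKQR
"""
for design in rank_designs(example):
    print(design["score"], design["seq_recovery"], design["sequence"])
```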

ProteinMPNN itself does not report structure-confidence metrics. Where a downstream prediction tool reports them, the scores mean:

ipTM: Predicted confidence in the binder to target interface. Higher is better. Aim above roughly 0.7 on a tractable target.
pLDDT: Per-residue confidence in the predicted fold. Higher means the model is more sure of that part of the structure.
i_pAE and pAE: Predicted alignment error, at the interface (i_pAE) or across the whole structure (pAE). Lower is better.
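
A simple way to apply these metrics is a pass/fail filter over downstream predictions. The sketch below is an assumption-laden illustration: the dictionary keys (iptm, plddt, i_pae) and the function passes_filters are hypothetical, and only the ipTM cutoff of roughly 0.7 comes from this guide. The pLDDT and i_pAE thresholds are left unset so that you choose them for your own target.

```python
from statistics import mean


def passes_filters(metrics, min_iptm=0.7, min_mean_plddt=None, max_ipae=None):
    """Return True if every reported metric clears its active threshold.

    Metrics missing from the dictionary are skipped, since not every
    tool reports all of them.
    """
    iptm = metrics.get("iptm")
    if iptm is not None and iptm < min_iptm:
        return False

    plddt = metrics.get("plddt")  # per-residue values, averaged here
    if min_mean_plddt is not None and plddt:
        if mean(plddt) < min_mean_plddt:
            return False

    ipae = metrics.get("i_pae")
    if max_ipae is not None and ipae is not None and ipae > max_ipae:
        return False

    return True


print(passes_filters({"iptm": 0.82}))
print(passes_filters({"iptm": 0.55}))
```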

References

Dauparas J, et al. Robust deep learning-based protein sequence design using ProteinMPNN. Science 378, 49-56 (2022).
