ProteinMPNN — sequence design from backbone

Upload a backbone PDB and get N candidate sequences, each with an MPNN score and sequence recovery. Typically ~30-60 s per run.

What it is for

Pick ProteinMPNN when you already have a backbone and need candidate sequences. For de novo backbone generation, use RFantibody, BindCraft, or BoltzGen first and feed the output PDB here.

ProteinMPNN (Dauparas et al., Science 2022) is a message-passing graph neural network that scores the 20 canonical residues at every backbone position, conditioned on the backbone coordinates (N, Cα, C, O). Sampling from those scores at temperature sampling_temp produces candidate sequences that are predicted, though not guaranteed, to fold into the input geometry.
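To make the role of the sampling temperature concrete, here is a minimal sketch in plain Python. It is not ProteinMPNN's own code: the logits are made up for a single position, and softmax_with_temperature and sample_residue are hypothetical helper names. It only assumes that per-residue scores are divided by the temperature before being turned into probabilities.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues

def softmax_with_temperature(logits, temperature):
    """Convert raw per-residue scores to probabilities; a lower temperature sharpens the peak."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_residue(logits, temperature, rng):
    """Draw one residue letter from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(AMINO_ACIDS, weights=probs, k=1)[0]

# Made-up logits for one position: a mild preference for L, then I, then V.
toy_logits = [0.0] * len(AMINO_ACIDS)
toy_logits[AMINO_ACIDS.index("L")] = 2.0
toy_logits[AMINO_ACIDS.index("I")] = 1.5
toy_logits[AMINO_ACIDS.index("V")] = 1.0

rng = random.Random(0)  # fixed seed so repeated runs draw the same residues
for temperature in (0.1, 1.0):
    probs = softmax_with_temperature(toy_logits, temperature)
    p_leu = probs[AMINO_ACIDS.index("L")]
    draws = "".join(sample_residue(toy_logits, temperature, rng) for _ in range(10))
    print(f"T={temperature}: P(L)={p_leu:.3f}, ten draws: {draws}")
```

At a low temperature almost all probability mass sits on the top-scoring residue, so draws are close to the argmax; at a higher temperature the same logits give a much flatter distribution and more varied draws.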

When it fits:

  • You already have a backbone and need candidate sequences for it.
  • You want to redesign a binder produced by RFdiffusion, RFantibody, BindCraft, or BoltzGen.
  • You want to thread alternative sequences through a curated PDB before ordering.

Inputs

You will need:

  • Backbone PDB / mmCIF. Only the backbone atoms (N, Cα, C, O) are read; side-chain atoms are ignored.
  • Chain ID(s) of the region(s) to redesign. Other chains stay fixed as context.
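As an illustration of what "backbone atoms only" means for a PDB file, the sketch below pulls N, CA, C and O coordinates for selected chains from fixed-width ATOM records. It is a simplified stand-in and not the parser the tool uses: extract_backbone and _toy_atom are hypothetical names, the coordinates are invented, and mmCIF input is not handled.

```python
BACKBONE_ATOMS = ("N", "CA", "C", "O")

def extract_backbone(pdb_text, chain_ids):
    """Return {(chain, resseq, icode): {atom_name: (x, y, z)}} for the chosen chains."""
    residues = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        atom_name = line[12:16].strip()   # columns 13-16
        alt_loc = line[16]                # column 17
        chain = line[21]                  # column 22
        if chain not in chain_ids or atom_name not in BACKBONE_ATOMS:
            continue
        if alt_loc not in (" ", "A"):     # keep only the first alternate location
            continue
        key = (chain, int(line[22:26]), line[26].strip())
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        residues.setdefault(key, {})[atom_name] = xyz
    return residues

def _toy_atom(serial, name, resname, chain, resseq, x, y, z):
    """Build one fixed-width ATOM record (illustrative data only)."""
    return (f"ATOM  {serial:>5}  {name:<3} {resname:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}  1.00  0.00")

# Invented coordinates: one residue on chain A (with a side-chain CB) and one CA on chain B.
toy_pdb = "\n".join([
    _toy_atom(1, "N", "MET", "A", 1, 0.000, 0.000, 0.000),
    _toy_atom(2, "CA", "MET", "A", 1, 1.458, 0.000, 0.000),
    _toy_atom(3, "C", "MET", "A", 1, 2.009, 1.420, 0.000),
    _toy_atom(4, "O", "MET", "A", 1, 1.251, 2.390, 0.000),
    _toy_atom(5, "CB", "MET", "A", 1, 1.988, -0.773, -1.199),
    _toy_atom(6, "CA", "GLY", "B", 1, 10.000, 10.000, 10.000),
])

backbone = extract_backbone(toy_pdb, chain_ids={"A"})
for key, atoms in backbone.items():
    print(key, sorted(atoms))
```

The side-chain CB atom and the whole of chain B are dropped, which mirrors the idea that only the backbone of the selected chains is collected here.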

Each run uses a preset that sets the scale and scope:

Standalone — your backbone
Upload a backbone PDB, pick chain(s) to redesign, get up to 200 candidate sequences. ~30-60 s on A10G-24GB.

Parameters you set on the form:

Chains to design
The chains in the PDB that ProteinMPNN should redesign (e.g. A, A B, or H L). Other chains are held fixed as context.
Number of sequences
How many samples to draw (1 to 200). Samples are drawn independently; rank them by score and sequence recovery.
Sampling temperature
Lower = more conservative (closer to argmax); higher = more diverse. Defaults to 0.1 per the upstream README.
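Sequence recovery is conventionally the fraction of designed positions where the sampled residue matches the native one. The sketch below shows that calculation under this assumption; sequence_recovery is a hypothetical helper, the designed sequence is invented, and the exact definition used on the results page may differ.

```python
def sequence_recovery(native, designed, design_mask=None):
    """Fraction of designed positions where the designed residue equals the native one.

    design_mask, if given, is a sequence of booleans marking redesigned positions;
    fixed positions are excluded from the ratio.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if design_mask is None:
        positions = list(range(len(native)))
    else:
        if len(design_mask) != len(native):
            raise ValueError("mask must match sequence length")
        positions = [i for i, designed_here in enumerate(design_mask) if designed_here]
    if not positions:
        raise ValueError("no designed positions to compare")
    matches = sum(1 for i in positions if native[i] == designed[i])
    return matches / len(positions)

native = "MQIFVKTLTG"     # first ten residues of ubiquitin
candidate = "MKIYVKTLDG"  # invented design, for illustration only
print(f"recovery = {sequence_recovery(native, candidate):.2f}")
```

Passing a mask restricts the comparison to the redesigned positions, which matters when some chains or residues are held fixed as context.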

Typical runtime:

standalone
~1 min

How to read the results

Ranked candidate sequences with per-position score and overall ProteinMPNN recovery, downloadable as FASTA. Follow up with AlphaFold2 / ColabFold to check that the designed sequences are predicted to adopt the input fold.
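If you want to re-rank the downloaded FASTA yourself, a sketch along these lines may help. It assumes headers carry comma-separated key=value pairs such as sample, score and seq_recovery, in the style of the upstream ProteinMPNN output, and that a lower score is better (upstream reports an average negative log-probability). The header layout of the file you download may differ, and the records below are invented.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA text."""
    records, header, chunks = [], None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def header_fields(header):
    """Split 'k1=v1, k2=v2' style headers into a dict of strings."""
    fields = {}
    for part in header.split(","):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields

def rank_by_score(text):
    """Sort sampled sequences by ascending score (assumed: lower is better)."""
    rows = []
    for header, sequence in parse_fasta(text):
        fields = header_fields(header)
        if "sample" not in fields or "score" not in fields:
            continue  # skip records that are not sampled designs, e.g. the native entry
        rows.append({
            "sample": int(fields["sample"]),
            "score": float(fields["score"]),
            "seq_recovery": float(fields.get("seq_recovery", "nan")),
            "sequence": sequence,
        })
    return sorted(rows, key=lambda row: row["score"])

# Invented example records; real headers and values will differ.
toy_fasta = """\
>T=0.1, sample=1, score=0.9120, seq_recovery=0.4500
MKIYVKTLDG
>T=0.1, sample=2, score=0.7810, seq_recovery=0.5200
MQIFVKTLDG
"""

ranked = rank_by_score(toy_fasta)
for row in ranked:
    print(row["sample"], row["score"], row["seq_recovery"], row["sequence"])
```

Sorting on a single key keeps the sketch simple; in practice you might also break ties on recovery or filter on downstream structure-prediction metrics before ordering.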

Where a downstream structure-prediction tool reports them, the confidence metrics mean:

ipTM
Predicted confidence in the binder-target interface. Higher is better. Aim above roughly 0.7 on a tractable target.
pLDDT
Per-residue confidence in the predicted fold. Higher means the model is more sure of that part of the structure.
i_pAE and pAE
Predicted alignment error, at the interface (i_pAE) or across the whole structure (pAE). Lower is better.

Try these examples

One-click sample inputs that load straight into the run form. Edit any field before submitting.

Ubiquitin (1ubq)
76-aa monomer benchmark. Fastest MPNN run, classic sequence-recovery test.
Hen egg-white lysozyme (4lzt)
129-aa monomeric enzyme. Tests MPNN recovery on a functional active-site fold.

References

Dauparas et al., Science 2022
