LigandMPNN

Dauparas, J. et al.

Protein Design

Step 1: Upload your data

Upload Backbone PDB File

Your file can be in the following formats: pdb

The Protein Data Bank (PDB) format is a standard text format for storing atomic coordinates and related information about biomolecules. A PDB file contains the atomic coordinates of protein and nucleic acid structures, together with metadata such as the experimental method, chain identifiers, and residue numbering.
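As a rough illustration of what the uploaded backbone file contains, the sketch below reads `ATOM` and `HETATM` records using the fixed-column layout of the PDB format. The `Atom` type and the `parse_atoms` function are hypothetical names introduced here for illustration; they are not part of LigandMPNN or of this page's API.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    record: str       # "ATOM" (polymer) or "HETATM" (ligand, metal, water, ...)
    name: str         # atom name, e.g. "CA"
    res_name: str     # residue or ligand name, e.g. "GLY" or "ZN"
    chain: str        # chain identifier, e.g. "A"
    res_seq: int      # residue sequence number
    x: float          # orthogonal coordinates in angstroms
    y: float
    z: float
    occupancy: float  # occupancy column of the record


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Extract coordinate records from PDB-formatted text (fixed columns)."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue  # skip headers, TER, END and other record types
        atoms.append(Atom(
            record=line[0:6].strip(),
            name=line[12:16].strip(),
            res_name=line[17:20].strip(),
            chain=line[21],
            res_seq=int(line[22:26]),
            x=float(line[30:38]),
            y=float(line[38:46]),
            z=float(line[46:54]),
            occupancy=float(line[54:60]),
        ))
    return atoms
```

Splitting records by `record` separates polymer atoms from ligand or metal atoms, and the `occupancy` field makes it easy to see which atoms are recorded with zero occupancy before uploading.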

Upload LigandMPNN Checkpoint File (optional)

Your file can be in the following formats: pt

A PyTorch checkpoint (.pt) file stores a serialized model, here the trained LigandMPNN weights.

Step 2: Set Parameters

Temperature

0.100

1.000

Batch Size

Chains to Design

Fixed Residues

Redesigned Residues

Bias AA

Bias AA per Residue

Omit AA

Omit AA per Residue

Symmetry Residues

Symmetry Weights

Fixed Residues Multi

Redesigned Residues Multi

Omit AA per Residue Multi

Bias AA per Residue Multi

Parse atoms with zero occupancy

Force hetatm

Save Stats

Homo Oligomer

LigandMPNN use atom context

LigandMPNN use side chain context
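To give a feel for how Temperature, Bias AA, and Omit AA typically interact in MPNN-style sequence sampling, here is a minimal sketch. It assumes the model produces one logit per amino acid at each position, that biases are added to the logits before dividing by the temperature, and that omitted amino acids receive zero probability. The function names are hypothetical, and the exact upstream implementation may differ.

```python
import math
import random
from typing import Dict, Iterable, Optional

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_probabilities(
    logits: Dict[str, float],
    temperature: float,
    bias: Optional[Dict[str, float]] = None,
    omit: Iterable[str] = (),
) -> Dict[str, float]:
    """Softmax over (logit + bias) / temperature, with omitted residues at 0."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    bias = bias or {}
    omitted = set(omit)
    allowed = [aa for aa in AMINO_ACIDS if aa not in omitted]
    if not allowed:
        raise ValueError("cannot omit every amino acid")
    scaled = {aa: (logits[aa] + bias.get(aa, 0.0)) / temperature for aa in allowed}
    peak = max(scaled.values())  # subtract the maximum for numerical stability
    weights = {aa: math.exp(value - peak) for aa, value in scaled.items()}
    total = sum(weights.values())
    return {aa: weights.get(aa, 0.0) / total for aa in AMINO_ACIDS}


def sample_residue(probs: Dict[str, float], rng: random.Random) -> str:
    """Draw one amino acid according to the given probabilities."""
    letters = list(probs)
    return rng.choices(letters, weights=[probs[aa] for aa in letters])[0]
```

In this sketch a lower temperature concentrates probability on the highest-scoring amino acids, giving less diverse sequences, while a higher temperature flattens the distribution. A positive bias makes an amino acid more likely everywhere it is applied, and the per-residue variants would apply the same idea at selected positions only.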

Step 3: Complete run profile

Job name (optional)

About

LigandMPNN is a deep learning-based protein sequence design method that explicitly models non-protein atomic context, including small molecules, nucleotides, and metals. It improves native sequence recovery and side-chain conformation accuracy over existing methods such as Rosetta and ProteinMPNN (see Metrics). You can use LigandMPNN after RFdiffusion All-Atom to design protein sequences for the generated backbones and optimize interactions with ligands for enhanced binding affinity and specificity.

Example use case:

Designing protein sequences that interact with specific small molecules, nucleotides, or metals to improve binding affinity and specificity for applications in drug discovery, biosensors, and enzyme engineering.

LigandMPNN is significantly more accurate than ProteinMPNN for ligand-aware use cases, such as enzyme design or small molecule binder design.

Technology:

Graph neural networks (GNNs) based on ProteinMPNN, with additional encoding layers for ligand-protein interactions.
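Message-passing models of this kind operate on a graph built from 3D coordinates. The sketch below shows one common way to build such a graph, a k-nearest-neighbor search, extended so that each residue also sees its closest ligand atoms. It is an illustration under assumptions, not LigandMPNN's actual featurization: the function names, the use of a single point per residue, and the neighbor counts are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def nearest(query: Point, points: Sequence[Point], k: int, skip: int = -1) -> List[int]:
    """Indices of the k points closest to `query`, optionally skipping one index."""
    candidates = [i for i in range(len(points)) if i != skip]
    candidates.sort(key=lambda i: math.dist(query, points[i]))
    return candidates[:k]


def build_context(
    residues: Sequence[Point],
    ligand_atoms: Sequence[Point],
    k_res: int,
    k_lig: int,
) -> List[Dict[str, List[int]]]:
    """For each residue, list its nearest residues and nearest ligand atoms."""
    context = []
    for i, position in enumerate(residues):
        context.append({
            # protein-protein edges, excluding the residue itself
            "residue_neighbors": nearest(position, residues, k_res, skip=i),
            # protein-ligand edges that carry the non-protein atomic context
            "ligand_neighbors": nearest(position, ligand_atoms, k_lig),
        })
    return context
```

A real model would attach features such as distances and element types to these edges before message passing; the sketch only covers neighbor selection.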

Limitations:

Performance may be limited for compounds with rare or novel chemical elements not well-represented in the training data. Hybrid approaches with physics-based modeling may be needed for low-data regimes.
Some parameters are kept at their default values; see the original LigandMPNN GitHub repository for details.
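For readers who want to compare the form fields on this page with the original repository, the sketch below assembles a command line for the upstream `run.py` script. The flag names and value formats (comma-separated chains, space-separated residue identifiers such as `C1 C2`) follow the upstream README as best understood here and should be verified against the repository; `build_run_command` itself is a hypothetical helper, not part of LigandMPNN or this page's API.

```python
from typing import List, Optional, Sequence


def build_run_command(
    pdb_path: str,
    out_folder: str,
    temperature: float,
    batch_size: int = 1,
    chains_to_design: Optional[Sequence[str]] = None,
    fixed_residues: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return an argument list for the upstream script (flag names assumed)."""
    cmd = [
        "python", "run.py",
        "--model_type", "ligand_mpnn",
        "--pdb_path", pdb_path,
        "--out_folder", out_folder,
        "--temperature", str(temperature),
        "--batch_size", str(batch_size),
    ]
    if chains_to_design:
        # assumed format: chain letters joined by commas, e.g. "A,B"
        cmd += ["--chains_to_design", ",".join(chains_to_design)]
    if fixed_residues:
        # assumed format: chain letter plus residue number, space-separated
        cmd += ["--fixed_residues", " ".join(fixed_residues)]
    return cmd
```

The returned list can be passed to `subprocess.run` from a checkout of the repository; building a list rather than a single string avoids shell-quoting problems with the space-separated residue argument.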

Metrics:

Sequence recovery near small molecules: 63.3% (vs. 50.4% for Rosetta and 50.5% for ProteinMPNN)
Sequence recovery near nucleotides: 50.5% (vs. 35.2% for Rosetta and 34.0% for ProteinMPNN)
Sequence recovery near metals: 77.5% (vs. 36.0% for Rosetta and 40.6% for ProteinMPNN)
Side-chain chi1 angle recovery: 86.1% (vs. 76.0% for Rosetta)
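Sequence recovery is the percentage of positions at which the designed sequence reproduces the native amino acid. A minimal sketch of that calculation follows, with an optional list of positions so it can be restricted to residues near a ligand. The function name and the way positions are selected are assumptions for illustration, not the benchmark code behind the numbers above.

```python
from typing import Iterable, Optional


def sequence_recovery(
    native: str,
    designed: str,
    positions: Optional[Iterable[int]] = None,
) -> float:
    """Percentage of compared positions where designed matches native."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    # compare every position unless a subset (0-based indices) is given
    indices = list(range(len(native))) if positions is None else list(positions)
    if not indices:
        raise ValueError("no positions to compare")
    matches = sum(1 for i in indices if native[i] == designed[i])
    return 100.0 * matches / len(indices)
```

For example, comparing `"ACDEFG"` with `"ACDQFG"` over all six positions gives five matches, while restricting the comparison to positions 2 and 3 gives one match out of two.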

Citation:

Dauparas, J., Lee, G.R., Pecoraro, R., An, L., Anishchenko, I.V., Glasscock, C.J., & Baker, D. (2023). Atomic context-conditioned protein sequence design using LigandMPNN. bioRxiv.

Released:
Mar-10-2025
