LigandMPNN
LigandMPNN is a deep learning-based protein sequence design method that explicitly models the non-protein atomic context: small molecules, nucleotides, and metals. It improves native sequence recovery and side-chain conformation accuracy over existing methods such as Rosetta and ProteinMPNN (see Metrics). You can run LigandMPNN after RFdiffusion All-Atom to design sequences for the generated backbones and optimize interactions with ligands for better binding affinity and specificity.
Example use case:
Designing protein sequences that interact with specific small molecules, nucleotides, or metals to improve binding affinity and specificity for applications in drug discovery, biosensors, and enzyme engineering.
For ligand-aware tasks such as enzyme design or small-molecule binder design, LigandMPNN is significantly more accurate than ProteinMPNN.
Technology:
Graph neural networks (GNNs) based on ProteinMPNN, with additional encoding layers for ligand-protein interactions.
Limitations:
- Performance may be limited for compounds with rare or novel chemical elements not well-represented in the training data. Hybrid approaches with physics-based modeling may be needed for low-data regimes.
- Some parameters are fixed at their default values here; see the original LigandMPNN GitHub repository for the full set of options.
Metrics:
- Sequence recovery near small molecules: 63.3% (vs. 50.4% for Rosetta and 50.5% for ProteinMPNN)
- Sequence recovery near nucleotides: 50.5% (vs. 35.2% for Rosetta and 34.0% for ProteinMPNN)
- Sequence recovery near metals: 77.5% (vs. 36.0% for Rosetta and 40.6% for ProteinMPNN)
- Side-chain chi1 angle recovery: 86.1% (vs. 76.0% for Rosetta)
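The sequence recovery figures above are match rates between the designed and native sequences, counted only at residues close to the ligand. The following is a minimal sketch of that calculation, not the evaluation code used by the LigandMPNN authors: the function name `sequence_recovery`, the toy sequences, and the `near_ligand` mask are all hypothetical, and the sketch assumes the near-ligand residues have already been identified by some distance cutoff.

```python
def sequence_recovery(native, designed, mask=None):
    """Return the fraction of selected positions where designed == native.

    native, designed: equal-length amino acid strings (one-letter codes).
    mask: optional list of booleans; True marks positions to score,
          e.g. residues near a ligand. Scores all positions if omitted.
    """
    if len(native) != len(designed):
        raise ValueError("native and designed must have the same length")
    if mask is None:
        mask = [True] * len(native)
    if len(mask) != len(native):
        raise ValueError("mask must have one entry per residue")

    # Keep only the (native, designed) pairs at selected positions.
    selected = [(n, d) for n, d, keep in zip(native, designed, mask) if keep]
    if not selected:
        raise ValueError("mask selects no positions")

    matches = sum(n == d for n, d in selected)
    return matches / len(selected)


# Toy example (hypothetical sequences, not real design output).
native = "MKTAYIAK"
designed = "MKSAYLAK"
near_ligand = [False, False, True, True, True, True, False, False]

print(sequence_recovery(native, designed))               # 0.75 (6 of 8 match)
print(sequence_recovery(native, designed, near_ligand))  # 0.5  (2 of 4 match)
```

Restricting the score to a mask matters because overall recovery can hide poor performance at the binding site, as the toy example shows.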
Mar-10-2025