Structure prediction and design of proteins with noncanonical amino acids.
RareFold predicts single-chain protein structures containing rare noncanonical amino acids and enables the design of novel peptide binders through the EvoBindRare framework.
RareFold supports 49 amino acid types, the 20 canonical ones and the following 29 rare ones:
MSE, TPO, MLY, CME, PTR, SEP, SAH, CSO, PCA, KCX, CAS, CSD, MLZ, OCS, ALY, CSS, CSX, HIC, HYP, YCM, YOF, M3L, PFF, CGU, FTR, LLP, CAF, CMH, MHO
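As a quick sanity check before preparing an input, the list above can be turned into a small lookup. This is only a sketch: the helper name `is_supported_rare` is hypothetical and not part of RareFold, and it only checks the three-letter codes listed in this README.

```bash
# The 29 rare residue codes listed above (three-letter codes).
RARE_RESIDUES="MSE TPO MLY CME PTR SEP SAH CSO PCA KCX CAS CSD MLZ OCS ALY CSS CSX HIC HYP YCM YOF M3L PFF CGU FTR LLP CAF CMH MHO"

# Hypothetical helper (not part of RareFold): exit status 0 if the
# given three-letter code is in the list above, 1 otherwise.
is_supported_rare() {
  code="$1"
  for r in $RARE_RESIDUES; do
    [ "$r" = "$code" ] && return 0
  done
  return 1
}

# Example usage: check a few codes.
for code in SEP HYP XYZ; do
  if is_supported_rare "$code"; then
    echo "$code: supported rare residue"
  else
    echo "$code: not in the rare residue list"
  fi
done
```

The check is a plain linear scan, which is sufficient for a list of 29 codes.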
EvoBindRare designs both linear and cyclic binders directly from a protein target sequence; no prior knowledge of binding sites is required. The framework enables rapid and flexible design with expanded chemical diversity through the incorporation of noncanonical amino acids. EvoBindRare has been experimentally validated for both linear and cyclic designs, achieving high-affinity binding in each case.
RareFold is available under the Apache License, Version 2.0.
The RareFold parameters for prediction are made available under the terms of the CC BY 4.0 license.
The design protocol EvoBindRare and the parameters for design are made available under the terms of the CC BY-NC 4.0 license.
You may not use these files except in compliance with the licenses.
It is possible to run EvoBindRare online in Google Colab here
The entire installation takes <1 hour on a standard computer.
We assume you have CUDA 12. For CUDA 11, you will have to change the installation of some packages.
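To see which CUDA version your machine reports before installing, something like the following may help. This is a sketch, not part of the RareFold scripts: the helper `cuda_major` is hypothetical, and note that `nvidia-smi` reports the highest CUDA version the installed driver supports, which is not necessarily the version of any installed CUDA toolkit.

```bash
# Hypothetical helper (not part of RareFold): read nvidia-smi output on
# stdin and print the major CUDA version from the "CUDA Version: X.Y" field.
cuda_major() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

if command -v nvidia-smi >/dev/null 2>&1; then
  major="$(nvidia-smi | cuda_major)"
  echo "Driver supports up to CUDA ${major:-unknown}"
else
  echo "nvidia-smi not found; cannot check the CUDA version"
fi
```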
The runtime will depend on the GPU you have available and the size of the protein you are predicting.
On an NVIDIA A100 GPU, the prediction time is a few minutes on average.
First, install Miniconda; see: https://docs.conda.io/projects/miniconda/en/latest/miniconda-install.html or https://docs.conda.io/projects/miniconda/en/latest/miniconda-other-installer-links.html
bash install_dependencies.sh
This script will:
- Install the RareFold environment
- Get the RareFold parameters for single-chain structure prediction
- Get the EvoBindRare parameters for binder design
- Get Uniclust for MSA search
- Install HHblits for MSA search
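After the installation finishes, a quick check along these lines can confirm that the pieces are in place. It is a sketch under assumptions: the helper `have_tool` is hypothetical, the environment name `rarefold` is taken from the `conda activate rarefold` command in this README, and it assumes `hhblits` ends up on your `PATH`, which may not be the case if the install script places it inside the repository.

```bash
# Hypothetical helper (not part of RareFold): succeed if a command
# with the given name can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report which of the expected tools are visible on PATH.
for tool in conda hhblits; do
  if have_tool "$tool"; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done

# Look for a conda environment named "rarefold" in the environment list.
if have_tool conda; then
  if conda env list | grep -q '^rarefold[[:space:]]'; then
    echo "found:   conda env rarefold"
  else
    echo "missing: conda env rarefold"
  fi
fi
```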
Run the test case (a few minutes)
conda activate rarefold
bash predict.sh
EvoBindRare designs novel peptide binders based only on a protein target sequence. Specifying target residues within the protein sequence or the length of the binder is optional. EvoBindRare is the only protocol with experimentally verified design capacity for cyclic binders containing noncanonical amino acids.
The target structure is shown in green, with canonical peptide residues in blue and noncanonical residues in magenta.
For the original version of EvoBind with regular amino acids, see: https://github.com/patrickbryant1/EvoBind
Run the test case
conda activate rarefold
bash design.sh
If you use RareFold in your research, please cite
Li Q, Daumiller D, Bryant P. RareFold: Structure prediction and design of proteins with noncanonical amino acids. bioRxiv. 2025. p. 2025.05.19.654846. doi:10.1101/2025.05.19.654846
https://zenodo.org/records/15818951
Unfortunately, the entire training data with MSAs is too large to share through Zenodo. The Zenodo repository above mainly contains predictions and metrics.