SynLlama: Generating Synthesizable Molecules and Their Analogs with Large Language Models 🧬
📖 Overview
SynLlama is a fine-tuned version of Meta's Llama3 large language models that generates synthesizable analogs of small molecules by constructing full synthetic pathways from commonly accessible building blocks and robust organic reaction templates. It offers a valuable tool for drug discovery, with strong performance in bottom-up synthesis, synthesizable analog generation, and hit expansion.
💡 Usage
Prerequisites
Ensure you have conda installed on your system. All additional dependencies will be managed via the environment.yml file.
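Assuming the repository follows the standard conda workflow implied by the environment.yml file (the environment name shown here is hypothetical), setup would look something like:

```shell
# Clone the repository and enter it
git clone https://github.com/THGLab/SynLlama.git
cd SynLlama

# Create the conda environment from the provided file
# (the environment name is defined inside environment.yml)
conda env create -f environment.yml

# Activate the newly created environment before running any scripts
conda activate synllama
```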
Inference
To perform inference with the already trained SynLlama, download the trained models and relevant files from here, then follow the instructions in the Inference Guide.
Retraining
If you are interested in retraining the model, please refer to the Retraining Guide for detailed instructions.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
This project is built on top of the ChemProjector repo. We thank the authors for building such a user-friendly repository!
📝 Citation
If you use this code in your research, please cite:
@misc{sun_synllama_2025,
title = {SynLlama: Generating Synthesizable Molecules and Their Analogs with Large Language Models},
url = {https://arxiv.org/abs/2503.12602},
doi = {10.48550/arXiv.2503.12602},
publisher = {arXiv},
author = {Sun, Kunyang and Bagni, Dorian and Cavanagh, Joseph M. and Wang, Yingze and Sawyer, Jacob M. and Gritsevskiy, Andrew and Head-Gordon, Teresa},
month = mar,
year = {2025}
}