You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is the official repository of the paper accepted by COLM25: FormaRL: Enhancing Autoformalization with no Labeled Data.
Camera ready version of this paper and code implementation are coming soon.
Here is the training code used in our project.
Setup
To start training with FormaRL, we need to firstly setup both python and lean environments. We will need at least 2xA100-80G GPU to train a large language model with 7B parameters.
Python Environment
Before installing the dependencies you should make sure you have created a new virtual environment for this project with python 3.12. Then you can directly run the following commands to install required packages.
pip install -r requirements.txt
Our training loop is primarily powered by trl and accelerate. You may need to manually run accelerate config to enable more optimizations in distributed training.
Moreover, if you need to use a remote model for consistency check with a openai-compatible API, you will need to create a .env file at the root folder of this project with these required information: