This repository was archived by the owner on Oct 5, 2023. It is now read-only.
To play the game locally, it is recommended that you have an NVIDIA GPU with 12 GB or more of memory and CUDA installed. Without such a GPU, the game can take a couple of minutes or more to compose its response each turn. To install and play locally:
git clone --branch master https://github.com/AIDungeon/AIDungeon/
cd AIDungeon
./install.sh # Installs system packages and creates python3 virtual environment
./download_model.sh
source ./venv/bin/activate
./play.py
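The GPU recommendation above can be checked before installing. The following is a minimal sketch, not part of the repository: it assumes `nvidia-smi` is on the PATH, and the helper names and the 5% slack (reported totals are often a little below the marketed size) are illustrative choices.

```python
import subprocess

RECOMMENDED_GIB = 12  # recommendation stated above


def parse_memory_mib(smi_output):
    """Parse one MiB value per line, as printed by
    nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def meets_recommendation(memory_mib, slack=0.05):
    """True if a GPU has roughly RECOMMENDED_GIB of memory.

    The slack is an assumed tolerance, not a value from the project.
    """
    return memory_mib >= RECOMMENDED_GIB * 1024 * (1 - slack)


def query_gpus():
    """Return total memory (MiB) per GPU, or [] if nvidia-smi is unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_memory_mib(result.stdout)
```

Calling `any(meets_recommendation(m) for m in query_gpus())` returns `False` both when no GPU is found and when every GPU is too small, which is the case where slow turns should be expected.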
Finetune the model yourself
Formatting the data. After scraping the data, I formatted the text adventures into a JSON dict structure that looks like the following:
{
"tree_id": <someid>,
"story_start": <start text of the story>,
"action_results": [
{"action":<action1>, "result":<result1>, "action_results": <a nested list in the same form as the action_results above>},
{"action":<action2>, "result":<result2>, "action_results": <a nested list in the same form as the action_results above>}]
}
Essentially it's a tree that captures all the action result nodes.
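To make the tree shape concrete, here is a small sketch that loads one such structure and walks it depth first. The example story text is made up for illustration, `iter_nodes` is a hypothetical helper rather than a function from the repository, and the code assumes that each nested "action_results" is a list and that a leaf has an empty or missing list.

```python
import json

# Hypothetical example data in the structure described above.
EXAMPLE = json.loads("""
{
  "tree_id": "demo",
  "story_start": "You are a survivor living in some place...",
  "action_results": [
    {"action": "You search for food",
     "result": "You search for food but are unable to find any",
     "action_results": [
       {"action": "Do another thing",
        "result": "You do another thing...",
        "action_results": []}
     ]},
    {"action": "You look for shelter",
     "result": "You find an abandoned cabin",
     "action_results": []}
  ]
}
""")


def iter_nodes(action_results, depth=1):
    """Yield (depth, action, result) for every node, parents before children."""
    for node in action_results:
        yield depth, node["action"], node["result"]
        # Assumption: a missing "action_results" key means the node is a leaf.
        yield from iter_nodes(node.get("action_results", []), depth + 1)


for depth, action, result in iter_nodes(EXAMPLE["action_results"]):
    print("  " * depth + "> " + action + " -> " + result)
```

Each branch point in the tree is a place where the story could continue in more than one way from the same history.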
Then I used a script to transform that data into one giant txt file. The txt file looks something like:
<|startoftext|>
You are a survivor living in some place...
> You search for food
You search for food but are unable to find any
> Do another thing
You do another thing...
<|endoftext|>
(above repeated many times)
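The transformation from tree to text can be sketched as follows. This is not the author's script: it assumes that each root-to-leaf path becomes one story, that actions are stored without the leading "> ", and that nested "action_results" values are lists. The function names are illustrative.

```python
START = "<|startoftext|>"
END = "<|endoftext|>"


def story_paths(tree):
    """Yield every root-to-leaf path as a list of (action, result) pairs."""
    def walk(children, prefix):
        for node in children:
            path = prefix + [(node["action"], node["result"])]
            kids = node.get("action_results") or []
            if kids:
                yield from walk(kids, path)
            else:
                yield path  # leaf reached: one complete story

    yield from walk(tree.get("action_results") or [], [])


def format_story(story_start, path):
    """Render one story in the text layout shown above."""
    lines = [START, story_start]
    for action, result in path:
        lines.append("> " + action)
        lines.append(result)
    lines.append(END)
    return "\n".join(lines)


def tree_to_text(tree):
    """Render every root-to-leaf path of one tree, one story after another."""
    return "\n".join(
        format_story(tree["story_start"], path) for path in story_paths(tree)
    )
```

Writing the result of `tree_to_text` for every scraped tree into a single file gives the repeated blocks described above. A tree with many branches produces many stories that share their opening lines, which is a consequence of the one-story-per-path assumption.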
Once you have that, you can use the finetuning script to fine-tune the model, provided you have the hardware.
Fine-tuning the largest GPT-2 model is difficult because of the immense hardware required. I no longer have access to the same hardware, so there are two ways I would suggest doing it. I originally fine-tuned the model on eight 32 GB V100 GPUs (an NVIDIA DGX-1). This allowed me to use a batch size of 32, which I found helpful in improving quality. The only cloud resource I could find that matches those specs is an AWS p3dn.24xlarge instance, so you'd want to spin that up on EC2 and fine-tune it there (you might also have to request higher limits). Alternatively, you could use a SageMaker notebook (similar to a Colab notebook) and select the p3.16xlarge instance type, which is equivalent to eight 16 GB V100 GPUs. Because each GPU has only 16 GB of memory, you will probably need to reduce the batch size to around 8.
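As a rough illustration of the batch sizes mentioned above, the sketch below works out how many examples each GPU would hold per step. It assumes the batch is split evenly across the eight GPUs (plain data parallelism), which the text does not state, and `per_gpu_batch` is a hypothetical helper.

```python
def per_gpu_batch(global_batch, num_gpus):
    """Examples per GPU per step, assuming an even data-parallel split."""
    if global_batch % num_gpus != 0:
        raise ValueError("global batch must divide evenly across GPUs")
    return global_batch // num_gpus


# The two setups described above: (GPU memory in GB, global batch size).
for memory_gb, global_batch in [(32, 32), (16, 8)]:
    share = per_gpu_batch(global_batch, 8)
    print(f"8 x {memory_gb} GB V100, batch {global_batch}: {share} per GPU")
```

Under that assumption the 32 GB setup holds four examples per GPU and the 16 GB setup holds one, so the suggested batch size of 8 is about as small as an even split across eight GPUs can go.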
Community
AIDungeon is an open source project. Questions, discussion, and
contributions are welcome. Contributions can be anything from new
packages to bugfixes, documentation, or even new core features.
Contributing to AIDungeon is easy! Just send us a
pull request
from your fork. Before you send it, summarize your change in the
[Unreleased] section of the CHANGELOG and make sure
develop is the destination branch.
AIDungeon uses a rough approximation of the
Git Flow
branching model. The develop branch contains the latest
contributions, and master is always tagged and points to the latest
stable release.
If you're a contributor, make sure you're testing and playing on develop.
That's where all the magic is happening (and where we hope bugs stop).