Added pretrain.py, instruction_finetune.py, and training_utils.py to icae_v2/; these scripts can be used to train the ICAE.
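For reference, a distributed pretraining run might be launched roughly as follows. This is only a hedged sketch: torchrun is the standard PyTorch launcher, but the flags passed to pretrain.py below are hypothetical placeholders, so check the script for the arguments it actually expects.

# hypothetical launch; flag names are placeholders -- see pretrain.py for its real arguments
torchrun --nproc_per_node=8 pretrain.py \
    --model_name_or_path mistralai/Mistral-7B-v0.1 \
    --output_dir ./icae_pretrain_output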
Version 2 (March 2024)
Two new ICAE models based on Mistral-7B were released: a pretrained model and an instruction fine-tuned model. The inference code accompanying these models is also provided.
Compared with the V1 release, the Mistral-7B ICAE models add support for multi-span concatenation, as illustrated in Figure 6 of the paper, and extend the maximum input length to 5120 tokens.
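As a toy illustration of the idea (this is not the repo's API, and the shapes below are made up), each span is compressed into a fixed number of memory-slot embeddings, and the slots from all spans are concatenated before decoding:

import torch

# toy example: three spans, each compressed to a fixed number of memory slots (shapes are illustrative)
num_slots, hidden = 128, 4096
span_memories = [torch.randn(num_slots, hidden) for _ in range(3)]
joint_memory = torch.cat(span_memories, dim=0)  # (3 * num_slots, hidden), conditioned on by the decoder
print(joint_memory.shape)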
In the V2 release, I moved the dataset and models to my Hugging Face repo.
One can simply try the released ICAE models by running:
# transformers >= 4.36.2 is required
cd icae/code/icae_v2
# run the instruction fine-tuned ICAE model
bash fine_tuned_inference_script.sh
# Or run the pretrained ICAE model
bash pretrained_inference_script.sh
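If your environment does not yet satisfy the transformers requirement noted above, installing it is straightforward (a minimal sketch; other dependencies such as torch are assumed to be installed separately):

# install a transformers version that meets the stated requirement (>= 4.36.2)
pip install "transformers>=4.36.2"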
Version 1 (September 2023)
This is the original release of the In-context Autoencoder repository. It includes the PwC dataset, the code, and the fine-tuned ICAE model based on Llama-2-7b-chat used in the paper.
The V1 ICAE model (based on Llama-2-7b-chat, as used in the paper) can be downloaded from this link. Please use it with the code available in the code/icae_v1 directory.
Cite Us
If our work contributes to your research, please cite our paper:
@inproceedings{
ge2024incontext,
title={In-context Autoencoder for Context Compression in a Large Language Model},
author={Tao Ge and Hu Jing and Lei Wang and Xun Wang and Si-Qing Chen and Furu Wei},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=uREj4ZuGJE}
}