# Rebuilding ROME: Resolving Model Collapse during Sequential Model Editing
## Changes to the update equation
We focus on the way the MLP keys ($k$ such that $Wk = v$) are computed. See the `rome/compute_u.py` and `rome/compute_v.py` scripts for details.
The derived ROME update equation is

$$\hat{W} = W + \Lambda \, (C^{-1} k_*)^T, \qquad \Lambda = \frac{v_* - W k_*}{(C^{-1} k_*)^T k_*},$$

which uses a single key $k_*$ throughout. The original ROME implementation, however, mixes two differently computed keys in this equation: $k$, averaged over the prompt prepended with random prefix texts, and $k_*$, computed from the prompt alone. We find that the latter, mixed-key update leads to rapid degradation of model performance in the sequential editing setting, and makes the model prone to disabling edits: single edits that render the model unusable post-update. Our experiments unify the computation of the keys in the update equation, and we study the use of $k$ versus $k_*$.
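For intuition, here is a minimal NumPy sketch of the rank-one update, not the repository's implementation: the toy dimensions, random statistics, and the `rome_update` helper are illustrative assumptions. With homogeneous keys the edited weight satisfies $\hat{W} k_* = v_*$ exactly; passing two different keys for the two roles mimics the mixed-key variant studied here.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 8, 6                        # toy dimensions; real MLP layers are far larger

W = rng.normal(size=(d_out, d_in))        # MLP weight being edited (W k = v)
C = np.cov(rng.normal(size=(d_in, 100)))  # key second-moment statistic, C ~ E[k k^T]
C_inv = np.linalg.inv(C)

k_star = rng.normal(size=d_in)            # key representing the edited prompt
v_star = rng.normal(size=d_out)           # target value encoding the new fact

def rome_update(W, C_inv, k_num, k_dir, v_star):
    """Rank-one update W_hat = W + Lambda (C^-1 k_dir)^T.

    k_num: key used in the residual and denominator (v* - W k_num, u^T k_num).
    k_dir: key used for the update direction u = C^-1 k_dir.
    Homogeneous keys: pass the same vector for both. Passing different
    vectors mimics the mixed-key inconsistency discussed above.
    """
    u = C_inv @ k_dir                               # update direction
    Lam = (v_star - W @ k_num) / (u @ k_num)        # Lambda: residual over a scalar
    return W + np.outer(Lam, u)

W_hat = rome_update(W, C_inv, k_star, k_star, v_star)
print(np.allclose(W_hat @ k_star, v_star))          # True: the edit holds exactly
```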
## Installation
We recommend using Docker to set up a clean development environment:

```bash
docker compose up -d --build
```
To download the datasets used for evaluation, install Git LFS if needed:
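The standard Git LFS bootstrap command (a safe assumption for any Git LFS workflow) is:

```bash
git lfs install
```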
The evaluation script supports sequential editing via the `--sequential` flag. In this setting, the edited model is evaluated for downstream task performance on four GLUE tasks after every 20 edits; the interval can be changed in the codebase. You can evaluate either GPT2-XL or GPT-J-6B, using the appropriate hyperparameter file to configure how the update equation is computed. A sketch of an invocation follows.
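For illustration only, a sequential-editing run might look like the following, assuming the repository keeps ROME's `experiments.evaluate` entry point; the module path and the `--alg_name`, `--model_name`, and `--hparams_fname` flags are assumptions based on the original ROME codebase, and only `--sequential` is documented above:

```bash
# Hypothetical invocation; only --sequential is documented in this README.
python3 -m experiments.evaluate \
    --alg_name ROME \
    --model_name gpt2-xl \
    --hparams_fname gpt2-xl.json \
    --sequential
```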
## Citation

If you find our work useful, please cite it using the following:
```bibtex
@article{gupta2024rebuilding,
  title={Rebuilding ROME: Resolving Model Collapse during Sequential Model Editing},
  author={Gupta, Akshat and Anumanchipalli, Gopala},
  journal={arXiv preprint arXiv:2403.07175},
  year={2024}
}

@article{gupta2024model,
  title={Model Editing at Scale leads to Gradual and Catastrophic Forgetting},
  author={Gupta, Akshat and Rao, Anurag and Anumanchipalli, Gopala},
  journal={arXiv preprint arXiv:2401.07453},
  year={2024}
}
```