This is the official implementation of CART: Revisiting Weight Averaging for Model Merging.
Model merging aims to build a multi-task learner by combining the parameters of individually fine-tuned models without additional training. While a straightforward approach is to average model parameters across tasks, this often results in suboptimal performance due to interference among task-specific parameters. In this paper, we present intriguing results that weight averaging implicitly induces task vectors centered around the averaged weights, and that applying a low-rank approximation to these centered task vectors significantly improves merging performance. Our analysis shows that centering the task vectors effectively reduces task interference and that most of the task-specific knowledge is concentrated in the top singular vectors. Our method demonstrates robust and scalable performance on vision benchmarks across varying numbers of tasks and model sizes. Furthermore, our approach extends to natural language processing tasks with competitive performance.
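As a rough illustration of the idea described above (not the repository's actual implementation), here is a minimal PyTorch sketch: average the fine-tuned weights, center the task vectors at that average, keep only the top singular directions of each centered task vector, and add them back. The names `rank_ratio` and `alpha` are illustrative hyperparameters, not the values used in the paper.

```python
import torch

def cart_merge(finetuned_state_dicts, rank_ratio=0.1, alpha=1.0):
    """Sketch of weight averaging + low-rank centered task vectors (illustrative only)."""
    merged = {}
    for key in finetuned_state_dicts[0].keys():
        weights = [sd[key].float() for sd in finetuned_state_dicts]
        avg = torch.stack(weights).mean(dim=0)      # plain weight averaging
        merged[key] = avg.clone()
        if avg.dim() != 2:                          # apply the SVD step only to 2-D weight matrices
            continue
        k = max(1, int(rank_ratio * min(avg.shape)))
        delta = torch.zeros_like(avg)
        for w in weights:
            tau = w - avg                           # task vector centered at the average
            U, S, Vh = torch.linalg.svd(tau, full_matrices=False)
            delta += (U[:, :k] * S[:k]) @ Vh[:k, :]  # keep the top-k singular directions
        merged[key] = avg + alpha * delta / len(weights)
    return merged
```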
Multi-task performance on 8, 14, and 20 vision tasks with merged ViT-B/32 and ViT-L/14 models.
| Method | ViT-B/32 (8 tasks) | ViT-B/32 (14 tasks) | ViT-B/32 (20 tasks) | ViT-L/14 (8 tasks) | ViT-L/14 (14 tasks) | ViT-L/14 (20 tasks) |
|---|---|---|---|---|---|---|
| Pretrained | 48.0 | 57.1 | 56.0 | 65.0 | 68.4 | 65.4 |
| Individual | 90.5 | 89.6 | 90.5 | 94.2 | 93.3 | 94.1 |
| *Without Test-Time Adaptation* | | | | | | |
| Weight Averaging | 65.9 | 64.3 | 61.0 | 79.6 | 76.8 | 71.7 |
| Task Arithmetic | 69.1 | 65.4 | 60.5 | 84.5 | 79.6 | 74.2 |
| Ties-Merging | 72.4 | 65.2 | 62.9 | 86.1 | 79.5 | 75.8 |
| Consensus TA | 75.2 | 70.0 | 65.0 | 86.6 | 81.9 | 78.8 |
| CART (Ours) | 84.7 | 79.5 | 76.3 | 92.6 | 88.7 | 87.9 |
| *With Test-Time Adaptation* | | | | | | |
| AdaMerging + TA | 80.1 | 76.7 | 69.2 | 90.8 | 88.0 | 86.8 |
| AdaMerging + CART (Ours) | 85.8 | 82.3 | 82.7 | 93.1 | 90.4 | 91.3 |
Please install the required packages:

```bash
pip install -r requirements.txt
```

Please download the finetuned weights from JH-C-k/ViT-B-32 and place them in /workspace/weight/checkpoints.
Alternatively, you can place the finetuned weights in /workspace/weight/, following the structure from TALL-Masks.
- For 8 vision tasks, we used the finetuned models provided in Editing Models with Task Arithmetic.
- For the additional 12 tasks in the 14- and 20-task settings, we used the finetuned models from TALL-Masks.
- Therefore, the evaluation results may vary if you instead follow the instructions of TALL-Masks.
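For example, one way to fetch the checkpoints is with `huggingface_hub` (a sketch, not part of this repository; `snapshot_download` is standard Hugging Face Hub usage, and the local path simply mirrors the expected directory above):

```python
from huggingface_hub import snapshot_download

# Download the finetuned ViT-B/32 checkpoints into the directory the configs expect.
# Adjust local_dir if your workspace layout differs.
snapshot_download(
    repo_id="JH-C-k/ViT-B-32",
    local_dir="/workspace/weight/checkpoints",
)
```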
Please download the datasets from JH-C-k/Merge_dataset and place them in /workspace/data/.
Then, follow the instructions below to extract the datasets:

```bash
apt-get install pigz
cat /downloaded_path/* | pigz -p 32 -d -c | tar -xvf - -C /workspace/data/
```

Since the dataset is approximately 100GB for 20 vision tasks, it is recommended to use hf_transfer to download it. For detailed instructions, refer to the Hugging Face Hub guide.
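A minimal sketch of an hf_transfer-accelerated download via `huggingface_hub` (assumes `pip install hf_transfer`; the target directory reuses the `/downloaded_path` placeholder from the extraction command above):

```python
import os

# hf_transfer must be enabled before huggingface_hub is imported.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import snapshot_download

# Fetch the dataset archive parts, then extract them with the tar command above.
snapshot_download(
    repo_id="JH-C-k/Merge_dataset",
    repo_type="dataset",
    local_dir="/downloaded_path",
)
```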
Alternatively, you can place properly processed datasets in /workspace/data/, following the structure from TALL-Masks.
- Note: The data split may differ, as reported in this issue. Therefore, evaluation results may vary.
```bash
# 8 tasks with CLIP ViT-B/32
python src/merging.py --config_list_path ./configs/CART/ViT-b-32_8_CART.yaml

# 14 tasks with CLIP ViT-B/32
python src/merging.py --config_list_path ./configs/CART/ViT-b-32_14_CART.yaml

# 20 tasks with CLIP ViT-B/32
python src/merging.py --config_list_path ./configs/CART/ViT-b-32_20_CART.yaml
```

If you find this work useful, please cite:

```bibtex
@article{choi2024revisiting,
  title={Revisiting weight averaging for model merging},
  author={Choi, Jiho and Kim, Donggyun and Lee, Chanhyuk and Hong, Seunghoon},
  journal={arXiv preprint arXiv:2412.12153},
  year={2024}
}
```