To achieve lossless dataset distillation, an intuitive idea is to increase the size of the synthetic dataset.
However, previous dataset distillation methods tend to perform worse than random selection as IPC (images per class, i.e., the data keep ratio) increases.
To address this issue, we find that the difficulty of the patterns generated into the synthetic data should be aligned with the size of the synthetic dataset
(avoiding patterns that are too easy or too difficult).
By doing so, our method remains effective in high IPC cases and achieves lossless dataset distillation for the very first time.
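As an illustration of the difficulty-alignment idea: in trajectory matching, synthetic data are optimized to reproduce segments of a teacher network's training trajectory, and early-epoch segments encode easier patterns than late-epoch ones. The sketch below shows one hypothetical way to pick the matched segment's epoch window based on IPC (the window thresholds, epoch counts, and function names here are illustrative assumptions, not the paper's actual hyperparameters):

```python
import random


def difficulty_aligned_bounds(ipc, total_epochs=50):
    """Map IPC to a (hypothetical) teacher-epoch window to match against.

    Larger synthetic datasets can absorb harder, later-epoch patterns,
    so the window shifts toward later epochs as IPC grows.
    The thresholds below are illustrative, not the paper's settings.
    """
    if ipc <= 10:
        # Small synthetic set: match early (easy) trajectory segments.
        return 0, total_epochs // 4
    elif ipc <= 50:
        # Medium: mid-training segments.
        return total_epochs // 4, total_epochs // 2
    else:
        # Large: later (harder) segments, leaving room for the segment length.
        return total_epochs // 2, total_epochs - 5


def sample_match_segment(ipc, segment_len=2, total_epochs=50):
    """Sample a trajectory segment [t, t + segment_len] inside the window."""
    lo, hi = difficulty_aligned_bounds(ipc, total_epochs)
    t = random.randint(lo, hi)
    return t, t + segment_len
```

For example, `sample_match_segment(500)` would only draw segments from the second half of the teacher trajectory, so a large synthetic set is never spent reproducing patterns that random real samples already cover.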
What do easy patterns and hard patterns look like?
News
16 May. The implementation of DATM_with_TESLA has been merged. Thanks to Yue XU for the PR!
If you find our code useful for your research, please cite our paper.
@inproceedings{guo2024lossless,
  title={Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching},
  author={Ziyao Guo and Kai Wang and George Cazenavette and Hui Li and Kaipeng Zhang and Yang You},
  year={2024},
  booktitle={The Twelfth International Conference on Learning Representations}
}
About
ICLR 2024, Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching