Enhancing Small-Scale Dataset Expansion with Triplet-Connection-based Sample Re-Weighting

Xiang, Ting; Chen, Changjian; Tang, Zhuo; Zhang, Qifeng; Lyu, Fei; Yang, Li; Zhang, Jiapeng; Li, Kenli

Computer Science > Computer Vision and Pattern Recognition

arXiv:2508.07723 (cs)

[Submitted on 11 Aug 2025]

Abstract: The performance of computer vision models in certain real-world applications, such as medical diagnosis, is often limited by the scarcity of available images. Expanding datasets using pre-trained generative models is an effective solution. However, because the generation process is uncontrollable and natural language is ambiguous, noisy images may be generated. Re-weighting addresses this issue by assigning low weights to such noisy images. We first theoretically analyze three types of supervision for the generated images. Based on the theoretical analysis, we develop TriReWeight, a triplet-connection-based sample re-weighting method to enhance generative data augmentation. Theoretically, TriReWeight can be integrated with any generative data augmentation method and never degrades its performance. Moreover, its generalization approaches the optimal at a rate of order $O(\sqrt{d\ln (n)/n})$. Our experiments validate the correctness of the theoretical analysis and demonstrate that our method outperforms the existing SOTA methods by $7.9\%$ on average over six natural image datasets and by $3.4\%$ on average over three medical datasets. We also experimentally validate that our method can enhance the performance of different generative data augmentation methods.
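To make the idea of re-weighting concrete, the following is a minimal Python sketch of a weighted classification loss in which a suspected noisy generated image receives a low weight. It is not the TriReWeight algorithm: the abstract does not describe how the weights are obtained, so the weights here are set by hand, and the names `weighted_cross_entropy` and `bound_rate` are hypothetical. The second function only evaluates the expression $\sqrt{d\ln(n)/n}$ from the stated rate; the abstract does not define $d$ and $n$, so they are treated as plain positive numbers.

```python
import math


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-sample cross-entropy losses.

    probs   -- list of per-class probability lists, one per sample
    labels  -- list of integer class indices, one per sample
    weights -- list of non-negative sample weights (hand-set here as an
               assumption; a re-weighting method would learn them)
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    # Per-sample loss: negative log-probability of the labelled class,
    # clamped to avoid log(0).
    losses = [-math.log(max(p[y], 1e-12)) for p, y in zip(probs, labels)]
    return sum(w * l for w, l in zip(weights, losses)) / total_weight


def bound_rate(d, n):
    """Evaluate sqrt(d * ln(n) / n), the expression inside the O(.) rate."""
    return math.sqrt(d * math.log(n) / n)


# Three toy samples; the third is treated as a noisy generated image
# whose label disagrees with the model's prediction.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 1, 1]

uniform_loss = weighted_cross_entropy(probs, labels, [1.0, 1.0, 1.0])
reweighted_loss = weighted_cross_entropy(probs, labels, [1.0, 1.0, 0.1])
print(uniform_loss, reweighted_loss)
print(bound_rate(10, 1_000), bound_rate(10, 10_000))
```

Down-weighting the third sample reduces its contribution to the training objective, which is the general effect the abstract attributes to re-weighting. The second print line illustrates that the rate expression shrinks as $n$ grows for fixed $d$.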

Comments:	15 pages, 8 figures, published to ACM MM2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2508.07723 [cs.CV]
	(or arXiv:2508.07723v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2508.07723

Submission history

From: Ting Xiang
[v1] Mon, 11 Aug 2025 07:50:47 UTC (16,876 KB)
