Overview
The ICCV 2025 Workshop on Curated Data for Efficient Learning (CDEL) seeks to advance the understanding and development of data-centric techniques that improve the efficiency of training large-scale machine learning models. As model sizes continue to grow and data requirements scale accordingly, this workshop brings attention to the increasingly critical role of data quality, selection, and synthesis in achieving high model performance with reduced computational cost. Rather than focusing on ever-larger datasets and models, CDEL emphasizes the curation and distillation of high-value data—leveraging techniques such as dataset distillation, data pruning, synthetic data generation, and sampling optimization. These approaches aim to reduce redundancy, improve generalization, and enable learning in data-scarce regimes. The workshop will bring together researchers and practitioners from vision, language, and multimodal learning to share insights and foster collaborations around efficient, scalable, and sustainable data-driven machine learning.
Invited Speakers
Sara Beery
Massachusetts Institute of Technology
10:00 – 10:45 AM
Phillip Isola
Massachusetts Institute of Technology
3:45 – 4:30 PM
Schedule
- Date: October 20, 2025
- Time: 8:30 AM – 5:30 PM
- Location: Room 304-A
Call for Papers
Archival Submission Portal: OpenReview
Non-Archival Submission Portal: OpenReview
We welcome submissions on all topics related to the curation of training data.
Some potential topics include:
- Data Pruning: How can we eliminate redundant or low-quality samples from large datasets?
- Synthetic Data: How can we use generative models to create or augment datasets?
- Dataset Distillation: How can we learn tiny datasets of highly efficient synthetic samples?
- Obscure Domains: How can we train models in areas where existing data is extremely scarce?
- Future Directions: What problems in data-centric AI can we expect in the near future?
Submission Details:
We accept submissions of both long conference-style papers (8 pages) and short extended abstracts (4 pages).
Authors of accepted long papers may choose to have their work published in the ICCV workshop proceedings, provided the work does not violate dual-submission guidelines.
We also welcome work that is currently under review at, or was recently accepted to, other venues. Such submissions will not be published in the workshop proceedings but may still be presented at the workshop.
Please sign up here if you’d like to volunteer as a reviewer.
Deadlines
Archival:
- Submission deadline: July 7, 2025
- Notification: July 11, 2025
- Camera-ready: August 18, 2025
Non-Archival:
- Submission deadline: August 29, 2025
- Notification: September 12, 2025
- Camera-ready: September 19, 2025
Please contact George (gcaz@mit.edu) with any questions.
Organizers
Kai Wang
National University of Singapore
Zekai Li
National University of Singapore
Bo Zhao
Shanghai Jiao Tong University