REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing
NeurIPS 2025
Weihan Xu 1
Yimeng Ma 1
Jingyue Huang 3
Yang Li 1
Wenye Ma 5
Taylor Berg-Kirkpatrick 3
Julian McAuley 3
Paul Pu Liang 2
Hao-Wen Dong 4
1 Duke University 2 MIT 3 University of California San Diego 4 University of Michigan 5 MBZUAI
Contents:
1) Section 1: Summary of the Compared Models
2) Section 2: Dataset Annotation
3) Section 3: Qualitative Examples
4) Section 4: More Examples
5) Section 5: Zero-shot Examples
Section 1: Summary of the Compared Models
| Model | Narration Type | Script Generation | Retriever |
|---|---|---|---|
| A2Summ | Extraction | – | – |
| Extraction-then-smoothing (ETS) | Extraction | – | – |
| TeaserGen | Abstraction | – | – |
| GPT-4o-DQ | Extraction & Abstraction | Direct Quote | – |
| GPT-4o-SP-DQ | Extraction & Abstraction | Direct Quote (with speaker annotation) | – |
| GPT-4o-SP-TV | Extraction & Abstraction | Indirect Quote (with speaker annotation) | QuoteRetriever-TV |
| REGen-DQ | Extraction & Abstraction | Direct Quote | – |
| REGen-IDQ-T | Extraction & Abstraction | Indirect Quote | QuoteRetriever-T |
| REGen-IDQ-TV | Extraction & Abstraction | Indirect Quote | QuoteRetriever-TV |
Section 2: Dataset Annotation
We annotate the start time, end time, segment type (speaker or quotable interview), and transcribed text for both teasers and documentaries as follows:
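To make the annotated fields concrete, here is a minimal Python sketch of how one annotated segment could be represented. It is an illustration only, not the authors' annotation format: the class name `Segment`, the field names, the use of seconds as the time unit, and the placeholder values are all assumptions. Only the four fields themselves and the two segment types come from the description above.

```python
from dataclasses import dataclass

# The two segment types named in the text; the exact label strings are assumed.
SEGMENT_TYPES = ("speaker", "quotable interview")


@dataclass
class Segment:
    """One annotated span of a teaser or documentary (hypothetical schema)."""

    start: float       # start time; seconds are an assumption
    end: float         # end time; seconds are an assumption
    segment_type: str  # one of SEGMENT_TYPES
    text: str          # transcribed text for this span

    def __post_init__(self) -> None:
        # Basic sanity checks on the annotation.
        if self.end < self.start:
            raise ValueError("end time must not precede start time")
        if self.segment_type not in SEGMENT_TYPES:
            raise ValueError(f"unknown segment type: {self.segment_type!r}")

    @property
    def duration(self) -> float:
        """Length of the segment in the same unit as start and end."""
        return self.end - self.start


# Placeholder values, not taken from the dataset.
example = Segment(
    start=0.0,
    end=4.5,
    segment_type="speaker",
    text="(placeholder transcript)",
)
print(example.duration)
```

Validating the segment type at construction time keeps mislabeled rows from silently entering a teaser or documentary annotation file.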
Section 3: Qualitative Examples
Video Title: documenta 14 - learning from Athens | DW Documentary
| Input Video | REGen-IDQ-TV | REGen-IDQ-T | REGen-DQ |
Video Title: Apocalypse (Full Episode) | The Story of God with Morgan Freeman
| Input Video | REGen-IDQ-TV | REGen-IDQ-T | REGen-DQ |
Section 4: More Examples
In this section, we compare our models with TeaserGen, A2Summ, and GPT-based models.
Video Title: Is Parkinson's disease related to pesticide use?
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Video Title: documenta 14 - learning from Athens
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Video Title: The dirty business of beauty
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Video Title: Lost at Sea (Full Episode) Extreme Rescues
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Video Title: Saving kids from the Mafia in Italy
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Video Title: Doctors, apps and artificial intelligence - The future of medicine
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Video Title: The apostle comes from Africa — a contemporary passion story
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Video Title: Love & marriage in Egypt and Taiwan – Whose choice is it?
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Video Title: Archeology – exploring the past with modern technology
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Video Title: Beyond Death (Full Episode) The Story of God with Morgan Freeman
| TeaserGen | A2Summ | GPT-4o-SP-DQ |
| REGen-IDQ-TV | REGen-DQ | |
Section 5: Zero-shot Examples
Zero-shot examples on the teaser generation task for lecture videos and news videos.
News Videos
NBC Nightly News Full Episode
| Input Video | REGen-IDQ-TV | TeaserGen |
NBC Nightly News Full Episode
| Input Video | REGen-IDQ-TV | TeaserGen |
Lecture Videos
Machine Learning: Coordinated Representations (Multimodal Machine Learning)
| Input Video | REGen-IDQ-TV | TeaserGen |
Psychology: PSY101 Conditioning and Learning
| Input Video | REGen-IDQ-TV | TeaserGen |
Biology: CurrentTopicsLecture3Ch3
| Input Video | REGen-IDQ-TV | TeaserGen |
Biology: Endocrine System - Pituitary Gland THE MASTER GLAND
| Input Video | REGen-IDQ-TV | TeaserGen |
Dentistry: Oral Medicine | ASA Classification | INBDE
| Input Video | REGen-IDQ-TV | TeaserGen |