¹Snap Research, ²Rice University, ³University of Oxford
📅 Release Schedule
| Status | Timeline | Milestone |
|--------|----------|-----------|
| ✅ | December 2025 | Final review completed |
| 🔄 | **TBD (soon)** | Initial release of EgoEditData and EgoEditBench |
📖 Overview
We propose a framework for real-time egocentric video editing. Our system is composed of three main components:
EgoEditData: A manually curated dataset of 100k video editing pairs focused on the egocentric setting. It features object substitution and removal under challenging hand occlusions, object interactions, and large egomotion (see the illustrative record sketch after this list).
EgoEdit: The first real-time autoregressive model for egocentric video editing. It runs on a single H100 GPU with 855 ms first-frame latency, enabling live augmented reality (AR) interactions.
EgoEditBench: A comprehensive benchmark for the evaluation of egocentric video editing systems.
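To make the data format concrete, here is a minimal sketch of how a single EgoEditData editing pair could be represented. The `EgoEditPair` type, field names, and file paths are illustrative assumptions, not the released schema; the actual format will be defined by the dataset release.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class EgoEditPair:
    """One hypothetical EgoEditData sample: a source clip, an edit
    instruction, and the corresponding edited target clip."""
    source_video: str   # path to the original egocentric clip
    edited_video: str   # path to the edited target clip
    instruction: str    # natural-language edit instruction
    edit_type: Literal["substitution", "removal"]  # the two edit types described above


# Illustrative record (all values are made up):
pair = EgoEditPair(
    source_video="clips/kitchen_0001_src.mp4",
    edited_video="clips/kitchen_0001_edit.mp4",
    instruction="Replace the mug in my hand with a red apple",
    edit_type="substitution",
)
```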
✨ Features
Real-Time Performance: Runs on a single H100 GPU with 855 ms first-frame latency, low enough for live use (see the streaming sketch after this list).
Challenging Scenarios: Handles complex egocentric video challenges such as hand occlusions, object interactions, and significant camera motion.
High Fidelity: Surpasses state-of-the-art models such as EditVerse in editing faithfulness (measured via VLM evaluation) and aligns better with human judgment.
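Below is a minimal sketch of what a live streaming edit loop might look like, assuming a frame-in, frame-out autoregressive interface. `EgoEditModelStub` and its `edit_frame` method are hypothetical stand-ins; the real EgoEdit API has not been released yet.

```python
import time

import cv2  # pip install opencv-python


class EgoEditModelStub:
    """Hypothetical stand-in for the unreleased EgoEdit model."""

    def edit_frame(self, frame, instruction):
        # A real autoregressive editor would condition on previously
        # edited frames and return the edited frame; this stub is a pass-through.
        return frame


model = EgoEditModelStub()
instruction = "Remove the phone from my hand"

cap = cv2.VideoCapture(0)  # live egocentric camera stream
t0 = time.perf_counter()
first_frame_emitted = False

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    edited = model.edit_frame(frame, instruction)
    if not first_frame_emitted:
        # First-frame latency: time from stream start to the first edited frame.
        print(f"first-frame latency: {(time.perf_counter() - t0) * 1e3:.0f} ms")
        first_frame_emitted = True
    cv2.imshow("EgoEdit live", edited)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

The point of the autoregressive design is that each edited frame can be emitted as soon as its input frame is processed, rather than waiting for the whole clip, which is what makes live AR interaction feasible.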
📚 Citing
If you find this repository useful, please consider giving it a star ⭐ and a citation.
```bibtex
@article{li2025egoedit,
  title={EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing},
  author={Li, Runjia and Haji-Ali, Moayed and Mirzaei, Ashkan and Wang, Chaoyang and Sahni, Arpit and Skorokhodov, Ivan and Siarohin, Aliaksandr and Jakab, Tomas and Han, Junlin and Tulyakov, Sergey and others},
  journal={arXiv preprint arXiv:2512.06065},
  year={2025}
}
```