Leaderboard

Comprehensive evaluation results of instruction-guided video editing methods on IVEBench.

The leaderboard will continue to be updated with newer instruction-guided video editing methods.

Leaderboard (Short Subset)

Detail columns are grouped as follows: Video Quality (Subject Consistency through VTSS), Instruction Compliance (Overall Semantic Consistency through Quantity Accuracy), Video Fidelity (Semantic Fidelity through Content Fidelity).

| # | Method | Total Score | Video Quality | Instruction Compliance | Video Fidelity | Subject Consistency | Background Consistency | Temporal Flickering | Motion Smoothness | VTSS | Overall Semantic Consistency | Phrase Semantic Consistency | Instruction Satisfaction | Quantity Accuracy | Semantic Fidelity | Motion Fidelity | Content Fidelity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Ditto | 0.667455 | 0.781191 | 0.49081 | 0.730363 | 0.962402 | 0.975734 | 0.96984 | 0.989527 | 0.038201 | 0.249228 | 0.243133 | 3.87 | 0.3 | 0.885714 | 0.789851 | 3.635 |
| 2 | InsV2V | 0.666815 | 0.795805 | 0.3861 | 0.818541 | 0.936759 | 0.958194 | 0.974885 | 0.974885 | 0.044564 | 0.241119 | 0.228607 | 3.0625 | 0.3 | 0.950594 | 0.855546 | 4.04875 |
| 3 | Lucy-Edit-Dev | 0.635298 | 0.820931 | 0.339448 | 0.745514 | 0.946271 | 0.964003 | 0.97851 | 0.990135 | 0.05082 | 0.238288 | 0.220275 | 2.8375 | 0.2 | 0.930882 | 0.677938 | 3.825 |
| 4 | VACE | 0.626087 | 0.798298 | 0.254199 | 0.825764 | 0.949597 | 0.965058 | 0.975686 | 0.975686 | 0.044513 | 0.23797 | 0.215348 | 2.1625 | 0.2 | 0.968574 | 0.885873 | 4.0325 |
| 5 | ICVE | 0.603252 | 0.712515 | 0.453532 | 0.643709 | 0.953842 | 0.97101 | 0.991063 | 0.995441 | 0.017079 | 0.228634 | 0.229431 | 3.6175 | 0.3 | 0.848825 | 0.457221 | 3.55 |
| 6 | Omni-Video | 0.586517 | 0.781993 | 0.43644 | 0.541117 | 0.963197 | 0.971408 | 0.976379 | 0.987218 | 0.038415 | 0.219778 | 0.228862 | 3.36 | 0.4 | 0.814609 | 0.507224 | 2.845 |
| 7 | AnyV2V | 0.57673 | 0.727088 | 0.417245 | 0.585857 | 0.892648 | 0.939805 | 0.972575 | 0.972575 | 0.026466 | 0.215903 | 0.236315 | 3.335 | 0.3 | 0.800876 | 0.815908 | 2.75 |
| 8 | StableV2V | 0.508923 | 0.691682 | 0.426694 | 0.408393 | 0.853475 | 0.919418 | 0.963825 | 0.963825 | 0.018734 | 0.197835 | 0.242328 | 3.56 | 0.2 | 0.700009 | 0.751333 | 1.7875 |
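
In the short-subset numbers, each Total Score appears to equal the unweighted mean of the three dimension scores (Video Quality, Instruction Compliance, Video Fidelity). The page does not state this formula, so the following is only a sanity check against the published values; the function name `mean_of_dimensions` and the dictionary layout are illustrative choices, not part of IVEBench.

```python
# Values copied from the short-subset leaderboard:
# (video quality, instruction compliance, video fidelity, reported total score)
SHORT_SUBSET = {
    "Ditto": (0.781191, 0.49081, 0.730363, 0.667455),
    "InsV2V": (0.795805, 0.3861, 0.818541, 0.666815),
    "Lucy-Edit-Dev": (0.820931, 0.339448, 0.745514, 0.635298),
    "VACE": (0.798298, 0.254199, 0.825764, 0.626087),
    "ICVE": (0.712515, 0.453532, 0.643709, 0.603252),
    "Omni-Video": (0.781993, 0.43644, 0.541117, 0.586517),
    "AnyV2V": (0.727088, 0.417245, 0.585857, 0.57673),
    "StableV2V": (0.691682, 0.426694, 0.408393, 0.508923),
}


def mean_of_dimensions(video_quality, instruction_compliance, video_fidelity):
    """Assumed aggregation: unweighted mean of the three dimension scores."""
    return (video_quality + instruction_compliance + video_fidelity) / 3


# Largest gap between the reported total and the recomputed mean.
max_abs_diff = max(
    abs(mean_of_dimensions(vq, ic, vf) - reported)
    for vq, ic, vf, reported in SHORT_SUBSET.values()
)

for method, (vq, ic, vf, reported) in SHORT_SUBSET.items():
    recomputed = mean_of_dimensions(vq, ic, vf)
    print(f"{method:14s} reported={reported:.6f} recomputed={recomputed:.6f}")
```

Under this assumption the recomputed values agree with the reported totals up to rounding in the sixth decimal place.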

Leaderboard (Long Subset)

Detail columns are grouped as follows: Video Quality (Subject Consistency through VTSS), Instruction Compliance (Overall Semantic Consistency through Quantity Accuracy), Video Fidelity (Semantic Fidelity through Content Fidelity).

| # | Method | Total Score | Video Quality | Instruction Compliance | Video Fidelity | Subject Consistency | Background Consistency | Temporal Flickering | Motion Smoothness | VTSS | Overall Semantic Consistency | Phrase Semantic Consistency | Instruction Satisfaction | Quantity Accuracy | Semantic Fidelity | Motion Fidelity | Content Fidelity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | InsV2V | 0.65715 | 0.802357 | 0.374118 | 0.794976 | 0.901442 | 0.944001 | 0.975373 | 0.975373 | 0.04835 | 0.240611 | 0.229095 | 3.1 | 0.2 | 0.952295 | 0.678833 | 4.125 |
| 2 | VACE | 0.616088 | 0.801204 | 0.267255 | 0.779804 | 0.916832 | 0.94867 | 0.959168 | 0.959168 | 0.048467 | 0.235583 | 0.215446 | 2.27 | 0.2 | 0.963778 | 0.883994 | 3.735 |
| 3 | AnyV2V | 0.55052 | 0.724021 | 0.355533 | 0.572005 | 0.836517 | 0.91594 | 0.969898 | 0.969898 | 0.028747 | 0.215527 | 0.228784 | 3.251852 | 0 | 0.796468 | 0.824666 | 2.651852 |
| 4 | StableV2V | 0.509333 | 0.693736 | 0.420937 | 0.413327 | 0.828293 | 0.905021 | 0.962683 | 0.962683 | 0.02092 | 0.204132 | 0.234756 | 3.44898 | 0.25 | 0.70401 | 0.773338 | 1.785714 |
| 5 | Lucy-Edit-Dev | 0.648858 | 0.821 | 0.315107 | 0.810466 | 0.910351 | 0.94848 | 0.980079 | 0.991183 | 0.052671 | 0.239159 | 0.217732 | 2.645 | 0.2 | 0.970202 | 0.73463 | 4.13 |
| 6 | Omni-Video | 0.570946 | 0.778392 | 0.424405 | 0.510041 | 0.94171 | 0.961136 | 0.968076 | 0.980258 | 0.039098 | 0.219006 | 0.229922 | 3.53 | 0.2 | 0.806642 | 0.551065 | 2.59 |
| 7 | ICVE | 0.587734 | 0.719584 | 0.403837 | 0.639781 | 0.953185 | 0.97228 | 0.99219 | 0.995999 | 0.019113 | 0.226036 | 0.228235 | 3.625 | 0 | 0.858698 | 0.483956 | 3.475 |
| 8 | Ditto | 0.659281 | 0.779508 | 0.478546 | 0.719789 | 0.964755 | 0.976063 | 0.970801 | 0.989857 | 0.037547 | 0.234154 | 0.243371 | 3.925 | 0.2 | 0.857041 | 0.728152 | 3.685 |

Note: Higher values indicate better performance for all metrics.
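
The long-subset rows are listed in an order that does not follow Total Score (for example, row 8 has a higher total than row 1). As a small sketch of how the listing could be re-ranked offline, the snippet below sorts the long-subset totals in descending order; the variable names are illustrative, and only the Total Score column is used.

```python
# Total scores copied from the long-subset leaderboard, in the listed row order.
LONG_SUBSET_TOTALS = [
    ("InsV2V", 0.65715),
    ("VACE", 0.616088),
    ("AnyV2V", 0.55052),
    ("StableV2V", 0.509333),
    ("Lucy-Edit-Dev", 0.648858),
    ("Omni-Video", 0.570946),
    ("ICVE", 0.587734),
    ("Ditto", 0.659281),
]

# Higher is better, so rank by total score in descending order.
ranked = sorted(LONG_SUBSET_TOTALS, key=lambda row: row[1], reverse=True)

for rank, (method, total) in enumerate(ranked, start=1):
    print(f"{rank} {method:14s} {total:.6f}")
```

Sorted this way, Ditto, InsV2V, and Lucy-Edit-Dev occupy the top three places on the long subset, and StableV2V is last.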

Benchmark

Data Pipeline

Data acquisition and processing pipeline of IVEBench. 1) A curation process that yields 600 high-quality, diverse videos. 2) A well-designed pipeline for generating comprehensive editing prompts.

Benchmark Statistics

Statistical distributions of IVEBench.

Benchmark Comparison

Attribute comparison with open-source video editing benchmarks. IVEBench offers distinct advantages across several key dimensions.

Experiments

Qualitative Visualization

Quantitative Visualization

IVEBench evaluation results of video editing models. We visualize the evaluation results of four IVE models on 12 IVEBench metrics, normalizing the results per dimension for clearer comparison.
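
The caption does not say which normalization is used. One common choice for this kind of per-metric comparison is min-max scaling, sketched below as an assumption rather than the authors' method. The four methods and the two metrics are an arbitrary selection from the short-subset leaderboard, chosen only for illustration.

```python
# Two metrics with very different scales, copied from the short-subset leaderboard.
# The choice of these four methods is illustrative only.
SUBJECT_CONSISTENCY = {
    "Ditto": 0.962402,
    "InsV2V": 0.936759,
    "VACE": 0.949597,
    "StableV2V": 0.853475,
}
INSTRUCTION_SATISFACTION = {
    "Ditto": 3.87,
    "InsV2V": 3.0625,
    "VACE": 2.1625,
    "StableV2V": 3.56,
}


def min_max_normalize(scores):
    """Assumed normalization: rescale one metric to [0, 1] across methods."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All methods tie on this metric; avoid division by zero.
        return {method: 0.0 for method in scores}
    return {method: (value - lo) / (hi - lo) for method, value in scores.items()}


normalized_sc = min_max_normalize(SUBJECT_CONSISTENCY)
normalized_is = min_max_normalize(INSTRUCTION_SATISFACTION)

for method in SUBJECT_CONSISTENCY:
    print(f"{method:10s} SC={normalized_sc[method]:.3f} IS={normalized_is[method]:.3f}")
```

After scaling, the best method on each metric maps to 1 and the worst to 0, so metrics on different scales (such as a 0 to 1 consistency score and a multi-point satisfaction score) can share one axis.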

BibTeX

@article{chen2025ivebench,
  title={IVEBench: Modern Benchmark Suite for Instruction-Guided Video Editing Assessment},
  author={Chen, Yinan and Zhang, Jiangning and Hu, Teng and Zeng, Yuxiang and Xue, Zhucun and He, Qingdong and Wang, Chengjie and Liu, Yong and Hu, Xiaobin and Yan, Shuicheng},
  journal={arXiv preprint arXiv:2510.11647},
  year={2025}
}