ReXTime is a benchmark designed to test AI models' temporal reasoning across video events, focusing on understanding cause-and-effect relationships between different video segments. It contains 921 validation samples and 2,143 test samples.
| Project Page | GitHub | 🤗 Huggingface Dataset | Leaderboard | Paper |
git clone https://github.com/ReXTime/ReXTime.git
cd ReXTime
git clone https://huggingface.co/datasets/ReXTime/ReXTime
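If you prefer not to clone the annotations with git, the Hugging Face `datasets` library can usually load a dataset repo directly. The sketch below is not part of the official pipeline; the split name "validation" and the printed fields are assumptions, so check the dataset card for the authoritative schema.

# Minimal sketch: load the ReXTime annotations with the Hugging Face `datasets` library.
# Split names and field names here are assumptions; see the dataset card for the real schema.
from datasets import load_dataset

rextime = load_dataset("ReXTime/ReXTime")   # downloads/loads the annotation files
print(rextime)                              # list the available splits
sample = rextime["validation"][0]           # assumed split name
print(sample)                               # inspect one annotation record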
- ActivityNet
Download the raw video data from the Download page on the official ActivityNet website. You need to fill in their request form to get 7-day access to download the videos from the drive folders. You can find the form here.
- QVHighlights
Download the raw video data from the link provided by Moment-DETR and extract the archive:
wget https://nlp.cs.unc.edu/data/jielei/qvh/qvhilights_videos.tar.gz
tar -xvzf qvhilights_videos.tar.gz
.
├── videos/                          # Path to the QVHighlights raw videos, can be anywhere.
│   ├── 9c_w8HU3hqc_210.0_360.0.mp4  # Video 1
│   └── efCSWDWjm6g_360.0_510.0.mp4  # Video 2
├── Anet_videos_15fps_short256/      # Path to the ActivityNet raw videos, can be anywhere.
│   ├── v_5R3h6lxne90.mp4            # Video 1
│   └── v_aQ-F9wr0HQ4.mp4            # Video 2
└── ReXTime/                         # Code repo
    ├── ReXTime/                     # Huggingface dataset repo
    ├── evaluation/                  # Evaluation code
    ├── demo/                        # Inference demo script
    ├── requirements.txt             # Packages for environment
    ...
conda create --name=rextime python=3.10 -y
conda activate rextime
pip install -r requirements.txt
We provide evaluation demos for both open-source and proprietary models. In the scripts below, modify the path to the dataset repo and the paths to the two raw-video directories. For proprietary model evaluation, you also need to fill in your API key.
Open-source MLLM demo:
python ./demo/inference.py \
--dataset_path ./ReXTime \
--anet_vid_dir ${Path to the ActivityNet video directory} \
--qvh_vid_dir ${Path to the QVHighlights video directory}
Proprietary MLLM demo:
OPENAI_API_KEY="sk-***********************************" python ./demo/request.py \
--dataset_path ./ReXTime \
--anet_vid_dir ${Path to the ActivityNet video directory} \
--qvh_vid_dir ${Path to the QVHighlights video directory}
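For reference, the proprietary-model flow roughly amounts to sampling frames from a video and sending them to the API together with the question. The sketch below only illustrates that flow and is not the contents of demo/request.py; the model name, frame count, and prompt are assumptions.

# Illustrative sketch of a proprietary-model request (NOT demo/request.py itself).
# Assumptions: OpenAI-style chat API, "gpt-4o" model, 8 uniformly sampled frames.
import base64, cv2
from openai import OpenAI  # requires OPENAI_API_KEY in the environment

def sample_frames_b64(video_path, num_frames=8):
    """Uniformly sample frames and return them as base64-encoded JPEGs."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for idx in range(0, total, max(total // num_frames, 1)):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            break
        ok, buf = cv2.imencode(".jpg", frame)
        if ok:
            frames.append(base64.b64encode(buf.tobytes()).decode("utf-8"))
    cap.release()
    return frames[:num_frames]

client = OpenAI()
frames = sample_frames_b64("videos/9c_w8HU3hqc_210.0_360.0.mp4")
content = [{"type": "text", "text": "Answer the multi-choice question and give the relevant time span."}]
content += [{"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{f}"}} for f in frames]
response = client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": content}])
print(response.choices[0].message.content)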
Below is an example of an output/submission file in .jsonl format. For the assessment of moment grounding, you only need to provide "qid" and "pred_relevant_windows". For the assessment of multi-choice VQA, you only need to provide "qid" and "ans". For the assessment of grounding VQA, you need to provide "qid", "pred_relevant_windows", and "ans"; the predicted answer should be conditioned on the predicted time span. A minimal writer sketch follows the example.
{"qid": "anet_val384", "pred_relevant_windows": [[0.0, 15.8304]], "ans": "A"}
{"qid": "qvh_val114", "pred_relevant_windows": [[0.0, 25.50]], "ans": "A"}
...
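A minimal sketch for producing such a submission file from your model's outputs; the `predictions` structure and the output file name are illustrative, not part of the official tooling.

# Minimal sketch: write predictions to a .jsonl submission file.
# `predictions` is a hypothetical structure that your own inference code would fill in.
import json

predictions = [
    {"qid": "anet_val384", "pred_relevant_windows": [[0.0, 15.8304]], "ans": "A"},
    {"qid": "qvh_val114", "pred_relevant_windows": [[0.0, 25.50]], "ans": "A"},
]

with open("submission.jsonl", "w") as f:
    for pred in predictions:
        f.write(json.dumps(pred) + "\n")  # one JSON object per line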
Modify the file paths in the following and run:
python ./evaluation/rextime_eval.py \
--submission_path ${submission_path} \
--gt_path ${gt_path} \
--save_path ${save_path}
We only provide the ground-truth file for the validation set, at 'data/rextime_val.jsonl'. To evaluate on the test set, please submit your prediction file to the ReXTime Leaderboard.
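Before submitting, a quick sanity check against the validation ground truth can catch malformed entries. This is a rough sketch, not part of the official evaluation; the file paths and the assumption that the ground-truth file also stores a "qid" field are ours.

# Rough sanity check for a submission file (illustrative only).
import json

def load_jsonl(path):
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

gt_qids = {item["qid"] for item in load_jsonl("ReXTime/data/rextime_val.jsonl")}  # assumed path/field
subs = load_jsonl("submission.jsonl")

missing = gt_qids - {s["qid"] for s in subs}
print(f"{len(subs)} predictions, {len(missing)} validation qids missing")
for s in subs:
    for start, end in s.get("pred_relevant_windows", []):
        assert 0.0 <= start <= end, f"bad window in {s['qid']}"  # windows must be ordered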
- The evaluation code is built on Moment-DETR.
- The inference code is built on Video-LLaVA.
The annotation files are released under the CC BY-NC-SA 4.0 license. All the code is under the MIT license; see LICENSE.
BibTeX:
@article{chen2024rextime,
title={ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos},
author={Chen, Jr-Jen and Liao, Yu-Chien and Lin, Hsi-Che and Yu, Yu-Chu and Chen, Yen-Chun and Wang, Yu-Chiang Frank},
journal={arXiv preprint arXiv:2406.19392},
year={2024}
}
