The code is tested with python=3.8 and torch=2.2.0+cu118.
git clone --recurse-submodules https://github.com/TimSong412/OmniTrackFast
cd OmniTrackFast
conda create -n omni python=3.8
conda activate omni
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
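To confirm that the CUDA build of PyTorch is active (an optional sanity check, not part of the official setup), you can run:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
On a machine with a working CUDA driver this should print a 2.2.x+cu118 version string followed by True.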
Training
Please refer to the preprocessing instructions for preparing input data for training. We also provide some processed data that you can download, unzip, and train on directly.
With processed input data, run the following command to start training:
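(The exact command is not reproduced here; a hypothetical invocation is sketched below, where the train.py entry point, the --config flag, and the config path are all assumptions rather than confirmed names, so please check the repository for the actual script and arguments.)
python train.py --config configs/default.yaml   # hypothetical entry point and flags; may differ in the repository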
You can view visualizations in TensorBoard by running tensorboard --logdir logs/.
By default, the script trains for 100k iterations, which takes less than 1 hour on a 2080Ti GPU.
Evaluation
Please download the benchmark annotations used by OmniMotion here and our checkpoints from here.
Please unzip the benchmark directory into the dataset directory as dataset/tapvid_XXXX/annotations.
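For reference, the resulting layout should look roughly like the following (tapvid_XXXX stands for the specific benchmark split you downloaded):
dataset/
└── tapvid_XXXX/
    └── annotations/   # unzipped benchmark annotations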
Run the following command to evaluate a checkpoint:
python run_eval.py
Implementation Details
Any video sequence shorter than 20 frames should be extended to at least 20 frames; otherwise there are more NVP blocks than the sequence can accommodate.
We use torch.jit and torch.autograd.Function in the non-linear NVP blocks to manually accelerate the forward and backward passes (a minimal sketch of this pattern is given at the end of this section). You may disable the JIT configuration if any runtime errors occur.
Generally, optimizing a video sequence of 100 frames consumes about 5GB of GPU memory on a 2080Ti GPU.
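As a rough illustration of the pattern mentioned above (TorchScript-compiled kernels wrapped in a custom torch.autograd.Function with a hand-derived backward pass), here is a minimal sketch. It is not the repository's actual NVP code: the toy affine coupling and all names are illustrative only.

from typing import Tuple
import torch

@torch.jit.script
def _coupling_forward(x: torch.Tensor, scale: torch.Tensor, shift: torch.Tensor) -> torch.Tensor:
    # Toy affine coupling: y = x * exp(scale) + shift
    return x * torch.exp(scale) + shift

@torch.jit.script
def _coupling_backward(grad_y: torch.Tensor, x: torch.Tensor,
                       scale: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    # Hand-derived gradients for the toy coupling above.
    grad_x = grad_y * torch.exp(scale)
    grad_scale = grad_y * x * torch.exp(scale)
    grad_shift = grad_y
    return grad_x, grad_scale, grad_shift

class CouplingFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, scale, shift):
        ctx.save_for_backward(x, scale)
        return _coupling_forward(x, scale, shift)

    @staticmethod
    def backward(ctx, grad_y):
        x, scale = ctx.saved_tensors
        return _coupling_backward(grad_y, x, scale)

Usage would be y = CouplingFn.apply(x, scale, shift); disabling the JIT in this sketch simply amounts to removing the torch.jit.script decorators.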
Citation
@article{song2024track,
title={Track Everything Everywhere Fast and Robustly},
author={Song, Yunzhou and Lei, Jiahui and Wang, Ziyun and Liu, Lingjie and Daniilidis, Kostas},
journal={arXiv preprint arXiv:2403.17931},
year={2024}
}
About
Official code for Track Everything Everywhere Fast and Robustly