We provide the first frame of each test example for inference. We also include audio clips for 5 of the test examples, which can be used to generate talking-lip videos with a human voice.
a) Download and decompress the TCD-TIMIT dataset, then put it in the `data` directory:
```shell
tar -xvf timit.tar
mv timit data/
```
b) Run the following scripts to pack the dataset for inference.
c) Download the pre-trained model of ParaLip on TCD-TIMIT.
Remember to put the pre-trained model in the `checkpoints/timit_2` directory.
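Assuming the checkpoint has been downloaded to the repository root, preparing the expected layout might look like the sketch below (the checkpoint file name is an assumption; use whatever the download actually provides):

```shell
# Create the checkpoint directory expected by the instructions above
mkdir -p checkpoints/timit_2

# Move the downloaded checkpoint into place; the file name here is an
# assumption -- substitute the actual name from the download.
# mv model_ckpt_steps_*.ckpt checkpoints/timit_2/

# Confirm the directory exists before running inference
ls -d checkpoints/timit_2
```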
Citation
@misc{liu2021paralip,
  doi       = {10.48550/arXiv.2107.06831},
  url       = {https://arxiv.org/abs/2107.06831},
  author    = {Liu, Jinglin and Zhu, Zhiying and Ren, Yi and Huang, Wencan and Huai, Baoxing and Yuan, Nicholas and Zhao, Zhou},
  keywords  = {Multimedia (cs.MM), Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title     = {Parallel and High-Fidelity Text-to-Lip Generation},
  publisher = {arXiv},
  year      = {2021},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
About
Official code for "Parallel and High-Fidelity Text-to-Lip Generation" (AAAI 2022).