Yunpeng Qu1,2 | Kun Yuan2 | Qizhi Xie1,2 | Ming Sun2 | Chao Zhou2 | Jian Wang1
1Tsinghua University, 2Kuaishou Technology.
Video Quality Assessment (VQA), which aims to predict the perceptual quality of videos, has attracted increasing attention. Due to factors like motion blur or specific distortions, the quality of different regions in a video varies. Recognizing the region-wise local quality within a video is beneficial for assessing global quality and can guide fine-grained enhancement or transcoding strategies. Because annotating region-wise quality is costly, relevant datasets lack ground-truth constraints, which further complicates the utilization of local perception. Inspired by the Human Visual System (HVS), which links global quality to the local texture of different regions and their visual saliency, we propose a Kaleidoscope Video Quality Assessment (KVQ) framework that effectively assesses both saliency and local texture, thereby facilitating the assessment of global quality. Our framework extracts visual saliency and allocates attention using Fusion-Window Attention (FWA), while incorporating a Local Perception Constraint (LPC) to mitigate the reliance of regional texture perception on neighboring areas. KVQ obtains significant improvements across multiple scenarios on five VQA benchmarks compared to SOTA methods. Furthermore, to assess local perception, we establish a new Local Perception Visual Quality (LPVQ) dataset with region-wise annotations. Experimental results demonstrate the capability of KVQ in perceiving local distortions.
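As a rough intuition for the saliency-guided idea above, the sketch below weights region-wise quality scores by softmax-normalized saliency to form a global score. This is only an illustration of the weighting intuition; the function, tensor names, and shapes are assumptions for this sketch, not the actual KVQ architecture.

```python
import torch

def saliency_weighted_quality(local_quality: torch.Tensor,
                              saliency: torch.Tensor) -> torch.Tensor:
    """Aggregate region-wise quality scores into a global score.

    local_quality: (B, N) predicted quality of N regions per video.
    saliency:      (B, N) unnormalized saliency logits for the same regions.
    """
    # Normalize saliency into attention weights over regions.
    weights = torch.softmax(saliency, dim=-1)      # (B, N)
    # Global quality = saliency-weighted sum of local qualities.
    return (weights * local_quality).sum(dim=-1)   # (B,)

# Toy usage: 2 videos, 49 regions (e.g., a 7x7 grid).
local_q = torch.rand(2, 49)
sal = torch.randn(2, 49)
print(saliency_weighted_quality(local_q, sal))
```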
```bash
# git clone this repository
git clone https://github.com/lero233/KVQ.git
cd KVQ

# create an environment with python >= 3.9
conda create -n kvq python=3.9
conda activate kvq
pip install -r requirements.txt
```
To validate the assessment of local perception, we present the first dataset with local quality annotations, named the Local Perception Visual Quality (LPVQ) dataset. LPVQ comprises 50 images meticulously collected from a typical short-form video platform, covering a wide range of scenes and quality factors to ensure representativeness. We evenly divide each image into a non-overlapping 7×7 grid of patches and assign each patch a subjective quality rating from 1 to 5 (in intervals of 0.5), with 14 expert visual researchers involved in the annotation.
The LPVQ images are saved in `LPVQ/` and their labels are saved in `labels/LPVQ.txt`. You can also get it from .
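As a minimal sketch of the 7×7 partitioning described above (the file name is a placeholder, and the exact preprocessing in this repository may differ):

```python
import numpy as np
from PIL import Image

def split_into_grid(image_path: str, grid: int = 7):
    """Split an image into a grid x grid list of non-overlapping patches."""
    img = np.array(Image.open(image_path).convert("RGB"))
    h, w, _ = img.shape
    ph, pw = h // grid, w // grid  # border pixels beyond the grid are dropped
    patches = []
    for i in range(grid):
        for j in range(grid):
            patches.append(img[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw])
    return patches  # 49 patches, each rated from 1 to 5 in steps of 0.5

# Example (hypothetical file name):
# patches = split_into_grid("LPVQ/example.jpg")
```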
- Download the corresponding datasets: LSVQ (GitHub), KoNViD-1k (official site), LIVE-VQC (official site).
- Change the dataset paths and label paths in `configs/test.yaml`.
- Our pretrained weights should be placed in `weights/KVQ.pth`, which you can get from .
```bash
python test.py --config configs/test.yaml
```
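VQA performance on these benchmarks is conventionally reported with SRCC and PLCC between predicted and ground-truth scores. The snippet below shows how these correlations are typically computed; it is a generic illustration, not part of `test.py`.

```python
from scipy.stats import pearsonr, spearmanr

def vqa_correlations(pred, gt):
    """Standard VQA metrics: Spearman (SRCC) and Pearson (PLCC) correlations."""
    srcc, _ = spearmanr(pred, gt)
    plcc, _ = pearsonr(pred, gt)
    return srcc, plcc

# Example with dummy scores:
print(vqa_correlations([3.1, 4.0, 2.2, 4.8], [3.0, 4.2, 2.5, 4.6]))
```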
- Download the corresponding datasets: LSVQ (GitHub), KoNViD-1k (official site), LIVE-VQC (official site).
- Change the training and testing dataset paths and label paths in `configs/kvq.yaml`.
- Our pretrained weights should be placed in `weights/KVQ.pth`, which you can get from .
- You can use the original Swin-T weights to initialize the model, or, for better results, we suggest pretraining KVQ on the Kinetics-400 dataset. The pretrained weights should be placed in `pretrained_weight/`.
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --config configs/kvq.yaml
```
You can modify the parameters in `configs/kvq.yaml` to suit your specific needs, such as `batch_size` and `learning_rate`.
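If you prefer to edit the config programmatically rather than by hand, the following PyYAML sketch shows one way to do it; the key names and their nesting are assumptions, so check `configs/kvq.yaml` for the actual structure.

```python
import yaml

# Load the training config shipped with this repository.
with open("configs/kvq.yaml") as f:
    cfg = yaml.safe_load(f)

# Hypothetical top-level keys; adapt to the real structure of the file.
cfg["batch_size"] = 8
cfg["learning_rate"] = 1e-4

with open("configs/kvq.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```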
If our work is useful for your research, please consider citing it and giving us a star ⭐:
```bibtex
@article{qu2025kvq,
  title={KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception},
  author={Qu, Yunpeng and Yuan, Kun and Xie, Qizhi and Sun, Ming and Zhou, Chao and Wang, Jian},
  journal={arXiv preprint arXiv:2503.10259},
  year={2025}
}
```
Please feel free to contact: qyp21@mails.tsinghua.edu.cn. I am very happy to communicate with you.
This project is based on FAST-VQA, and some code is borrowed from BiFormer. Thanks for their excellent work.