Code for 'Learning Generalized Spatial-Temporal Deep Feature Representation for No-Reference Video Quality Assessment'. The code is largely based on VSFA.
- Python 3.6.7
- PyTorch 1.6.0, CUDA V9.0.176, cuDNN 7.4.1
- Download the pre-extracted multi-scale VGG features of each dataset from BaiduYun (extraction code: gstv), then put the features in "./GSTVQA/TCSVT_Release/GVQA_Release/VGG16_mean_std_features/".
- Train:
  python ./GSTVQA/TCSVT_Release/GVQA_Release/GVQA_Cross/main.py --TrainIndex=1
  (TrainIndex=1: use CVD2014 as the source dataset; 2: LIVE-Qualcomm; 3: LIVE-VQC; 4: KoNViD-1k)
- Test:
  python ./GSTVQA/TCSVT_Release/GVQA_Release/GVQA_Cross/cross_test.py --TrainIndex=1
  (same TrainIndex-to-dataset mapping as for training)
- Models trained on each of the four source datasets above are provided in "./GSTVQA/TCSVT_Release/GVQA_Release/GVQA_Cross/models/".
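To run cross-dataset evaluation over all four source datasets in one go, the commands above can be scripted. A minimal sketch (the `cross_test_cmd` helper and the spelled-out dataset names are our assumptions, not part of the release; only the TrainIndex flag values come from the instructions above):

```python
# Hypothetical helper mirroring the --TrainIndex mapping described above.
# The training/test scripts parse the flag themselves; this is illustrative only.
DATASETS = {1: "CVD2014", 2: "LIVE-Qualcomm", 3: "LIVE-VQC", 4: "KoNViD-1k"}

def cross_test_cmd(train_index: int) -> str:
    """Build the cross_test.py command line for one source dataset (assumed helper)."""
    if train_index not in DATASETS:
        raise ValueError(f"TrainIndex must be in 1..4, got {train_index}")
    return ("python ./GSTVQA/TCSVT_Release/GVQA_Release/GVQA_Cross/cross_test.py "
            f"--TrainIndex={train_index}")

# Print the command for each source dataset; pass each string to a shell
# (or subprocess.run with shell=True) from the repository root to execute it.
for idx in sorted(DATASETS):
    print(f"# source dataset: {DATASETS[idx]}")
    print(cross_test_cmd(idx))
```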