Patch-VQ: 'Patching Up' the Video Quality Problem

No-reference (NR) perceptual video quality assessment (VQA) is a complex, unsolved, and important problem for social and streaming media applications. Efficient and accurate video quality predictors are needed to monitor and guide the processing of billions of shared, often imperfect, user-generated content (UGC) videos. Unfortunately, current NR models are limited in their prediction capabilities on real-world, "in-the-wild" UGC video data. To advance progress on this problem, we created the largest (by far) subjective video quality dataset, containing 39,000 real-world distorted videos and 117,000 space-time localized video patches ('v-patches'), and 5.5M human perceptual quality annotations. Using this, we created two unique NR-VQA models: (a) a local-to-global region-based NR VQA architecture (called PVQ) that learns to predict global video quality and achieves state-of-the-art performance on 3 UGC datasets, and (b) a first-of-its-kind space-time video quality mapping engine (called PVQ Mapper) that helps localize and visualize perceptual distortions in space and time. We will make the new database and prediction models available immediately following the review process.
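
The abstract does not spell out the PVQ network itself, but the local-to-global idea can be sketched as follows: a shared encoder scores each space-time 'v-patch' and the per-patch scores are pooled into a single global quality prediction (with the local scores doubling as a coarse space-time quality map, in the spirit of the PVQ Mapper). Everything in the sketch below (the class name LocalToGlobalVQA, layer sizes, and mean pooling) is an illustrative assumption, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class LocalToGlobalVQA(nn.Module):
    """Toy local-to-global quality predictor: encodes space-time patches
    ('v-patches') with a shared encoder, then pools per-patch scores
    into one global video-quality estimate. Hypothetical sketch only."""

    def __init__(self, feat_dim=128):
        super().__init__()
        # Shared 3D-conv encoder applied to every v-patch (placeholder depth).
        self.patch_encoder = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
            nn.Flatten(),
            nn.Linear(32, feat_dim),
            nn.ReLU(),
        )
        # Regression head mapping patch features to a quality score.
        self.quality_head = nn.Linear(feat_dim, 1)

    def forward(self, v_patches):
        # v_patches: (batch, num_patches, 3, T, H, W)
        b, n = v_patches.shape[:2]
        feats = self.patch_encoder(v_patches.flatten(0, 1))  # (b*n, feat_dim)
        feats = feats.view(b, n, -1)
        # Local per-patch scores: usable as a coarse space-time quality map.
        local_scores = self.quality_head(feats).squeeze(-1)  # (b, n)
        # Global score: simple average pooling over patch scores.
        global_score = local_scores.mean(dim=1)               # (b,)
        return global_score, local_scores


if __name__ == "__main__":
    model = LocalToGlobalVQA()
    dummy = torch.randn(2, 4, 3, 8, 64, 64)  # 2 videos, 4 v-patches each
    g, l = model(dummy)
    print(g.shape, l.shape)  # torch.Size([2]) torch.Size([2, 4])
```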

CVPR 2021

Datasets


Introduced in the Paper:

LIVE-FB LSVQ

Used in the Paper:

YFCC100M, YouTube-UGC, LIVE-VQC, KoNViD-1k

Results from the Paper


Ranked #10 on Video Quality Assessment on LIVE-FB LSVQ (using extra training data)

Task                     | Dataset      | Model | Metric | Value | Global Rank | Uses Extra Training Data
-------------------------|--------------|-------|--------|-------|-------------|-------------------------
Video Quality Assessment | KoNViD-1k    | PVQ   | PLCC   | 0.770 | #16         |
Video Quality Assessment | LIVE-FB LSVQ | PVQ   | PLCC   | 0.827 | #10         | Yes
Video Quality Assessment | LIVE-VQC     | PVQ   | PLCC   | 0.791 | #13         |
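
PLCC in the table above is the Pearson linear correlation coefficient between the model's predicted quality scores and the subjective mean opinion scores (MOS). Below is a minimal sketch of how PLCC is typically computed; the score values are made up for illustration, not results from the paper, and in practice a nonlinear logistic fit is often applied to the predictions before correlation is measured.

```python
from scipy.stats import pearsonr

# Hypothetical predicted quality scores and ground-truth MOS values.
predicted = [62.1, 48.3, 71.0, 55.6, 80.2]
mos       = [60.0, 50.5, 69.8, 58.1, 78.9]

plcc, _ = pearsonr(predicted, mos)
print(f"PLCC: {plcc:.3f}")
```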

Methods


No methods listed for this paper.