Search Results for author: Guangtao Zhai

Found 114 papers, 55 papers with code

Deepfake Generation and Detection: A Benchmark and Survey

1 code implementation 26 Mar 2024 Gan Pei, Jiangning Zhang, Menghan Hu, Guangtao Zhai, Chengjie Wang, Zhenyu Zhang, Jian Yang, Chunhua Shen, DaCheng Tao

In addition to the advancements in deepfake generation, corresponding detection technologies need to continuously evolve to regulate the potential misuse of deepfakes, such as for privacy invasion and phishing attacks.

Attribute Face Reenactment +2

Comparison of No-Reference Image Quality Models via MAP Estimation in Diffusion Latents

no code implementations 11 Mar 2024 Weixia Zhang, Dingquan Li, Guangtao Zhai, Xiaokang Yang, Kede Ma

Contemporary no-reference image quality assessment (NR-IQA) models can effectively quantify the perceived image quality, with high correlations between model predictions and human perceptual scores on fixed test sets.

Image Enhancement No-Reference Image Quality Assessment +1

Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation

1 code implementation 11 Mar 2024 Guangyang Wu, Xiaohong Liu, Jun Jia, Xuehao Cui, Guangtao Zhai

This approach harnesses the potent generation capabilities of stable-diffusion models, navigating the trade-off between image aesthetics and QR code scannability.

Code Generation

MISC: Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model

1 code implementation 26 Feb 2024 Chunyi Li, Guo Lu, Donghui Feng, HaoNing Wu, ZiCheng Zhang, Xiaohong Liu, Guangtao Zhai, Weisi Lin, Wenjun Zhang

With the evolution of storage and communication protocols, ultra-low bitrate image compression has become a highly demanding topic.

Image Compression

Resolution-Agnostic Neural Compression for High-Fidelity Portrait Video Conferencing via Implicit Radiance Fields

no code implementations 26 Feb 2024 Yifei Li, Xiaohong Liu, Yicong Peng, Guangtao Zhai, Jun Zhou

In this paper, we propose a novel low bandwidth neural compression approach for high-fidelity portrait video conferencing using implicit radiance fields to achieve both major objectives.

Video Compression

Towards Open-ended Visual Quality Comparison

no code implementations 26 Feb 2024 HaoNing Wu, Hanwei Zhu, ZiCheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Chunyi Li, Annan Wang, Wenxiu Sun, Qiong Yan, Xiaohong Liu, Guangtao Zhai, Shiqi Wang, Weisi Lin

Comparative settings (e.g., pairwise choice, listwise ranking) have been adopted by a wide range of subjective studies for image quality assessment (IQA), as they inherently standardize the evaluation criteria across different observers and offer more clear-cut responses.

Image Quality Assessment

A Benchmark for Multi-modal Foundation Models on Low-level Vision: from Single Images to Pairs

1 code implementation 11 Feb 2024 ZiCheng Zhang, HaoNing Wu, Erli Zhang, Guangtao Zhai, Weisi Lin

To this end, we design benchmark settings to emulate human language responses related to low-level vision: low-level visual perception (A1), via visual question answering related to low-level attributes (e.g., clarity, lighting); and low-level visual description (A2), which evaluates MLLMs on low-level text descriptions.

Image Quality Assessment Question Answering +1

Perceptual Video Quality Assessment: A Survey

no code implementations 5 Feb 2024 Xiongkuo Min, Huiyu Duan, Wei Sun, Yucheng Zhu, Guangtao Zhai

Perceptual video quality assessment plays a vital role in the field of video processing due to the existence of quality degradations introduced in various stages of video signal acquisition, compression, transmission and display.

Video Quality Assessment

Few-Shot Class-Incremental Learning with Prior Knowledge

1 code implementation 2 Feb 2024 Wenhao Jiang, Duo Li, Menghan Hu, Guangtao Zhai, Xiaokang Yang, Xiao-Ping Zhang

To tackle the issues of catastrophic forgetting and overfitting in few-shot class-incremental learning (FSCIL), previous work has primarily concentrated on preserving the memory of old knowledge during the incremental phase.

Few-Shot Class-Incremental Learning Incremental Learning

Uncertainty-aware Sampling for Long-tailed Semi-supervised Learning

1 code implementation 9 Jan 2024 Kuo Yang, Duo Li, Menghan Hu, Guangtao Zhai, Xiaokang Yang, Xiao-Ping Zhang

This approach allows the model to perceive the uncertainty of pseudo-labels at different training stages, thereby adaptively adjusting the selection thresholds for different classes.

Pseudo Label

AttentionLut: Attention Fusion-based Canonical Polyadic LUT for Real-time Image Enhancement

no code implementations 3 Jan 2024 Kang Fu, Yicong Peng, ZiCheng Zhang, Qihang Xu, Xiaohong Liu, Jia Wang, Guangtao Zhai

Subsequently, the attention fusion module integrates the image feature with the priori attention feature obtained during training to generate image-adaptive canonical polyadic tensors.

Image Enhancement

Q-Refine: A Perceptual Quality Refiner for AI-Generated Image

no code implementations 2 Jan 2024 Chunyi Li, HaoNing Wu, ZiCheng Zhang, Hongkun Hao, Kaiwei Zhang, Lei Bai, Xiaohong Liu, Xiongkuo Min, Weisi Lin, Guangtao Zhai

With the rapid evolution of Text-to-Image (T2I) models in recent years, their unsatisfactory generation results have become a challenge.

Image Quality Assessment

Perceptual Quality Assessment for Video Frame Interpolation

no code implementations 25 Dec 2023 Jinliang Han, Xiongkuo Min, Yixuan Gao, Jun Jia, Lei Sun, Zuowei Cao, Yonglin Luo, Guangtao Zhai

To evaluate the quality of VFI frames without reference videos, a no-reference perceptual quality assessment method is proposed in this paper.

Image Quality Assessment Video Frame Interpolation

Exploring the Naturalness of AI-Generated Images

1 code implementation 9 Dec 2023 Zijian Chen, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhongpeng Ji, Fengyu Sun, Shangling Jui, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang

In this paper, we take the first step to benchmark and assess the visual naturalness of AI-generated images.

SingingHead: A Large-scale 4D Dataset for Singing Head Animation

no code implementations 7 Dec 2023 Sijing Wu, Yunhao Li, Weitian Zhang, Jun Jia, Yucheng Zhu, Yichao Yan, Guangtao Zhai

Extensive comparative experiments with both SOTA 3D facial animation and 2D portrait animation methods demonstrate the necessity of singing-specific datasets in singing head animation tasks and the promising performance of our unified facial animation framework.

FS-BAND: A Frequency-Sensitive Banding Detector

no code implementations 30 Nov 2023 Zijian Chen, Wei Sun, ZiCheng Zhang, Ru Huang, Fangfang Lu, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang

The banding artifact, also known as the staircase-like contour, is a common quality annoyance that arises in compression, transmission, etc.

Image Quality Assessment

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

1 code implementation 12 Nov 2023 HaoNing Wu, ZiCheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Annan Wang, Kaixin Xu, Chunyi Li, Jingwen Hou, Guangtao Zhai, Geng Xue, Wenxiu Sun, Qiong Yan, Weisi Lin

Multi-modality foundation models, as represented by GPT-4V, have brought a new paradigm for low-level visual perception and understanding tasks, in which a single model can respond to a broad range of natural human instructions.

Audio-visual Saliency for Omnidirectional Videos

no code implementations 9 Nov 2023 Yuxin Zhu, Xilei Zhu, Huiyu Duan, Jie Li, Kaiwei Zhang, Yucheng Zhu, Li Chen, Xiongkuo Min, Guangtao Zhai

Visual saliency prediction for omnidirectional videos (ODVs) is of great significance and necessity for ODV coding, ODV transmission, ODV rendering, etc.

Saliency Prediction

A No-Reference Quality Assessment Method for Digital Human Head

no code implementations 25 Oct 2023 Yingjie Zhou, ZiCheng Zhang, Wei Sun, Xiongkuo Min, Xianghe Ma, Guangtao Zhai

In this paper, we develop a novel no-reference (NR) method based on Transformer to deal with DHQA in a multi-task manner.

Geometry-Aware Video Quality Assessment for Dynamic Digital Human

no code implementations 24 Oct 2023 ZiCheng Zhang, Yingjie Zhou, Wei Sun, Xiongkuo Min, Guangtao Zhai

Usually, DDHs are displayed as 2D rendered animation videos and it is natural to adapt video quality assessment (VQA) methods to DDH quality assessment (DDH-QA) tasks.

Attribute Video Quality Assessment +1

Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision

1 code implementation 25 Sep 2023 HaoNing Wu, ZiCheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Annan Wang, Chunyi Li, Wenxiu Sun, Qiong Yan, Guangtao Zhai, Weisi Lin

To address this gap, we present Q-Bench, a holistic benchmark crafted to systematically evaluate the potential abilities of MLLMs in three realms: low-level visual perception, low-level visual description, and overall visual quality assessment.

Image Quality Assessment

Joint Gaze-Location and Gaze-Object Detection

no code implementations 26 Aug 2023 Danyang Tu, Wei Shen, Wei Sun, Xiongkuo Min, Guangtao Zhai

In contrast, we reframe the gaze following detection task as detecting human head locations and their gaze followings simultaneously, aiming to jointly detect human gaze location and gaze object in a unified, single-stage pipeline.

Object object-detection +1

AccFlow: Backward Accumulation for Long-Range Optical Flow

1 code implementation ICCV 2023 Guangyang Wu, Xiaohong Liu, Kunming Luo, Xi Liu, Qingqing Zheng, Shuaicheng Liu, Xinyang Jiang, Guangtao Zhai, Wenyi Wang

To train and evaluate the proposed AccFlow, we have constructed a large-scale high-quality dataset named CVO, which provides ground-truth optical flow labels between adjacent and distant frames.

Optical Flow Estimation

Agglomerative Transformer for Human-Object Interaction Detection

no code implementations ICCV 2023 Danyang Tu, Wei Sun, Guangtao Zhai, Wei Shen

We propose an agglomerative Transformer (AGER) that enables Transformer-based human-object interaction (HOI) detectors to flexibly exploit extra instance-level cues in a single-stage and end-to-end manner for the first time.

Clustering Human-Object Interaction Detection +1

StableVQA: A Deep No-Reference Quality Assessment Model for Video Stability

1 code implementation 9 Aug 2023 Tengchuan Kou, Xiaohong Liu, Wei Sun, Jun Jia, Xiongkuo Min, Guangtao Zhai, Ning Liu

Indeed, most existing quality assessment models evaluate video quality as a whole without specifically taking the subjective experience of video stability into consideration.

Video Quality Assessment Video Stabilization +1

RAWIW: RAW Image Watermarking Robust to ISP Pipeline

no code implementations 28 Jul 2023 Kang Fu, Xiaohong Liu, Jun Jia, ZiCheng Zhang, Yicong Peng, Jia Wang, Guangtao Zhai

To achieve end-to-end training of the framework, we integrate a neural network that simulates the ISP pipeline to handle the RAW-to-RGB conversion process.

Analysis of Video Quality Datasets via Design of Minimalistic Video Quality Models

no code implementations 26 Jul 2023 Wei Sun, Wen Wen, Xiongkuo Min, Long Lan, Guangtao Zhai, Kede Ma

By minimalistic, we restrict our family of BVQA models to build only upon basic blocks: a video preprocessor (for aggressive spatiotemporal downsampling), a spatial quality analyzer, an optional temporal quality analyzer, and a quality regressor, all with the simplest possible instantiations.

Blind Image Quality Assessment Video Quality Assessment +1

Perceptual Quality Assessment of Omnidirectional Audio-visual Signals

1 code implementation 20 Jul 2023 Xilei Zhu, Huiyu Duan, Yuqin Cao, Yuxin Zhu, Yucheng Zhu, Jing Liu, Li Chen, Xiongkuo Min, Guangtao Zhai

Omnidirectional videos (ODVs) play an increasingly important role in application fields such as medicine, education, advertising, and tourism.

Blind Image Quality Assessment: A Fuzzy Neural Network for Opinion Score Distribution Prediction

1 code implementation IEEE Transactions on Circuits and Systems for Video Technology 2023 Yixuan Gao, Xiongkuo Min, Yucheng Zhu, Xiao-Ping Zhang, Guangtao Zhai

On the other hand, we also prove the feasibility of the proposed method in predicting the MOS of image quality on several popular IQA databases, including CSIQ, TID2013, LIVE MD, and LIVE Challenge.

Blind Image Quality Assessment

Subjective and Objective Audio-Visual Quality Assessment for User Generated Content

1 code implementation IEEE Transactions on Image Processing 2023 Yuqin Cao, Xiongkuo Min, Wei Sun, Guangtao Zhai

Then, to facilitate the development of the AVQA field, we construct a benchmark of AVQA models on the proposed SJTU-UAV database and two other AVQA databases. The benchmark models consist of AVQA models designed for synthetically distorted A/V sequences and AVQA models built by combining popular VQA methods and audio features via a support vector regressor (SVR).

Video Quality Assessment Visual Question Answering (VQA)

Advancing Zero-Shot Digital Human Quality Assessment through Text-Prompted Evaluation

1 code implementation 6 Jul 2023 ZiCheng Zhang, Wei Sun, Yingjie Zhou, HaoNing Wu, Chunyi Li, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai, Weisi Lin

To address this gap, we propose SJTU-H3D, a subjective quality assessment database specifically designed for full-body digital humans.

AIGCIQA2023: A Large-scale Image Quality Assessment Database for AI Generated Images: from the Perspectives of Quality, Authenticity and Correspondence

1 code implementation 1 Jul 2023 Jiarui Wang, Huiyu Duan, Jing Liu, Shi Chen, Xiongkuo Min, Guangtao Zhai

In this paper, to better understand human visual preferences for AIGIs, we establish a large-scale IQA database for AIGC, named AIGCIQA2023.

Image Quality Assessment Text-to-Image Generation

GMS-3DQA: Projection-based Grid Mini-patch Sampling for 3D Model Quality Assessment

1 code implementation 9 Jun 2023 ZiCheng Zhang, Wei Sun, Houning Wu, Yingjie Zhou, Chunyi Li, Xiongkuo Min, Guangtao Zhai, Weisi Lin

Model-based 3DQA methods extract features directly from the 3D models, which are characterized by their high degree of complexity.

Point Cloud Quality Assessment

AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment

1 code implementation 7 Jun 2023 Chunyi Li, ZiCheng Zhang, HaoNing Wu, Wei Sun, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai, Weisi Lin

With the rapid advancement of text-to-image generative models, AI-generated images (AGIs) have been widely applied to entertainment, education, social media, etc.

Image Quality Assessment

Light-VQA: A Multi-Dimensional Quality Assessment Model for Low-Light Video Enhancement

1 code implementation 16 May 2023 Yunlong Dong, Xiaohong Liu, Yixuan Gao, Xunchu Zhou, Tao Tan, Guangtao Zhai

To this end, we first construct a Low-Light Video Enhancement Quality Assessment (LLVE-QA) dataset in which 254 original low-light videos are collected and then enhanced by leveraging 8 LLVE algorithms to obtain 2,060 videos in total.

Video Enhancement Video Quality Assessment +1

GANHead: Towards Generative Animatable Neural Head Avatars

no code implementations CVPR 2023 Sijing Wu, Yichao Yan, Yunhao Li, Yuhao Cheng, Wenhan Zhu, Ke Gao, Xiaobo Li, Guangtao Zhai

To bring digital avatars into people's lives, there is high demand for efficiently generating complete, realistic, and animatable head avatars.

Masked Autoencoders as Image Processors

1 code implementation 30 Mar 2023 Huiyu Duan, Wei Shen, Xiongkuo Min, Danyang Tu, Long Teng, Jia Wang, Guangtao Zhai

Recently, masked autoencoders (MAE) for feature pre-training have further unleashed the potential of Transformers, leading to state-of-the-art performances on various high-level vision tasks.

Deblurring Image Defocus Deblurring +2

MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos

1 code implementation CVPR 2023 ZiCheng Zhang, Wei Wu, Wei Sun, Dangyang Tu, Wei Lu, Xiongkuo Min, Ying Chen, Guangtao Zhai

User-generated content (UGC) live videos often suffer from various distortions during capture and thus exhibit diverse visual qualities.

Video Quality Assessment Visual Question Answering (VQA)

A Perceptual Quality Assessment Exploration for AIGC Images

1 code implementation 22 Mar 2023 ZiCheng Zhang, Chunyi Li, Wei Sun, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai

AI Generated Content (AIGC) has gained widespread attention with the increasing efficiency of deep learning in content creation.

Image Quality Assessment

VDPVE: VQA Dataset for Perceptual Video Enhancement

1 code implementation 16 Mar 2023 Yixuan Gao, Yuqin Cao, Tengchuan Kou, Wei Sun, Yunlong Dong, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai

Few researchers have specifically proposed a video quality assessment method for video enhancement, and no comprehensive video quality assessment dataset is publicly available.

Deblurring valid +3

Subjective and Objective Quality Assessment for in-the-Wild Computer Graphics Images

1 code implementation 14 Mar 2023 ZiCheng Zhang, Wei Sun, Yingjie Zhou, Jun Jia, Zhichao Zhang, Jing Liu, Xiongkuo Min, Guangtao Zhai

Computer graphics images (CGIs) are artificially generated by means of computer programs and are widely perceived under various scenarios, such as games, streaming media, etc.

Image Quality Assessment NR-IQA

Audio-Visual Quality Assessment for User Generated Content: Database and Method

no code implementations 4 Mar 2023 Yuqin Cao, Xiongkuo Min, Wei Sun, XiaoPing Zhang, Guangtao Zhai

Specifically, we construct the first UGC AVQA database named the SJTU-UAV database, which includes 520 in-the-wild UGC audio and video (A/V) sequences, and conduct a user study to obtain the mean opinion scores of the A/V sequences.

Video Quality Assessment Visual Question Answering (VQA)

EEP-3DQA: Efficient and Effective Projection-based 3D Model Quality Assessment

no code implementations 17 Feb 2023 ZiCheng Zhang, Wei Sun, Yingjie Zhou, Wei Lu, Yucheng Zhu, Xiongkuo Min, Guangtao Zhai

Currently, considerable effort has been put into improving the effectiveness of 3D model quality assessment (3DQA) methods.

Non-Semantics Suppressed Mask Learning for Unsupervised Video Semantic Compression

no code implementations ICCV 2023 Yuan Tian, Guo Lu, Guangtao Zhai, Zhiyong Gao

Most video compression methods aim to improve the visual quality of the decoded video rather than specifically guaranteeing semantic completeness, which degrades downstream video analysis tasks, e.g., action recognition.

Action Recognition Video Compression

DDH-QA: A Dynamic Digital Humans Quality Assessment Database

1 code implementation 24 Dec 2022 ZiCheng Zhang, Yingjie Zhou, Wei Sun, Wei Lu, Xiongkuo Min, Yu Wang, Guangtao Zhai

In recent years, large amounts of effort have been put into pushing forward the real-world application of dynamic digital humans (DDHs).

Video Quality Assessment

Skeleton2Humanoid: Animating Simulated Characters for Physically-plausible Motion In-betweening

1 code implementation 9 Oct 2022 Yunhao Li, Zhenbo Yu, Yucheng Zhu, Bingbing Ni, Guangtao Zhai, Wei Shen

Stage I introduces a test time adaptation strategy, which improves the physical plausibility of synthesized human skeleton motions by optimizing skeleton joint locations.

Motion Synthesis Reinforcement Learning (RL) +1

Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop

1 code implementation 3 Oct 2022 Weixia Zhang, Dingquan Li, Xiongkuo Min, Guangtao Zhai, Guodong Guo, Xiaokang Yang, Kede Ma

No-reference image quality assessment (NR-IQA) aims to quantify how humans perceive visual distortions of digital images without access to their undistorted references.

No-Reference Image Quality Assessment NR-IQA

Perceptual Quality Assessment for Digital Human Heads

1 code implementation 20 Sep 2022 ZiCheng Zhang, Yingjie Zhou, Wei Sun, Xiongkuo Min, Yuzhe Wu, Guangtao Zhai

Digital humans have attracted increasing research interest over the last decade, and large amounts of effort have been put into their generation, representation, rendering, and animation.

MM-PCQA: Multi-Modal Learning for No-reference Point Cloud Quality Assessment

1 code implementation 1 Sep 2022 ZiCheng Zhang, Wei Sun, Xiongkuo Min, Quan Zhou, Jun He, Qiyuan Wang, Guangtao Zhai

Specifically, we split the point clouds into sub-models to represent local geometry distortions such as point shift and down-sampling.

Point Cloud Quality Assessment

Blind Quality Assessment of 3D Dense Point Clouds with Structure Guided Resampling

no code implementations 31 Aug 2022 Wei Zhou, Qi Yang, Qiuping Jiang, Guangtao Zhai, Weisi Lin

Objective quality assessment of 3D point clouds is essential for the development of immersive multimedia systems in real-world applications.

Evaluating Point Cloud from Moving Camera Videos: A No-Reference Metric

1 code implementation 30 Aug 2022 ZiCheng Zhang, Wei Sun, Yucheng Zhu, Xiongkuo Min, Wei Wu, Ying Chen, Guangtao Zhai

To tackle the challenge of point cloud quality assessment (PCQA), many PCQA methods have been proposed to evaluate the visual quality levels of point clouds by assessing the rendered static 2D projections.

Image Quality Assessment Point Cloud Quality Assessment +2

Perceptual Quality Assessment of Omnidirectional Images

no code implementations 6 Jul 2022 Huiyu Duan, Guangtao Zhai, Xiongkuo Min, Yucheng Zhu, Yi Fang, Xiaokang Yang

The original and distorted omnidirectional images, subjective quality ratings, and the head and eye movement data together constitute the OIQA database.

Image Quality Assessment

Enhanced Deep Animation Video Interpolation

2 code implementations 25 Jun 2022 Wang Shen, Cheng Ming, Wenbo Bao, Guangtao Zhai, Li Chen, Zhiyong Gao

With AutoFI and SktFI, the interpolated animation frames show high perceptual quality.

Subjective Quality Assessment for Images Generated by Computer Graphics

no code implementations 10 Jun 2022 Tao Wang, ZiCheng Zhang, Wei Sun, Xiongkuo Min, Wei Lu, Guangtao Zhai

However, limited work has been put forward to tackle the problem of computer graphics generated images' quality assessment (CG-IQA).

No-Reference Image Quality Assessment NR-IQA

A No-reference Quality Assessment Metric for Point Cloud Based on Captured Video Sequences

no code implementations 9 Jun 2022 Yu Fan, ZiCheng Zhang, Wei Sun, Xiongkuo Min, Wei Lu, Tao Wang, Ning Liu, Guangtao Zhai

The point cloud is one of the most widely used digital formats for 3D models, and its visual quality is quite sensitive to distortions such as downsampling, noise, and compression.

Point Cloud Quality Assessment

Deep Neural Network for Blind Visual Quality Assessment of 4K Content

no code implementations 9 Jun 2022 Wei Lu, Wei Sun, Xiongkuo Min, Wenhan Zhu, Quan Zhou, Jun He, Qiyuan Wang, ZiCheng Zhang, Tao Wang, Guangtao Zhai

In this paper, we propose a deep learning-based BIQA model for 4K content, which on one hand can recognize true and pseudo 4K content and on the other hand can evaluate their perceptual visual quality.

Blind Image Quality Assessment Multi-Task Learning

A No-Reference Deep Learning Quality Assessment Method for Super-resolution Images Based on Frequency Maps

no code implementations 9 Jun 2022 ZiCheng Zhang, Wei Sun, Xiongkuo Min, Wenhan Zhu, Tao Wang, Wei Lu, Guangtao Zhai

Therefore, in this paper, we propose a no-reference deep-learning image quality assessment method based on frequency maps because the artifacts caused by SISR algorithms are quite sensitive to frequency information.

Image Quality Assessment Image Super-Resolution

Blind Surveillance Image Quality Assessment via Deep Neural Network Combined with the Visual Saliency

no code implementations 9 Jun 2022 Wei Lu, Wei Sun, Wenhan Zhu, Xiongkuo Min, ZiCheng Zhang, Tao Wang, Guangtao Zhai

In this paper, we first conduct an example experiment (i.e., the face detection task) to demonstrate that the quality of the SIs has a crucial impact on the performance of the IVSS, and then propose a saliency-based deep neural network for the blind quality assessment of the SIs, which helps the IVSS to filter out low-quality SIs and improve detection and recognition performance.

Face Detection Image Quality Assessment

Perceptual Quality Assessment for Fine-Grained Compressed Images

no code implementations 8 Jun 2022 ZiCheng Zhang, Wei Sun, Wei Wu, Ying Chen, Xiongkuo Min, Guangtao Zhai

Nowadays, mainstream full-reference (FR) metrics are effective at predicting the quality of compressed images at coarse-grained levels (where the bit-rate differences between compressed images are obvious). However, they may perform poorly for fine-grained compressed images whose bit-rate differences are quite subtle.

Image Compression Image Quality Assessment

Video-based Human-Object Interaction Detection from Tubelet Tokens

no code implementations 4 Jun 2022 Danyang Tu, Wei Sun, Xiongkuo Min, Guangtao Zhai, Wei Shen

We present a novel vision Transformer, named TUTOR, which is able to learn tubelet tokens that serve as highly abstracted spatiotemporal representations for video-based human-object interaction (V-HOI) detection.

Human-Object Interaction Detection

Deep Decomposition and Bilinear Pooling Network for Blind Night-Time Image Quality Evaluation

no code implementations 12 May 2022 Qiuping Jiang, Jiawu Xu, Yudong Mao, Wei Zhou, Xiongkuo Min, Guangtao Zhai

The DDB-Net contains three modules, i.e., an image decomposition module, a feature encoding module, and a bilinear pooling module.

Blind Image Quality Assessment

A Deep Learning based No-reference Quality Assessment Model for UGC Videos

1 code implementation 29 Apr 2022 Wei Sun, Xiongkuo Min, Wei Lu, Guangtao Zhai

The proposed model utilizes very sparse frames to extract spatial features and dense frames (i.e., the video chunk) with a very low spatial resolution to extract motion features, and therefore has low computational complexity.

Image Quality Assessment Video Quality Assessment

Saliency in Augmented Reality

1 code implementation 18 Apr 2022 Huiyu Duan, Wei Shen, Xiongkuo Min, Danyang Tu, Jing Li, Guangtao Zhai

Therefore, in this paper, we mainly analyze the interaction effect between background (BG) scenes and AR contents, and study the saliency prediction problem in AR.

Saliency Prediction

Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows

no code implementations 20 Mar 2022 Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen

Iwin Transformer is a hierarchical Transformer which progressively performs token representation learning and token agglomeration within irregular windows.

Human-Object Interaction Detection Object +4

Parameterized Image Quality Score Distribution Prediction

no code implementations 2 Mar 2022 Yixuan Gao, Xiongkuo Min, Wenhan Zhu, Xiao-Ping Zhang, Guangtao Zhai

Experimental results verify the feasibility of using the alpha stable model to describe the IQSD, and prove the effectiveness of the objective alpha stable model-based IQSD prediction method.

valid

A Coding Framework and Benchmark towards Compressed Video Understanding

no code implementations 6 Feb 2022 Yuan Tian, Guo Lu, Yichao Yan, Guangtao Zhai, Li Chen, Zhiyong Gao

However, in real-world scenarios, videos are first compressed before transmission and then decompressed for understanding.

Video Understanding

DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering

no code implementations 3 Jan 2022 Shunyu Yao, RuiZhe Zhong, Yichao Yan, Guangtao Zhai, Xiaokang Yang

Specifically, the neural radiance field takes lip movement features and personalized attributes as two disentangled conditions, where lip movements are directly predicted from the audio inputs to achieve lip-synchronized generation.

Neural Rendering Talking Head Generation

Learning Invisible Markers for Hidden Codes in Offline-to-Online Photography

no code implementations CVPR 2022 Jun Jia, Zhongpai Gao, Dandan Zhu, Xiongkuo Min, Guangtao Zhai, Xiaokang Yang

In addition, the automatic localization of hidden codes significantly reduces the time of manually correcting geometric distortions for photos, which is a revolutionary innovation for information hiding in mobile applications.

Frequency-Aware Physics-Inspired Degradation Model for Real-World Image Super-Resolution

no code implementations 5 Nov 2021 Zhenxing Dong, Hong Cao, Wang Shen, Yu Gan, Yuye Ling, Guangtao Zhai, Yikai Su

In particular, we propose to use a convolutional neural network (CNN) to learn the cutoff frequency of the real-world degradation process.

Image Super-Resolution

Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization

no code implementations 10 Sep 2021 Cheng Yang, Gene Cheung, Wai-tian Tan, Guangtao Zhai

Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.

Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception

1 code implementation 19 Aug 2021 Bowen Li, Weixia Zhang, Meng Tian, Guangtao Zhai, Xianpei Wang

The inaccessibility of reference videos with pristine quality and the complexity of authentic distortions pose great challenges for this kind of blind video quality assessment (BVQA) task.

Action Recognition Image Quality Assessment +3

Task-Specific Normalization for Continual Learning of Blind Image Quality Models

1 code implementation 28 Jul 2021 Weixia Zhang, Kede Ma, Guangtao Zhai, Xiaokang Yang

In this paper, we present a simple yet effective continual learning method for blind image quality assessment (BIQA) with improved quality prediction accuracy, plasticity-stability trade-off, and task-order/-length robustness.

Blind Image Quality Assessment Continual Learning

Self-Conditioned Probabilistic Learning of Video Rescaling

1 code implementation ICCV 2021 Yuan Tian, Guo Lu, Xiongkuo Min, Zhaohui Che, Guangtao Zhai, Guodong Guo, Zhiyong Gao

After optimization, the video downscaled by our framework preserves more meaningful information, which is beneficial for both the upscaling step and downstream tasks, e.g., video action recognition.

Video Compression Video Super-Resolution

EAN: Event Adaptive Network for Enhanced Action Recognition

1 code implementation 22 Jul 2021 Yuan Tian, Yichao Yan, Guangtao Zhai, Guodong Guo, Zhiyong Gao

In this paper, we propose a unified action recognition framework to investigate the dynamic nature of video content by introducing the following designs.

Action Recognition

No-Reference Quality Assessment for 3D Colored Point Cloud and Mesh Models

2 code implementations 5 Jul 2021 ZiCheng Zhang, Wei Sun, Xiongkuo Min, Tao Wang, Wei Lu, Guangtao Zhai

Therefore, many related studies such as point cloud quality assessment (PCQA) and mesh quality assessment (MQA) have been carried out to measure the visual quality degradations of 3D models.

Point Cloud Quality Assessment

Dual Attention Guided Gaze Target Detection in the Wild

1 code implementation CVPR 2021 Yi Fang, Jiapeng Tang, Wang Shen, Wei Shen, Xiao Gu, Li Song, Guangtao Zhai

In the third stage, we use the generated dual attention as guidance to perform two sub-tasks: (1) identifying whether the gaze target is inside or out of the image; (2) locating the target if inside.

Projection-free Graph-based Classifier Learning using Gershgorin Disc Perfect Alignment

no code implementations NeurIPS 2021 Cheng Yang, Gene Cheung, Guangtao Zhai

We repose the SDR dual for solution $\bar{\mathbf{H}}$, then replace the PSD cone constraint $\bar{\mathbf{H}} \succeq 0$ with linear constraints derived from GDPA -- sufficient conditions to ensure $\bar{\mathbf{H}}$ is PSD -- so that the optimization becomes an LP per iteration.

Deep Learning based Full-reference and No-reference Quality Assessment Models for Compressed UGC Videos

1 code implementation 2 Jun 2021 Wei Sun, Tao Wang, Xiongkuo Min, Fuwang Yi, Guangtao Zhai

The proposed VQA framework consists of three modules: the feature extraction module, the quality regression module, and the quality pooling module.

regression Video Quality Assessment

Prediction-assistant Frame Super-Resolution for Video Streaming

no code implementations 17 Mar 2021 Wang Shen, Wenbo Bao, Guangtao Zhai, Charlie L Wang, Jerry W Hu, Zhiyong Gao

An effective approach is to transmit frames at lower quality under poor bandwidth conditions, for example by using scalable video coding.

Super-Resolution Video Enhancement +1

Continual Learning for Blind Image Quality Assessment

1 code implementation 19 Feb 2021 Weixia Zhang, Dingquan Li, Chao Ma, Guangtao Zhai, Xiaokang Yang, Kede Ma

In this paper, we formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets, building on what was learned from previously seen data.

Blind Image Quality Assessment Continual Learning

Looking Here or There? Gaze Following in 360-Degree Images

1 code implementation ICCV 2021 Yunhao Li, Wei Shen, Zhongpai Gao, Yucheng Zhu, Guangtao Zhai, Guodong Guo

Specifically, the local region is obtained as a 2D cone-shaped field along the 2D projection of the sight line starting at the human subject's head position, and the distant region is obtained by searching along the sight line in 3D sphere space.

Identification of deep breath while moving forward based on multiple body regions and graph signal analysis

no code implementations 20 Oct 2020 Yunlu Wang, Cheng Yang, Menghan Hu, Jian Zhang, Qingli Li, Guangtao Zhai, Xiao-Ping Zhang

This paper presents an unobtrusive solution that can automatically identify deep breath when a person is walking past the global depth camera.

Face Mask Assistant: Detection of Face Mask Service Stage Based on Mobile Phone

no code implementations 9 Oct 2020 Yuzhen Chen, Menghan Hu, Chunjun Hua, Guangtao Zhai, Jian Zhang, Qingli Li, Simon X. Yang

To address the problem that users cannot tell which service stage a mask is in, we propose a detection system based on the mobile phone.

Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos

no code implementations 22 Jul 2020 Yuan Tian, Guangtao Zhai, Zhiyong Gao

More specifically, an action perceptron synthesizer is proposed to generate the kernels from a bag of fixed-size kernels that interact through dense routing paths.

Action Recognition Temporal Action Localization +1

Uncertainty-Aware Blind Image Quality Assessment in the Laboratory and Wild

1 code implementation 28 May 2020 Weixia Zhang, Kede Ma, Guangtao Zhai, Xiaokang Yang

Nevertheless, due to the distributional shift between images simulated in the laboratory and captured in the wild, models trained on databases with synthetic distortions remain particularly weak at handling realistic distortions (and vice versa).

Blind Image Quality Assessment Learning-To-Rank

Permutation Matters: Anisotropic Convolutional Layer for Learning on Point Clouds

1 code implementation 27 May 2020 Zhongpai Gao, Guangtao Zhai, Junchi Yan, Xiaokang Yang

Various point neural networks have been developed with isotropic filters or using weighting matrices to overcome the structure inconsistency on point clouds.

Representation Learning Semantic Segmentation

Combining Visible Light and Infrared Imaging for Efficient Detection of Respiratory Infections such as COVID-19 on Portable Device

no code implementations 15 Apr 2020 Zheng Jiang, Menghan Hu, Lei Fan, Yaling Pan, Wei Tang, Guangtao Zhai, Yong Lu

In this work, we perform health screening by combining RGB and thermal videos obtained from a dual-mode camera with a deep learning architecture. We first develop a respiratory data capture technique for people wearing masks by using face recognition.

Face Recognition

Blurry Video Frame Interpolation

1 code implementation CVPR 2020 Wang Shen, Wenbo Bao, Guangtao Zhai, Li Chen, Xiongkuo Min, Zhiyong Gao

Existing works reduce motion blur and up-convert frame rate in two separate steps: frame deblurring and frame interpolation.

Deblurring Video Enhancement +1

Abnormal respiratory patterns classifier may contribute to large-scale screening of people infected with COVID-19 in an accurate and unobtrusive manner

no code implementations 12 Feb 2020 Yunlu Wang, Menghan Hu, Qingli Li, Xiao-Ping Zhang, Guangtao Zhai, Nan Yao

During the epidemic prevention and control period, our study can be helpful in prognosis, diagnosis and screening for the patients infected with COVID-19 (the novel coronavirus) based on breathing characteristics.

Toward Better Understanding of Saliency Prediction in Augmented 360 Degree Videos

no code implementations 12 Dec 2019 Yucheng Zhu, Xiongkuo Min, Dandan Zhu, Ke Gu, Jiantao Zhou, Guangtao Zhai, Xiaokang Yang, Wenjun Zhang

The saliency annotations of head and eye movements for both original and augmented videos are collected and together constitute the ARVR dataset.

Object Recognition Optical Flow Estimation +1

Robust Invisible Hyperlinks in Physical Photographs Based on 3D Rendering Attacks

no code implementations 3 Dec 2019 Jun Jia, Zhongpai Gao, Kang Chen, Menghan Hu, Guangtao Zhai, Guodong Guo, Xiaokang Yang

To train a robust decoder against the physical distortion from the real world, a distortion network based on 3D rendering is inserted between the encoder and the decoder to simulate the camera imaging process.

Semi-supervised 3D Face Reconstruction with Nonlinear Disentangled Representations

no code implementations 25 Sep 2019 Zhongpai Gao, Juyong Zhang, Yudong Guo, Chao Ma, Guangtao Zhai, Xiaokang Yang

Moreover, the identity and expression representations are entangled in these models, which hinders many facial editing applications.

3D Face Reconstruction Facial Editing

Learning to Blindly Assess Image Quality in the Laboratory and Wild

1 code implementation 1 Jul 2019 Weixia Zhang, Kede Ma, Guangtao Zhai, Xiaokang Yang

Computational models for blind image quality assessment (BIQA) are typically trained in well-controlled laboratory environments with limited generalizability to realistically distorted images.

Blind Image Quality Assessment Learning-To-Rank

How is Gaze Influenced by Image Transformations? Dataset and Model

1 code implementation 16 May 2019 Zhaohui Che, Ali Borji, Guangtao Zhai, Xiongkuo Min, Guodong Guo, Patrick Le Callet

Data size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time-consuming and expensive.

Data Augmentation Generative Adversarial Network +1

Adversarial Attacks against Deep Saliency Models

no code implementations 2 Apr 2019 Zhaohui Che, Ali Borji, Guangtao Zhai, Suiyi Ling, Guodong Guo, Patrick Le Callet

The proposed attack only requires a part of the model information, and is able to generate a sparser and more insidious adversarial perturbation, compared to traditional image-space attacks.

Adversarial Attack object-detection +1

Invariance Analysis of Saliency Models versus Human Gaze During Scene Free Viewing

1 code implementation 10 Oct 2018 Zhaohui Che, Ali Borji, Guangtao Zhai, Xiongkuo Min

Most current studies on human gaze and saliency modeling have used high-quality stimuli.

Data Augmentation

Terahertz Security Image Quality Assessment by No-reference Model Observers

no code implementations 12 Jul 2017 Menghan Hu, Xiongkuo Min, Guangtao Zhai, Wenhan Zhu, Yucheng Zhu, Zhaodi Wang, Xiaokang Yang, Guang Tian

Subsequently, the existing no-reference IQA algorithms, which were 5 opinion-aware approaches viz., NFERM, GMLF, DIIVINE, BRISQUE and BLIINDS2, and 8 opinion-unaware approaches viz., QAC, SISBLIM, NIQE, FISBLIM, CPBD, S3 and Fish_bb, were executed for the evaluation of the THz security image quality.

Image Quality Assessment
