no code implementations • 24 Feb 2025 • Kang Fu, Huiyu Duan, ZiCheng Zhang, Xiaohong Liu, Xiongkuo Min, Jia Wang, Guangtao Zhai
This database encompasses 969 validated 3D assets generated from 170 prompts via 6 popular text-to-3D asset generation models, along with corresponding subjective quality ratings for these assets from three perspectives: quality, authenticity, and text-asset correspondence.
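As a minimal illustration of how subjective data of this kind is typically aggregated, the sketch below computes per-asset mean opinion scores (MOS) along the three rated dimensions; the file name and column names are assumptions for illustration, not the database's actual format.

```python
# Hypothetical sketch: aggregate per-asset subjective ratings into mean
# opinion scores (MOS) for the three rated dimensions.
import pandas as pd

# Assumed columns: asset_id, quality, authenticity, correspondence
ratings = pd.read_csv("ratings.csv")
mos = ratings.groupby("asset_id")[["quality", "authenticity", "correspondence"]].mean()
print(mos.head())
```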
no code implementations • 5 Jan 2025 • Hui Li, Xiaoyu Ren, Hongjiu Yu, Huiyu Duan, Kai Li, Ying Chen, Libo Wang, Xiongkuo Min, Guangtao Zhai, Xu Liu
Furthermore, a multi-modal FAP method is proposed to measure facial attractiveness in live streaming.
no code implementations • 2 Jan 2025 • Zitong Xu, Huiyu Duan, Guangji Ma, Liu Yang, Jiarui Wang, Qingbo Wu, Xiongkuo Min, Guangtao Zhai, Patrick Le Callet
To address this issue and facilitate the advancement of IHAs, we introduce the first Image Quality Assessment Database for image Harmony evaluation (HarmonyIQAD), which consists of 1,350 harmonized images generated by 9 different IHAs, together with the corresponding human visual preference scores.
no code implementations • 29 Dec 2024 • Xilei Zhu, Huiyu Duan, Liu Yang, Yucheng Zhu, Xiongkuo Min, Guangtao Zhai, Patrick Le Callet
With the rapid development of eXtended Reality (XR), egocentric spatial shooting and display technologies have further enhanced immersion and engagement for users.
no code implementations • 26 Dec 2024 • Huiyu Duan, Qiang Hu, Jiarui Wang, Liu Yang, Zitong Xu, Lu Liu, Xiongkuo Min, Chunlei Cai, Tianxiao Ye, Xiaoyun Zhang, Guangtao Zhai
The rapid growth of user-generated content (UGC) videos has produced an urgent need for effective video quality assessment (VQA) algorithms to monitor video quality and guide optimization and recommendation procedures.
no code implementations • 17 Dec 2024 • Lu Liu, Huiyu Duan, Qiang Hu, Liu Yang, Chunlei Cai, Tianxiao Ye, Huayu Liu, Xiaoyun Zhang, Guangtao Zhai
The FaceQ database comprises 12,255 images generated by 29 models across three tasks: (1) face generation, (2) face customization, and (3) face restoration.
1 code implementation • 26 Nov 2024 • Jiarui Wang, Huiyu Duan, Guangtao Zhai, Juntong Wang, Xiongkuo Min
The rapid advancement of large multimodal models (LMMs) has driven a surge of artificial intelligence generated videos (AIGVs), highlighting the pressing need for effective video quality assessment (VQA) models designed specifically for AIGVs.
no code implementations • 10 Oct 2024 • Sijing Wu, Yunhao Li, Yichao Yan, Huiyu Duan, Ziwei Liu, Guangtao Zhai
To fill this gap, we first construct a large-scale multi-modal 3D facial animation dataset, MMHead, which consists of 49 hours of 3D facial motion sequences, speech audio, and rich hierarchical text annotations.
1 code implementation • 23 Sep 2024 • Andrey Moskalenko, Alexey Bryncev, Dmitry Vatolin, Radu Timofte, Gen Zhan, Li Yang, Yunlong Tang, Yiting Liao, Jiongzhi Lin, Baitao Huang, Morteza Moradi, Mohammad Moradi, Francesco Rundo, Concetto Spampinato, Ali Borji, Simone Palazzo, Yuxin Zhu, Yinan Sun, Huiyu Duan, Yuqin Cao, Ziheng Jia, Qiang Hu, Xiongkuo Min, Guangtao Zhai, Hao Fang, Runmin Cong, Xiankai Lu, Xiaofei Zhou, Wei zhang, Chunyu Zhao, Wentao Mu, Tao Deng, Hamed R. Tavakoli
The goal of the participants was to develop a method for predicting accurate saliency maps for the provided set of video sequences.
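Predicted saliency maps in such challenges are typically scored with standard metrics. A minimal sketch of one of them, the linear correlation coefficient (CC) between a predicted and a ground-truth map, is shown below (the metric choice here is illustrative, not the challenge's full protocol).

```python
# Linear correlation coefficient (CC) between two saliency maps,
# computed after z-score normalization of each map.
import numpy as np

def cc(pred: np.ndarray, gt: np.ndarray) -> float:
    pred = (pred - pred.mean()) / (pred.std() + 1e-8)
    gt = (gt - gt.mean()) / (gt.std() + 1e-8)
    return float(np.mean(pred * gt))
```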
no code implementations • 10 Aug 2024 • Yuxin Zhu, Huiyu Duan, Kaiwei Zhang, Yucheng Zhu, Xilei Zhu, Long Teng, Xiongkuo Min, Guangtao Zhai
To advance the research on audio-visual saliency prediction for ODVs, we further establish a new benchmark based on the AVS-ODV database by testing numerous state-of-the-art saliency models, including visual-only models and audio-visual models.
no code implementations • 31 Jul 2024 • Xilei Zhu, Liu Yang, Huiyu Duan, Xiongkuo Min, Guangtao Zhai, Patrick Le Callet
In this paper, we establish the Egocentric Spatial Images Quality Assessment Database (ESIQAD), which is, to the best of our knowledge, the first IQA database dedicated to egocentric spatial images.
1 code implementation • 30 Jul 2024 • Huiyu Duan, Xiongkuo Min, Sijing Wu, Wei Shen, Guangtao Zhai
In this paper, we propose a text-induced unified image processor for low-level vision tasks, termed UniProcessor, which can effectively process various degradation types and levels, and support multimodal control.
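The excerpt does not detail the UniProcessor architecture; as a purely hypothetical sketch of the general idea of text-induced control, the block below modulates image features with scale and shift vectors predicted from a text embedding.

```python
# Hypothetical sketch (not the UniProcessor architecture): conditioning a
# restoration backbone on a text instruction via feature-wise modulation.
import torch
import torch.nn as nn

class TextConditionedBlock(nn.Module):
    def __init__(self, channels: int, text_dim: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.to_scale = nn.Linear(text_dim, channels)
        self.to_shift = nn.Linear(text_dim, channels)

    def forward(self, feat, text_emb):
        # Scale/shift the image features using vectors predicted from the text embedding.
        scale = self.to_scale(text_emb)[:, :, None, None]
        shift = self.to_shift(text_emb)[:, :, None, None]
        return self.conv(feat) * (1 + scale) + shift
```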
no code implementations • 22 Jun 2024 • Shiqi Gao, Huiyu Duan, Xinyue Li, Kang Fu, Yicong Peng, Qihang Xu, Yuanyuan Chang, Jia Wang, Xiongkuo Min, Guangtao Zhai
In this paper, we propose a quality-guided image enhancement paradigm that enables image enhancement models to learn the distribution of images with various quality ratings.
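One simple way to realize quality-guided conditioning, shown as a hypothetical sketch below (not the paper's actual design), is to broadcast a target quality rating as an extra input plane so the enhancement network can associate appearance with each rating level.

```python
# Hypothetical sketch of quality-guided conditioning for image enhancement.
import torch
import torch.nn as nn

class QualityConditionedEnhancer(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, image, quality):  # image: (B,3,H,W), quality: (B,)
        # Broadcast the scalar quality rating to a full-resolution plane.
        q_plane = quality[:, None, None, None].expand(-1, 1, *image.shape[-2:])
        return self.net(torch.cat([image, q_plane], dim=1))
```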
no code implementations • 15 May 2024 • Jiajie Teng, Huiyu Duan, Yucheng Zhu, Sijing Wu, Guangtao Zhai
However, at present, the background music of short videos is generally chosen by the video producer, and there is a lack of automatic music recommendation methods for short videos.
1 code implementation • 12 May 2024 • Jiarui Wang, Huiyu Duan, Guangtao Zhai, Xiongkuo Min
Artificial Intelligence Generated Content (AIGC) has grown rapidly in recent years, among which AI-based image generation has gained widespread attention due to its efficient and imaginative image creation ability.
no code implementations • 11 Apr 2024 • Yinan Sun, Xiongkuo Min, Huiyu Duan, Guangtao Zhai
Finally, since text descriptions influence visual attention, an effect that most existing saliency models ignore, we further propose a text-guided saliency (TGSal) prediction model, which extracts and integrates both image features and text features to predict image saliency under various text-description conditions.
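As a hypothetical sketch of such image-text fusion (not the actual TGSal architecture), the block below injects a pooled text embedding into an image feature map and decodes a saliency map.

```python
# Hypothetical sketch: fuse image features with a text embedding and
# decode a single-channel saliency map.
import torch
import torch.nn as nn

class TextGuidedSaliency(nn.Module):
    def __init__(self, img_ch=256, text_dim=512):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, img_ch)
        self.decoder = nn.Sequential(
            nn.Conv2d(img_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 1), nn.Sigmoid(),
        )

    def forward(self, img_feat, text_emb):  # img_feat: (B,C,H,W), text_emb: (B,D)
        t = self.text_proj(text_emb)[:, :, None, None]
        return self.decoder(img_feat + t)  # (B,1,H,W) saliency map
```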
1 code implementation • 1 Apr 2024 • Liu Yang, Huiyu Duan, Long Teng, Yucheng Zhu, Xiaohong Liu, Menghan Hu, Xiongkuo Min, Guangtao Zhai, Patrick Le Callet
Finally, we conduct a benchmark experiment to evaluate the performance of state-of-the-art IQA models on our database.
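IQA benchmarks of this kind are conventionally reported with rank and linear correlations between model predictions and subjective scores; a minimal sketch of that standard protocol follows.

```python
# Standard IQA benchmarking protocol: correlate objective predictions
# with subjective scores using SRCC and PLCC.
from scipy.stats import spearmanr, pearsonr

def benchmark(pred_scores, subjective_scores):
    srcc, _ = spearmanr(pred_scores, subjective_scores)
    plcc, _ = pearsonr(pred_scores, subjective_scores)
    return srcc, plcc
```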
no code implementations • 5 Feb 2024 • Xiongkuo Min, Huiyu Duan, Wei Sun, Yucheng Zhu, Guangtao Zhai
Perceptual video quality assessment plays a vital role in the field of video processing due to the existence of quality degradations introduced in various stages of video signal acquisition, compression, transmission and display.
no code implementations • 9 Nov 2023 • Yuxin Zhu, Xilei Zhu, Huiyu Duan, Jie Li, Kaiwei Zhang, Yucheng Zhu, Li Chen, Xiongkuo Min, Guangtao Zhai
Visual saliency prediction for omnidirectional videos (ODVs) is of great significance for applications such as ODV coding, transmission, and rendering.
1 code implementation • 20 Jul 2023 • Xilei Zhu, Huiyu Duan, Yuqin Cao, Yuxin Zhu, Yucheng Zhu, Jing Liu, Li Chen, Xiongkuo Min, Guangtao Zhai
Omnidirectional videos (ODVs) play an increasingly important role in application fields such as medicine, education, advertising, and tourism.
1 code implementation • 1 Jul 2023 • Jiarui Wang, Huiyu Duan, Jing Liu, Shi Chen, Xiongkuo Min, Guangtao Zhai
In this paper, to better understand human visual preferences for AIGIs, we establish a large-scale IQA database for AIGC, named AIGCIQA2023.
1 code implementation • 30 Mar 2023 • Huiyu Duan, Wei Shen, Xiongkuo Min, Danyang Tu, Long Teng, Jia Wang, Guangtao Zhai
Recently, masked autoencoders (MAE) for feature pre-training have further unleashed the potential of Transformers, leading to state-of-the-art performances on various high-level vision tasks.
Ranked #4 on Image Defocus Deblurring on DPD (Dual-view)
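The MAE pre-training referenced above rests on a simple mechanism: a large fraction of image patches is randomly masked and only the visible patches are encoded, with the rest reconstructed by a decoder. The sketch below illustrates that generic masking step (not this paper's specific model).

```python
# Generic sketch of MAE-style random patch masking.
import torch

def random_masking(patches: torch.Tensor, mask_ratio: float = 0.75):
    # patches: (B, N, D) sequence of patch embeddings
    B, N, D = patches.shape
    n_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N)
    keep_idx = noise.argsort(dim=1)[:, :n_keep]  # indices of visible patches
    visible = torch.gather(patches, 1, keep_idx[:, :, None].expand(-1, -1, D))
    return visible, keep_idx
```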
no code implementations • 6 Jul 2022 • Huiyu Duan, Guangtao Zhai, Xiongkuo Min, Yucheng Zhu, Yi Fang, Xiaokang Yang
The original and distorted omnidirectional images, subjective quality ratings, and the head and eye movement data together constitute the OIQA database.
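A common step when working with such head and eye movement data is mapping a viewing direction to pixel coordinates on the equirectangular image; a minimal sketch of that standard projection is given below (angle conventions assumed).

```python
# Map a viewing direction (yaw, pitch in radians) to pixel coordinates
# on an equirectangular omnidirectional image of size width x height.
import numpy as np

def direction_to_pixel(yaw, pitch, width, height):
    # yaw in [-pi, pi], pitch in [-pi/2, pi/2]
    u = (yaw / (2 * np.pi) + 0.5) * width
    v = (0.5 - pitch / np.pi) * height
    return int(u) % width, int(np.clip(v, 0, height - 1))
```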
1 code implementation • 18 Apr 2022 • Huiyu Duan, Wei Shen, Xiongkuo Min, Danyang Tu, Jing Li, Guangtao Zhai
Therefore, in this paper, we mainly analyze the interaction effect between background (BG) scenes and AR content, and study the saliency prediction problem in AR.
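A simple superimposition model is often used in such AR/BG studies: the perceived view is a weighted blend of the AR content and the background scene. The sketch below illustrates that blending step under an assumed mixing coefficient; it is not the paper's experimental setup.

```python
# Blend AR content over a background scene with a mixing coefficient.
import numpy as np

def superimpose(ar_img: np.ndarray, bg_img: np.ndarray, alpha: float = 0.5):
    # ar_img, bg_img: float arrays in [0, 1] with the same shape
    return alpha * ar_img + (1 - alpha) * bg_img
```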
1 code implementation • 11 Apr 2022 • Huiyu Duan, Xiongkuo Min, Yucheng Zhu, Guangtao Zhai, Xiaokang Yang, Patrick Le Callet
An objective metric termed CFIQA is also proposed to better evaluate the quality of confusing images.
no code implementations • 20 Mar 2022 • Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen
Iwin Transformer is a hierarchical Transformer which progressively performs token representation learning and token agglomeration within irregular windows.
no code implementations • CVPR 2022 • Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen
In contrast, we redefine the HGT detection task as detecting human head locations and their gaze targets, simultaneously.