1 code implementation • 19 Dec 2024 • Mingdeng Cao, Chong Mou, Ziyang Yuan, Xintao Wang, Zhaoyang Zhang, Ying Shan, Yinqiang Zheng
By fine-tuning existing base diffusion models on human video data, our method demonstrates strong generalization to unseen human identities and poses without requiring additional per-instance fine-tuning.
no code implementations • 16 Dec 2024 • Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng, Zhihao Xia
This paper introduces a novel dataset construction pipeline that samples pairs of frames from videos and uses multimodal large language models (MLLMs) to generate editing instructions for training instruction-based image manipulation models.
no code implementations • 22 May 2024 • Chong Mou, Mingdeng Cao, Xintao Wang, Zhaoyang Zhang, Ying Shan, Jian Zhang
In this paper, we present a novel attempt to Remake a Video (ReVideo) which stands out from existing methods by allowing precise video editing in specific areas through the specification of both content and motion.
1 code implementation • CVPR 2024 • Mingdeng Cao, Sidi Yang, Yujiu Yang, Yinqiang Zheng
Additionally, a multi-distortion flow prediction strategy is integrated to mitigate the issue of inaccurate flow estimation further.
1 code implementation • 28 Mar 2024 • Sidi Yang, Binxiao Huang, Mingdeng Cao, Yatai Ji, Hanzhong Guo, Ngai Wong, Yujiu Yang
Existing enhancement models often optimize for high performance while falling short of reducing hardware inference time and power consumption, especially on edge devices with constrained computing and storage resources.
1 code implementation • CVPR 2024 • Zhen Li, Mingdeng Cao, Xintao Wang, Zhongang Qi, Ming-Ming Cheng, Ying Shan
Recent advances in text-to-image generation have made remarkable progress in synthesizing realistic human photos conditioned on given text prompts.
Ranked #6 on
Diffusion Personalization Tuning Free
on AgeDB
Diffusion Personalization Tuning Free
Text-to-Image Generation
no code implementations • 30 Oct 2023 • Ziyang Yuan, Mingdeng Cao, Xintao Wang, Zhongang Qi, Chun Yuan, Ying Shan
As a result, our CustomNet ensures enhanced identity preservation and generates diverse, harmonious outputs.
1 code implementation • NeurIPS 2023 • Tianhe Wu, Shuwei Shi, Haoming Cai, Mingdeng Cao, Jing Xiao, Yinqiang Zheng, Yujiu Yang
Blind Omnidirectional Image Quality Assessment (BOIQA) aims to objectively assess the human perceptual quality of omnidirectional images (ODIs) without relying on pristine-quality image information.
4 code implementations • ICCV 2023 • Mingdeng Cao, Xintao Wang, Zhongang Qi, Ying Shan, XiaoHu Qie, Yinqiang Zheng
Despite the success in large-scale text-to-image generation and text-conditioned image editing, existing methods still struggle to produce consistent generation and editing results.
Ranked #13 on
Text-based Image Editing
on PIE-Bench
1 code implementation • CVPR 2023 • Fanghua Yu, Xintao Wang, Mingdeng Cao, Gen Li, Ying Shan, Chao Dong
Omnidirectional images (ODIs) have obtained lots of research interest for immersive experiences.
no code implementations • CVPR 2023 • Zhuoxiao Li, Haiyang Jiang, Mingdeng Cao, Yinqiang Zheng
Single-chip polarized color photography provides both visual textures and object surface information in one snapshot.
1 code implementation • CVPR 2023 • Zhihang Zhong, Mingdeng Cao, Xiang Ji, Yinqiang Zheng, Imari Sato
This paper studies the challenging problem of recovering motion from blur, also known as joint deblurring and interpolation or blur temporal super-resolution.
1 code implementation • 28 Aug 2022 • Mingdeng Cao, Zhihang Zhong, Yanbo Fan, Jiahao Wang, Yong Zhang, Jue Wang, Yujiu Yang, Yinqiang Zheng
We believe the novel realistic synthesis pipeline and the corresponding RAW video dataset can help the community to easily construct customized blur datasets to improve real-world video deblurring performance largely, instead of laboriously collecting real data pairs.
1 code implementation • CVPR 2022 • Mingdeng Cao, Zhihang Zhong, Jiahao Wang, Yinqiang Zheng, Yujiu Yang
This paper proposes the first real-world rolling shutter (RS) correction dataset, BS-RSC, and a corresponding model to correct the RS frames in a distorted video.
2 code implementations • 19 Apr 2022 • Sidi Yang, Tianhe Wu, Shuwei Shi, Shanshan Lao, Yuan Gong, Mingdeng Cao, Jiahao Wang, Yujiu Yang
No-Reference Image Quality Assessment (NR-IQA) aims to assess the perceptual quality of images in accordance with human subjective perception.
Ranked #8 on
Video Quality Assessment
on MSU SR-QA Dataset
1 code implementation • 17 Apr 2022 • Mingdeng Cao, Yanbo Fan, Yong Zhang, Jue Wang, Yujiu Yang
For multi-frame temporal modeling, we adapt Transformer to fuse multiple spatial features efficiently.
1 code implementation • 12 Mar 2022 • Zhihang Zhong, Mingdeng Cao, Xiao Sun, Zhirong Wu, Zhongyi Zhou, Yinqiang Zheng, Stephen Lin, Imari Sato
In this paper, instead of two consecutive frames, we propose to exploit a pair of images captured by dual RS cameras with reversed RS directions for this highly challenging task.
1 code implementation • 8 Mar 2022 • Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang
Our framework elevates the resolution of the synthesized talking face to 1024*1024 for the first time, even though the training dataset has a lower resolution.
no code implementations • CVPR 2022 • Jiahao Wang, Baoyuan Wu, Rui Su, Mingdeng Cao, Shuwei Shi, Wanli Ouyang, Yujiu Yang
We conduct experiments both from a control theory lens through a phase locus verification and from a network training lens on several models, including CNNs, Transformers, MLPs, and on benchmark datasets.
no code implementations • 7 May 2021 • Jinjin Gu, Haoming Cai, Chao Dong, Jimmy S. Ren, Yu Qiao, Shuhang Gu, Radu Timofte, Manri Cheon, SungJun Yoon, Byungyeon Kang, Junwoo Lee, Qing Zhang, Haiyang Guo, Yi Bin, Yuqing Hou, Hengliang Luo, Jingyu Guo, ZiRui Wang, Hai Wang, Wenming Yang, Qingyan Bai, Shuwei Shi, Weihao Xia, Mingdeng Cao, Jiahao Wang, Yifan Chen, Yujiu Yang, Yang Li, Tao Zhang, Longtao Feng, Yiting Liao, Junlin Li, William Thong, Jose Costa Pereira, Ales Leonardis, Steven McDonagh, Kele Xu, Lehan Yang, Hengxing Cai, Pengfei Sun, Seyed Mehdi Ayyoubzadeh, Ali Royat, Sid Ahmed Fezza, Dounia Hammou, Wassim Hamidouche, Sewoong Ahn, Gwangjin Yoon, Koki Tsubota, Hiroaki Akutsu, Kiyoharu Aizawa
This paper reports on the NTIRE 2021 challenge on perceptual image quality assessment (IQA), held in conjunction with the New Trends in Image Restoration and Enhancement workshop (NTIRE) workshop at CVPR 2021.
3 code implementations • 23 Apr 2021 • Shuwei Shi, Qingyan Bai, Mingdeng Cao, Weihao Xia, Jiahao Wang, Yifan Chen, Yujiu Yang
Image quality assessment (IQA) aims to assess the perceptual quality of images.