no code implementations • 13 Dec 2020 • Zhengxiong Luo, Zhicheng Wang, Yuanhao Cai, GuanAn Wang, Yan Huang, Liang Wang, Erjin Zhou, Tieniu Tan, Jian Sun
Instead, we focus on exploiting multi-scale information from layers with different receptive-field sizes and then making full use of this information by improving the fusion method.
no code implementations • 6 Jul 2021 • Shang Li, GuiXuan Zhang, Zhengxiong Luo, Jie Liu, Zhi Zeng, Shuwu Zhang
As a result, most previous methods may suffer a performance drop when the degradations of test images are unknown and varied (i.e., the case of blind SR).
no code implementations • 22 Jul 2021 • Zhengxiong Luo, Zhicheng Wang, Yan Huang, Liang Wang, Tieniu Tan, Erjin Zhou
It can generate and fuse multi-scale features of the same spatial sizes by setting different dilation rates for different channels.
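The idea of producing multi-scale features at the same spatial size via per-channel-group dilation rates can be illustrated with a toy NumPy sketch. All names here (e.g. `multi_dilation_fuse`) are illustrative and not from the paper: channels are split into groups, each group is convolved with a different dilation under same-padding so every output keeps the input's spatial size, and the groups are fused by concatenation.

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation):
    """Same-padding 2-D convolution of one channel with a dilated 3x3 kernel."""
    k = kernel.shape[0]
    pad = dilation * (k - 1) // 2          # padding that preserves spatial size
    xp = np.pad(x, pad)
    h, w = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(h):
        for j in range(w):
            # sample the input with stride `dilation` around (i, j)
            patch = xp[i : i + dilation * (k - 1) + 1 : dilation,
                       j : j + dilation * (k - 1) + 1 : dilation]
            out[i, j] = (patch * kernel).sum()
    return out

def multi_dilation_fuse(x, dilations=(1, 2, 3)):
    """Split channels into groups, convolve each group with a different
    dilation (a different receptive field), then fuse by concatenation."""
    groups = np.split(x, len(dilations), axis=0)   # channel-wise split
    kernel = np.full((3, 3), 1.0 / 9.0)            # toy averaging kernel
    outs = [np.stack([dilated_conv2d(c, kernel, d) for c in g])
            for g, d in zip(groups, dilations)]
    return np.concatenate(outs, axis=0)            # spatial size unchanged
```

Because the padding scales with the dilation rate, every branch yields the same spatial resolution, so fusion needs no resampling.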
no code implementations • 9 Nov 2021 • Shang Li, GuiXuan Zhang, Zhengxiong Luo, Jie Liu, Zhi Zeng, Shuwu Zhang
In this paper, instead of directly applying the LR guidance, we propose an additional invertible flow guidance module (FGM), which can transform the downscaled representation to the visually plausible image during downscaling and transform it back during upscaling.
1 code implementation • CVPR 2021 • Zhengxiong Luo, Zhicheng Wang, Yan Huang, Tieniu Tan, Erjin Zhou
However, for bottom-up methods, which must handle large variations in human scale as well as labeling ambiguities, the current practice seems unreasonable.
1 code implementation • CVPR 2022 • Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, Tieniu Tan
Compared with previous deterministic degradation models, PDM can model more diverse degradations and generate HR-LR pairs that better cover the various degradations of test images, thus preventing the SR model from over-fitting to specific ones.
1 code implementation • NeurIPS 2020 • Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, Tieniu Tan
More importantly, \textit{Restorer} is trained with the kernel estimated by \textit{Estimator} instead of the ground-truth kernel; thus, \textit{Restorer} can be more tolerant to the estimation error of \textit{Estimator}.
Ranked #2 on Blind Super-Resolution on Set5 - 2x upscaling
1 code implementation • 14 May 2021 • Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, Tieniu Tan
More importantly, \textit{Restorer} is trained with the kernel estimated by \textit{Estimator} instead of the ground-truth kernel; thus, \textit{Restorer} can be more tolerant to the estimation error of \textit{Estimator}.
Ranked #2 on Blind Super-Resolution on DIV2KRK - 4x upscaling
2 code implementations • 17 Aug 2023 • Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, Tieniu Tan
To address this issue, instead of considering these two problems independently, we adopt an alternating optimization algorithm, which can estimate the degradation and restore the SR image in a single model.
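The alternating scheme can be sketched on a toy 1-D blind deconvolution problem. Everything here (the function names, the two-tap kernel, the ridge penalty `lam`) is an illustrative assumption, not the paper's model: the point is only the structure of the algorithm, which alternates between estimating the degradation kernel with the restoration fixed and restoring the signal with the kernel fixed, each step a linear least-squares solve.

```python
import numpy as np

def conv_matrix(v, n):
    """Matrix C such that C @ u == np.convolve(v, u) for len(u) == n."""
    m = len(v) + n - 1
    C = np.zeros((m, n))
    for j in range(n):
        C[j : j + len(v), j] = v
    return C

def alternating_blind_deconv(y, k_len=2, x_len=None, iters=200, lam=1e-4):
    """Alternately estimate the degradation kernel k and the restored
    signal x from y = k * x (1-D convolution)."""
    if x_len is None:
        x_len = len(y) - k_len + 1
    x = np.ones(x_len)                              # crude initial restoration
    k = np.zeros(k_len)
    for _ in range(iters):
        # degradation step: fit k with the current restoration fixed
        k, *_ = np.linalg.lstsq(conv_matrix(x, k_len), y, rcond=None)
        # restoration step: fit x with the current kernel fixed (ridge-regularized)
        A = conv_matrix(k, x_len)
        x = np.linalg.solve(A.T @ A + lam * np.eye(x_len), A.T @ y)
    return k, x
```

Because of the inherent scale ambiguity (k, x) vs. (ck, x/c), success is best judged by how well the estimated pair reconstructs the observation y, rather than by matching the true kernel exactly.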
1 code implementation • 20 Dec 2023 • Quan Sun, Yufeng Cui, Xiaosong Zhang, Fan Zhang, Qiying Yu, Zhengxiong Luo, Yueze Wang, Yongming Rao, Jingjing Liu, Tiejun Huang, Xinlong Wang
The human ability to easily solve multimodal tasks in context (i.e., with only a few demonstrations or simple instructions) is what current multimodal systems have largely struggled to imitate.
Ranked #21 on Visual Question Answering on MM-Vet
4 code implementations • ECCV 2020 • Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun
To tackle this problem, we propose an efficient attention mechanism, the Pose Refine Machine (PRM), to make a trade-off between local and global representations in output features and further refine the keypoint locations.
Ranked #1 on Keypoint Detection on COCO test-challenge
1 code implementation • CVPR 2023 • Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan
A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data distributions.
Ranked #7 on Video Generation on UCF-101
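The forward process described above has a standard closed form: with a noise schedule beta_t and alpha_bar_t the cumulative product of (1 - beta_t), any step x_t can be sampled directly from x_0. The sketch below illustrates only this generic DDPM forward step under a linear schedule (an assumption for illustration), not the paper's specific video model.

```python
import numpy as np

def forward_diffusion(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

# A commonly used linear schedule over 1000 steps (illustrative choice).
betas = np.linspace(1e-4, 0.02, 1000)
```

At small t the sample stays close to the data point; by the final step alpha_bar is nearly zero, so x_T is effectively pure Gaussian noise, which is what the learned reverse process starts from.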