To aggregate a new hidden state that contains fewer artifacts from the hidden state pool, we devise a Selective Cross Attention (SCA) module, in which the attention between input features and each hidden state is calculated.
The core of our CAT is the Rectangle-Window Self-Attention (Rwin-SA), which utilizes horizontal and vertical rectangle window attention in different heads parallelly to expand the attention area and aggregate the features cross different windows.
Event cameras are novel bio-inspired vision sensors that output pixel-level intensity changes in microsecond accuracy with a high dynamic range and low power consumption.
In this work, we design an efficient SR network by improving the attention mechanism.
Here we provide the first method that combines the advanced artificial intelligence (AI) techniques and the carbon satellite monitor to quantify anthropogenic CO$_2$ emissions.
We finally use a guided fusion operation to integrate the sharp edges generated by the network and flat areas by the interpolation method to get the final SR image.
This is considered as a dense attention strategy since the interactions of tokens are restrained in dense regions.
In this paper, we rethink the role of alignment in VSR Transformers and make several counter-intuitive observations.
Ranked #2 on Video Super-Resolution on Vid4 - 4x upscaling
This challenge is divided into two tracks, a full-reference IQA track similar to the previous NTIRE IQA challenge and a new track that focuses on the no-reference IQA methods.
One is the usage of blueprint separable convolution (BSConv), which takes place of the redundant convolution operation.
2 code implementations • 20 Apr 2022 • Ren Yang, Radu Timofte, Meisong Zheng, Qunliang Xing, Minglang Qiao, Mai Xu, Lai Jiang, Huaida Liu, Ying Chen, Youcheng Ben, Xiao Zhou, Chen Fu, Pei Cheng, Gang Yu, Junyi Li, Renlong Wu, Zhilu Zhang, Wei Shang, Zhengyao Lv, Yunjin Chen, Mingcai Zhou, Dongwei Ren, Kai Zhang, WangMeng Zuo, Pavel Ostyakov, Vyal Dmitry, Shakarim Soltanayev, Chervontsev Sergey, Zhussip Magauiya, Xueyi Zou, Youliang Yan, Pablo Navarrete Michelini, Yunhua Lu, Diankai Zhang, Shaoli Liu, Si Gao, Biao Wu, Chengjian Zheng, Xiaofeng Zhang, Kaidi Lu, Ning Wang, Thuong Nguyen Canh, Thong Bach, Qing Wang, Xiaopeng Sun, Haoyu Ma, Shijie Zhao, Junlin Li, Liangbin Xie, Shuwei Shi, Yujiu Yang, Xintao Wang, Jinjin Gu, Chao Dong, Xiaodi Shi, Chunmei Nian, Dong Jiang, Jucai Lin, Zhihuai Xie, Mao Ye, Dengyan Luo, Liuhan Peng, Shengjie Chen, Qian Wang, Xin Liu, Boyang Liang, Hang Dong, Yuhao Huang, Kai Chen, Xingbei Guo, Yujing Sun, Huilei Wu, Pengxu Wei, Yulin Huang, Junying Chen, Ik Hyun Lee, Sunder Ali Khowaja, Jiseok Yoon
This challenge includes three tracks.
Our key contribution is to leverage a texture classifier, which enables us to assign patches with semantic labels, to identify the source of SR errors both globally and locally.
Dropout is designed to relieve the overfitting problem in high-level vision tasks but is rarely applied in low-level vision tasks, like image super-resolution (SR).
We show that a well-trained deep SR network is naturally a good descriptor of degradation information.
This paper serves as a systematic review on recent progress in blind image SR, and proposes a taxonomy to categorize existing methods into three different classes according to their ways of degradation modelling and the data used for solving the SR model.
no code implementations • 7 May 2021 • Jinjin Gu, Haoming Cai, Chao Dong, Jimmy S. Ren, Yu Qiao, Shuhang Gu, Radu Timofte, Manri Cheon, SungJun Yoon, Byungyeon Kang, Junwoo Lee, Qing Zhang, Haiyang Guo, Yi Bin, Yuqing Hou, Hengliang Luo, Jingyu Guo, ZiRui Wang, Hai Wang, Wenming Yang, Qingyan Bai, Shuwei Shi, Weihao Xia, Mingdeng Cao, Jiahao Wang, Yifan Chen, Yujiu Yang, Yang Li, Tao Zhang, Longtao Feng, Yiting Liao, Junlin Li, William Thong, Jose Costa Pereira, Ales Leonardis, Steven McDonagh, Kele Xu, Lehan Yang, Hengxing Cai, Pengfei Sun, Seyed Mehdi Ayyoubzadeh, Ali Royat, Sid Ahmed Fezza, Dounia Hammou, Wassim Hamidouche, Sewoong Ahn, Gwangjin Yoon, Koki Tsubota, Hiroaki Akutsu, Kiyoharu Aizawa
This paper reports on the NTIRE 2021 challenge on perceptual image quality assessment (IQA), held in conjunction with the New Trends in Image Restoration and Enhancement workshop (NTIRE) workshop at CVPR 2021.
In this work, we attempt to quantify and visualize attention mechanisms in SISR and show that not all attention modules are equally beneficial.
To answer the questions and promote the development of IQA methods, we contribute a large-scale IQA dataset, called Perceptual Image Processing ALgorithms (PIPAL) dataset.
no code implementations • 14 Sep 2020 • Dario Fuoli, Zhiwu Huang, Shuhang Gu, Radu Timofte, Arnau Raventos, Aryan Esfandiari, Salah Karout, Xuan Xu, Xin Li, Xin Xiong, Jinge Wang, Pablo Navarrete Michelini, Wen-Hao Zhang, Dongyang Zhang, Hanwei Zhu, Dan Xia, Haoyu Chen, Jinjin Gu, Zhi Zhang, Tongtong Zhao, Shanshan Zhao, Kazutoshi Akita, Norimichi Ukita, Hrishikesh P. S, Densen Puthussery, Jiji C. V
Missing information can be restored well in this region, especially in HR videos, where the high-frequency content mostly consists of texture details.
To answer these questions and promote the development of IQA methods, we contribute a large-scale IQA dataset, called Perceptual Image Processing Algorithms (PIPAL) dataset.
Such an over-parameterization of the latent space significantly improves the image reconstruction quality, outperforming existing competitors.
Ranked #6 on Blind Face Restoration on CelebA-Test
In this work, we propose a novel framework, called InterFaceGAN, for semantic face editing by interpreting the latent semantics learned by GANs.
Large deep networks have demonstrated competitive performance in single image super-resolution (SISR), with a huge volume of data involved.
Such a mixture problem is usually solved by a sequential solution (applying each method independently in a fixed order: DM $\to$ DN $\to$ SR), or is simply tackled by an end-to-end network without enough analysis into interactions among tasks, resulting in an undesired performance drop in the final image quality.
In this paper, we propose an Iterative Kernel Correction (IKC) method for blur kernel estimation in blind SR problem, where the blur kernels are unknown.
Ranked #2 on Blind Super-Resolution on Set5 - 3x upscaling
Generating plausible hair image given limited guidance, such as sparse sketches or low-resolution image, has been made possible with the rise of Generative Adversarial Networks (GANs).
In this paper, we present the problem formulation and methodology framework of Super-Resolution Perception (SRP) on industrial sensor data.
To further enhance the visual quality, we thoroughly study three key components of SRGAN - network architecture, adversarial loss and perceptual loss, and improve each of them to derive an Enhanced SRGAN (ESRGAN).
Ranked #2 on Face Hallucination on FFHQ 512 x 512 - 16x upscaling
Image of a scene captured through a piece of transparent and reflective material, such as glass, is often spoiled by a superimposed layer of reflection image.