Drawing lessons from the lattice filter bank, we design the lattice block (LB), in which two butterfly structures are applied to combine two RBs.
To tackle the above issues, we propose a novel Local-Global Anti-forgetting (LGA) model to address local and global catastrophic forgetting on old categories, which is a pioneering work to explore a global class-incremental model in the FL field.
The effectiveness of the method is also demonstrated on the real-world SR setting.
It consists of a knowledge distillation based implicit degradation estimator network (KD-IDE) and an efficient SR network.
In the second stage, the LT-based global fusion and INN-based local fusion layers output the fused image.
Hyperspectral image (HSI) denoising is a crucial preprocessing procedure for the subsequent HSI applications.
The core of our CAT is the Rectangle-Window Self-Attention (Rwin-SA), which utilizes horizontal and vertical rectangle window attention in different heads in parallel to expand the attention area and aggregate the features across different windows.
We finally use a guided fusion operation to integrate the sharp edges generated by the network and flat areas by the interpolation method to get the final SR image.
This is considered a dense attention strategy since the interactions of tokens are restrained in dense regions.
In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for IR tasks.
By observing this physical encoding procedure, two major challenges stand in the way of a high-fidelity reconstruction.
Although some studies attempt to train deep models on noisy and noise-free video pairs captured by cameras, such models can only work well for specific cameras and do not generalize well for other videos.
Ranked #1 on Video Denoising on VideoLQ
Most CNN-based super-resolution (SR) methods assume that the degradation is known (e.g., bicubic).
Reference-based image super-resolution (RefSR) aims to exploit auxiliary reference (Ref) images to super-resolve low-resolution (LR) images.
These issues can be alleviated by a cascade of three separate sub-tasks, including video deblurring, frame interpolation, and super-resolution, which, however, would fail to capture the spatial and temporal correlations among video sequences.
In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks.
In coded aperture snapshot spectral compressive imaging (CASSI) systems, hyperspectral image (HSI) reconstruction methods are employed to recover the spatial-spectral signal from a compressed measurement.
On the other hand, we equip the sequence-to-sequence model with an unsupervised optical flow estimator to maximize its potential.
Ranked #2 on Video Enhancement on MFQE v2
Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI).
Additionally, for better noise fitting, we present an efficient architecture Simple Multi-scale Network (SMNet) as the generator.
Ranked #2 on Image Denoising on SIDD (using extra training data)
Improving the resolution of magnetic resonance (MR) image data is critical to computer-aided diagnosis and brain function analysis.
While recent years have witnessed a dramatic upsurge of exploiting deep neural networks toward solving image denoising, existing methods mostly rely on simple noise assumptions, such as additive white Gaussian noise (AWGN), JPEG compression noise and camera sensor noise, and a general-purpose blind denoising method for real images remains unsolved.
In this paper, we propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task.
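The excerpt above does not describe the HPUN downsampling module itself, so the following is only a minimal sketch of the generic pixel-unshuffle rearrangement that the name refers to, assuming a `(C, H, W)` NumPy array and a hypothetical factor `r`. The function names are illustrative and are not taken from the paper.

```python
import numpy as np


def pixel_unshuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C, H, W) array into (C*r*r, H//r, W//r) without losing values."""
    c, h, w = x.shape
    if h % r or w % r:
        raise ValueError("H and W must be divisible by r")
    # Split each spatial axis into (block index, offset inside the r x r block).
    x = x.reshape(c, h // r, r, w // r, r)
    # Move the two in-block offsets next to the channel axis.
    x = x.transpose(0, 2, 4, 1, 3)
    return x.reshape(c * r * r, h // r, w // r)


def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Inverse rearrangement: (C*r*r, H, W) -> (C, H*r, W*r)."""
    c, h, w = x.shape
    c_out = c // (r * r)
    x = x.reshape(c_out, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)
    return x.reshape(c_out, h * r, w * r)


if __name__ == "__main__":
    image = np.arange(2 * 4 * 6).reshape(2, 4, 6)
    down = pixel_unshuffle(image, 2)
    print(down.shape)  # spatial size halves, channel count grows by r*r
    print(np.array_equal(pixel_shuffle(down, 2), image))
```

Because the operation only moves values around, it reduces spatial resolution without discarding information, which is one common motivation for using it as a downsampling step.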
Many algorithms have been developed to solve the inverse problem of coded aperture snapshot spectral imaging (CASSI), i.e., recovering the 3D hyperspectral images (HSIs) from a 2D compressive measurement.
On the one hand, the proposed HR spatial-spectral attention module with its efficient feature fusion provides continuous and fine pixel-level features.
Image super-resolution (SR) is a fast-moving field with novel architectures attracting the spotlight.
Exploiting similar and sharper scene patches in spatio-temporal neighborhoods is critical for video deblurring.
Ranked #1 on Deblurring on DVD
Our key contribution is to leverage a texture classifier, which enables us to assign patches with semantic labels, to identify the source of SR errors both globally and locally.
Recently, hyperspectral imaging (HSI) has attracted increasing research attention, especially approaches based on a coded aperture snapshot spectral imaging (CASSI) system.
In a Hearthstone deckbuilding case study, we show that our approach improves the sample efficiency of MAP-Elites and outperforms a model trained offline with random decks, as well as a linear surrogate model baseline, setting a new state-of-the-art for quality diversity approaches in automated Hearthstone deckbuilding.
To address the above issues, we propose aligned structured sparsity learning (ASSL), which introduces a weight normalization layer and applies $L_2$ regularization to the scale parameters for sparsity.
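As a rough illustration of the two ingredients named above, the sketch below reparameterizes each convolution filter with a per-filter scale (weight normalization) and computes an $L_2$ penalty on those scales. The `mask` argument that picks which scales are penalized, the `strength` coefficient, and the function names are assumptions made for this example; the excerpt does not specify how ASSL selects or schedules them.

```python
import numpy as np


def weight_normalize(v: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Return filters w_i = g_i * v_i / ||v_i|| for v of shape (out, in, k, k)."""
    norms = np.sqrt((v.reshape(v.shape[0], -1) ** 2).sum(axis=1))
    return v * (g / norms)[:, None, None, None]


def scale_l2_penalty(g: np.ndarray, mask: np.ndarray, strength: float) -> float:
    """L2 penalty on the scale parameters selected by a 0/1 mask (hypothetical)."""
    return float(strength * np.sum((g * mask) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    v = rng.normal(size=(3, 2, 3, 3))   # direction parameters
    g = np.array([1.0, 2.0, 3.0])       # one scale per output filter
    w = weight_normalize(v, g)
    print(np.sqrt((w.reshape(3, -1) ** 2).sum(axis=1)))  # filter norms follow g
    print(scale_l2_penalty(g, np.array([0.0, 1.0, 1.0]), 0.5))
```

With this parameterization, driving a scale toward zero shrinks the whole filter, so sparsity in the scales corresponds to structured (filter-level) sparsity.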
For slow learning of graph similarity, this paper proposes a novel early-fusion approach by designing a co-attention-based feature fusion network on multilevel GNN features.
The HSI representations are highly similar and correlated across the spectral dimension.
We take a fresh look at this problem, by considering a setting in which the robot is limited to storing that knowledge and experience only in the form of learned skill policies.
Specifically, for the layers connected by the same residual, we select the filters of the same indices as unimportant filters.
As the inverse process of snapshot compressive imaging, the hyperspectral image (HSI) reconstruction takes the 2D measurement as input and posteriorly retrieves the captured 3D spatial-spectral signal.
In this paper, we propose a novel framework, MemREIN, which considers Memorized, Restitution, and Instance Normalization for cross-domain few-shot learning.
The emerging technology of snapshot compressive imaging (SCI) enables capturing high dimensional (HD) data in an efficient way.
We explore possible methods for multi-task transfer learning which seek to exploit the shared physical structure of robotics tasks.
The intuition is that gradient with momentum contains more accurate directional information and therefore its second moment estimation is a more favorable option for learning rate scaling than that of the raw gradient.
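The idea can be sketched as an Adam-style update in which the second-moment estimate tracks the squared momentum term instead of the squared raw gradient. This is a minimal sketch under assumptions: the hyperparameter defaults, the Adam-like bias correction, and the choice to square the bias-corrected momentum are illustrative and may differ from the formulation in the paper.

```python
import numpy as np


def momentum_second_moment_step(theta, grad, m, v, t,
                                lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One update step; t is the 1-based step count."""
    # First moment: exponential moving average of the gradient (momentum).
    m = beta1 * m + (1.0 - beta1) * grad
    m_hat = m / (1.0 - beta1 ** t)
    # Second moment of the momentum term, not of the raw gradient.
    # Squaring the bias-corrected momentum is an assumption of this sketch.
    v = beta2 * v + (1.0 - beta2) * m_hat * m_hat
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v


if __name__ == "__main__":
    # Toy problem: minimize f(x) = x^2, whose gradient is 2x.
    theta = np.array([5.0])
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, 4):
        theta, m, v = momentum_second_moment_step(theta, 2.0 * theta, m, v, t, lr=0.1)
        print(t, theta)
```

Compared with standard Adam, the only structural change is which quantity feeds the second-moment average; everything else in the step is kept the same to make the difference easy to see.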
When studying robots collaborating with humans, much of the focus has been on robot policies that coordinate fluently with human teammates in collaborative tasks.
The extraction of auto-correlation in images has shown great potential in deep learning networks, such as the self-attention mechanism in the channel domain and the self-similarity mechanism in the spatial domain.
They also fail to sense the entire space of the input, which is critical for high-quality MR image SR. To address those problems, we propose squeeze and excitation reasoning attention networks (SERAN) for accurate MR image SR. We propose to squeeze attention from global spatial information of the input and obtain global descriptors.
Ranked #2 on Super-Resolution on IXI
This paper studies Semi-Supervised Domain Adaptation (SSDA), a practical yet under-investigated research topic that aims to learn a model of good performance using unlabeled samples and a few labeled samples in the target domain, with the help of labeled samples from a source domain.
A naïve method is to decompose it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
Neural network pruning typically removes connections or neurons from a pretrained converged model; while a new pruning paradigm, pruning at initialization (PaI), attempts to prune a randomly initialized network.
GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model, for the image to be faithfully reconstructed from the inverted code by the generator.
Specifically, we propose a dynamic high-pass filtering (HPF) module that locally applies adaptive filter weights for each spatial location and channel group to preserve high-frequency signals.
However, the basic convolutional layer in CNNs is designed to extract local patterns, lacking the ability to model global context.
Inspired by the robustness and efficiency of sparse representation in sparse coding based image restoration models, we investigate the sparsity of neurons in deep networks.
Self-similarity refers to the image prior widely used in image restoration algorithms that small but similar patterns tend to occur at different locations and scales.
The recent flourishing of deep learning in various tasks is largely attributed to rich and accessible labeled data.
Rather than synthesizing missing LR video frames as VFI networks do, we first temporally interpolate the LR frame features of the missing LR video frames, capturing local temporal contexts with the proposed feature temporal interpolation network.
Ranked #4 on Video Frame Interpolation on Vid4 - 4x upscaling
We aim to super-resolve digital paintings, synthesizing realistic details from high-resolution reference painting materials for very large scaling factors (e.g., 8X, 16X).
As for SR, the proposed method recovers sharper edges and more details from LR face images than other state-of-the-art methods, which we demonstrate qualitatively and quantitatively.
It outperforms the current best method by 6.8% relatively for image retrieval and 4.8% relatively for caption retrieval on MS-COCO (Recall@1 using 1K test set).
Ranked #7 on Image Retrieval on Flickr30K 1K test
Most current image super-resolution (SR) methods based on convolutional neural networks (CNNs) use residual learning in network structural design, which favors effective back-propagation and hence improves SR performance by increasing model scale.
An assumption widely used in recent neural style transfer methods is that image styles can be described by global statistics of deep features like Gram or covariance matrices.
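To make "global statistics of deep features" concrete, here is a minimal sketch that computes a Gram matrix and a covariance matrix from a single `(C, H, W)` feature map. Dividing by the number of spatial positions is one common convention and is an assumption here, as are the function names.

```python
import numpy as np


def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Channel-by-channel inner products, averaged over spatial positions."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)


def covariance_matrix(features: np.ndarray) -> np.ndarray:
    """Same as the Gram matrix, but with each channel's spatial mean removed."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    f = f - f.mean(axis=1, keepdims=True)
    return f @ f.T / (h * w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 8, 8))
    print(gram_matrix(feats).shape, covariance_matrix(feats).shape)
```

Both matrices sum over all spatial positions, so they are unchanged when the positions are shuffled; this is what makes them global descriptors that carry no information about spatial layout.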
To address this issue, we design local and non-local attention blocks to extract features that capture the long-range dependencies between pixels and pay more attention to the challenging parts.
We fully exploit the hierarchical features from all the convolutional layers.
Video super-resolution (VSR) aims to restore a photo-realistic high-resolution (HR) video frame from both its corresponding low-resolution (LR) frame (reference frame) and multiple neighboring frames (supporting frames).
The proposed CSN model divides the hierarchical features into two branches, i.e., residual branch and dense branch, with different information transmissions.
Ranked #3 on Super-Resolution on IXI
To ensure scalability and separability, a softmax-like function is formulated to push apart the positive and negative support sets.
To solve these problems, we propose the very deep residual channel attention networks (RCAN).
Ranked #14 on Image Super-Resolution on BSD100 - 4x upscaling
In this paper, we propose a novel residual dense network (RDN) to address this problem in image SR. We fully exploit the hierarchical features from all the convolutional layers.
Ranked #3 on Color Image Denoising on CBSD68 sigma50