To this end, we present a review, organized as a taxonomy, of existing work on skeleton-based action recognition.
Overparameterization is a common technique in deep learning that helps models learn and generalize sufficiently on a given task; nonetheless, it often leads to enormous network structures and consumes considerable computing resources during training.
Inspired by the human cognitive system, attention is a mechanism that imitates the human cognitive awareness about specific information, amplifying critical details to focus more on the essential aspects of data.
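As a rough illustration of how attention amplifies the most relevant parts of the data, the sketch below implements generic scaled dot-product attention in plain Python. It is a minimal example and not the architecture of any specific paper listed here; the names `softmax`, `attend`, `query`, `keys`, and `values` and all numbers are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Weight each value by how well its key matches the query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Hypothetical toy data: the first key matches the query best.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]
weights, output = attend(query, keys, values)
```

The value whose key aligns with the query receives the largest weight, so it dominates the output, which is the "focus" behaviour the sentence above describes.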
Consequently, we present a new perspective on SOD: performing RGB-D SOD without acquiring depth data during training or testing, and assisting RGB methods with depth clues for improved performance.
The best Mean Average Precision (mAP@0.5) of 98.8% for vehicle type recognition, 98.5% for license plate detection, and 98.3% for license plate reading is achieved by YOLOv4, while its lighter version, i.e., Tiny YOLOv4, obtained a mAP of 97.1%, 97.4%, and 93.7% on vehicle type recognition, license plate detection, and license plate reading, respectively.
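The "@0.5" in mAP@0.5 refers to the intersection-over-union (IoU) threshold at which a predicted box counts as a correct detection. The sketch below shows only that matching step, assuming boxes in `(x1, y1, x2, y2)` format; the function names and sample boxes are hypothetical, and a full mAP computation would additionally average precision over recall levels and classes.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction matches the ground truth when IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold

# Hypothetical boxes: one slightly shifted, one mostly off target.
gt = (0, 0, 10, 10)
good = (1, 1, 11, 11)
bad = (6, 6, 16, 16)
```

Here `good` overlaps the ground truth enough to pass the 0.5 threshold, while `bad` does not.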
Although Transformers have been adopted across various computer vision tasks, directly leveraging them for image dehazing is challenging: 1) they tend to produce ambiguous and coarse details that are undesirable for image reconstruction; 2) existing Transformer position embeddings are provided in logical or spatial position order, which neglects variations in haze density and results in sub-optimal dehazing performance.
Object detection in three-dimensional (3D) space attracts much interest from academia and industry since it is an essential task in AI-driven applications such as robotics, autonomous driving, and augmented reality.
There are 2000 reference restored images and 6003 original underwater images in the unpaired training set.
Such GNNs are incapable of learning relative positions between graph nodes within a graph.
The prevalent convolutional neural network (CNN) based image denoising methods extract features of images to restore the clean ground truth, achieving high denoising accuracy.
Recent skeleton-based action recognition methods extract features from 3D joint coordinates as spatial-temporal cues, using these representations in a graph neural network for feature fusion to boost recognition performance.
Ranked #4 on Skeleton Based Action Recognition on NTU RGB+D 120
As a result, our network can effectively improve the visual quality of underwater images by exploiting multiple color spaces embedding and the advantages of both physical model-based and learning-based methods.
Ranked #2 on Underwater Image Restoration on LSUI (using extra training data)
InvDN transforms the noisy input into a low-resolution clean image and a latent representation containing noise.
Underwater image restoration attracts significant attention due to its importance in unveiling the underwater world.
Given the prominence of current 3D sensors, a fine-grained analysis of basic point cloud data is worthy of further investigation.
Ranked #6 on Semantic Segmentation on S3DIS
In this paper, inspired by the best background/foreground separation abilities of deformable convolutions, we employ them in our Densely Deformable Network (DDNet) to achieve efficient SOD.
Ranked #2 on RGB-D Salient Object Detection on SIP (Average MAE metric, using extra training data)
Identifying the information-lossless condition for deep neural architectures is important, because tasks such as image restoration require keeping as much of the detailed information of the input data as possible.
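One well-known way to obtain a layer that loses no information is an additive coupling transform, in which half of the input is passed through unchanged and used to shift the other half. The sketch below is a generic illustration of that idea, not the condition or architecture derived in the paper above; `shift`, `forward`, and `inverse` are hypothetical names, and `shift` stands in for an arbitrary learned sub-network.

```python
import math

def shift(x):
    """Arbitrary function; it does not need to be invertible itself."""
    return [math.tanh(v) * 2.0 for v in x]

def forward(x1, x2):
    """Keep x1 unchanged and shift x2 by a function of x1."""
    y1 = list(x1)
    y2 = [b + s for b, s in zip(x2, shift(x1))]
    return y1, y2

def inverse(y1, y2):
    """Undo the shift; possible because y1 equals x1."""
    x1 = list(y1)
    x2 = [b - s for b, s in zip(y2, shift(y1))]
    return x1, x2

# Hypothetical input split into two halves.
x1, x2 = [0.5, -1.0], [3.0, 4.0]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
```

Because the inverse recomputes the same shift from the untouched half, the input is recovered up to floating-point rounding, so no detail is discarded by the layer.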
Our framework includes two main models: 1) a generator model, which maps the input image and latent variable to stochastic saliency prediction, and 2) an inference model, which gradually updates the latent variable by sampling it from the true or approximate posterior distribution.
Ranked #1 on RGB Salient Object Detection on DUTS-test (MAE metric)
Image colorization is the process of estimating RGB colors for grayscale images or video frames to improve their aesthetic and perceptual quality.
Our DRNet is designed to learn local point features from the point cloud in different resolutions.
Ranked #17 on 3D Part Segmentation on ShapeNet-Part
Furthermore, the evaluation in terms of quantitative metrics and visual quality for four restoration tasks, i.e., denoising, super-resolution, raindrop removal, and JPEG compression, on 11 real degraded datasets against more than 30 state-of-the-art algorithms demonstrates the superiority of our R$^2$Net.
We propose to learn a fully-convolutional network model that consists of a Chain of Identity Mapping Modules and a residual-on-the-residual architecture for image denoising.
Additionally, to the best of our knowledge, our method is the first specialized method to super-resolve mosaic images, whether it be multi-spectral or Bayer.
In this paper, we propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Ranked #4 on RGB-D Salient Object Detection on LFSD
In this work, we investigate the performance of the landmark general CNN classifiers, which presented top-notch results on large scale classification datasets, on the fine-grained datasets, and compare it against state-of-the-art fine-grained classifiers.
As the basic task of point cloud analysis, classification is fundamental but always challenging.
Ranked #14 on 3D Point Cloud Classification on ModelNet40
This study highlights the importance of deep learning for the analysis of fluorescence microscopy protein imagery.
This mosaic image is then merged with the mosaic image generated by the SR network to produce a quantitatively superior image.
We perform the classification of ancient Roman Republican coins via recognizing their reverse motifs where various objects, faces, scenes, animals, and buildings are minted along with legends.
In this paper, our main aim is two-fold: 1) to provide a comprehensive and in-depth survey of deep learning-based underwater image enhancement, which covers various perspectives ranging from algorithms to open issues, and 2) to conduct a qualitative and quantitative comparison of the deep algorithms on diverse datasets to serve as a benchmark, which has barely been explored before.
Super-Resolution convolutional neural networks have recently demonstrated high-quality restoration for single images.
Ranked #1 on Image Super-Resolution on BSD100 - 8x upscaling
Deep convolutional network-based super-resolution is a fast-growing field with numerous practical applications.
Deep convolutional neural networks perform better on images containing spatially invariant noise (synthetic noise); however, their performance is limited on real noisy photographs and requires multi-stage network modeling.
Ranked #1 on Color Image Denoising on BSD68 sigma15
In an underwater scene, wavelength-dependent light absorption and scattering degrade the visibility of images, causing low contrast and distorted color casts.
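The effect can be illustrated with a simplified image formation model that is commonly used for underwater scenes: each color channel is attenuated exponentially with distance and mixed with background light. The sketch below assumes this simplified model; the function name `degrade` and all coefficients are made-up illustrative values, not measurements from any paper listed here.

```python
import math

def degrade(clean_rgb, depth_m, attenuation, backlight):
    """Apply per-channel attenuation and backscatter to one RGB pixel."""
    observed = []
    for j, beta, b in zip(clean_rgb, attenuation, backlight):
        t = math.exp(-beta * depth_m)  # fraction of scene light that survives
        observed.append(j * t + b * (1.0 - t))
    return observed

clean = [0.8, 0.6, 0.4]
attenuation = [0.40, 0.10, 0.05]  # assumed: red fades fastest
backlight = [0.05, 0.35, 0.45]    # assumed bluish-green water color
near = degrade(clean, 1.0, attenuation, backlight)
far = degrade(clean, 10.0, attenuation, backlight)
```

With these assumed coefficients the red channel collapses toward the background light much faster than green or blue, which produces the color cast, and all channels drift toward the same background values, which reduces contrast.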
We propose to learn a fully-convolutional network model that consists of a Chain of Identity Mapping Modules (CIMM) for image denoising.