In this paper, we propose multi-agent automated machine learning (MA2ML), which aims to effectively handle the joint optimization of modules in automated machine learning (AutoML).
Current point-cloud detection methods have difficulty detecting open-vocabulary objects in the real world because of their limited generalization capability.
In this article, we propose CrowdX, a simulated crowd counting dataset that offers large scale, accurate labeling, parameterized realization, and high fidelity.
Our method integrates the pedestrian's head and body information to enhance the feature representation of the density map.
This paper applies the Transformer model and the conditional variational autoencoder (CVAE) to the graphic design layout generation task.
We combine these two methods and demonstrate their effectiveness on both CNN-based neural networks and WGAN-based neural networks with comprehensive experiments.
Compared with the existing practice of feature concatenation, we find that uncovering the correlation among the three factors is a superior way of leveraging the pivotal contextual cues provided by edges and poses.
The FFA-Net architecture consists of three key components: 1) A novel Feature Attention (FA) module that combines Channel Attention with a Pixel Attention mechanism, considering that different channel-wise features carry very different weighted information and that haze is unevenly distributed across image pixels.
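The idea of chaining channel attention and pixel attention can be sketched in a few lines. This is a minimal illustration, not the FFA-Net implementation: the function names, the weight arguments `w` and `v`, and the use of a single linear layer followed by a sigmoid for each gate are all assumptions made here for brevity, and the feature map is a plain nested list of shape C x H x W.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat, w):
    """Scale each channel by one gate in (0, 1).

    feat: C x H x W nested lists.
    w:    C x C mixing weights (hypothetical parameters).
    """
    num_channels = len(feat)
    # Global average pooling: one scalar per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat
    ]
    # Assumed simplification: one linear layer + sigmoid per channel gate.
    gates = [
        sigmoid(sum(w[i][j] * pooled[j] for j in range(num_channels)))
        for i in range(num_channels)
    ]
    return [
        [[gates[c] * value for value in row] for row in feat[c]]
        for c in range(num_channels)
    ]


def pixel_attention(feat, v):
    """Scale each spatial position by one gate in (0, 1).

    v: length-C weights of a 1x1 projection (hypothetical parameters).
    """
    num_channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    # One gate per pixel, computed from all channels at that pixel.
    gate = [
        [
            sigmoid(sum(v[c] * feat[c][y][x] for c in range(num_channels)))
            for x in range(width)
        ]
        for y in range(height)
    ]
    return [
        [[gate[y][x] * feat[c][y][x] for x in range(width)] for y in range(height)]
        for c in range(num_channels)
    ]


def feature_attention(feat, w, v):
    """Channel attention followed by pixel attention."""
    return pixel_attention(channel_attention(feat, w), v)
```

With all-zero weights both gates equal 0.5, so every value is scaled by 0.25; learned weights would instead give different channels and different pixels different emphasis, which is the motivation stated above for uneven haze.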
Ranked #1 on Image Dehazing on KITTI
Blind image deblurring is a challenging problem in computer vision, which aims to restore both the blur kernel and the latent sharp image from only a blurry observation.
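For orientation, the spatially uniform blur model commonly used to pose this problem can be written as follows. The symbols are assumptions introduced here, not notation from the paper: $B$ is the blurry observation, $k$ the blur kernel, $I$ the latent sharp image, $n$ additive noise, and $\ast$ denotes 2D convolution.

```latex
B = k \ast I + n
```

Only $B$ is observed, while both $k$ and $I$ are unknown, so many kernel and image pairs explain the same observation; this ill-posedness is what makes the blind setting hard.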
Person re-identification (ReID) is a challenging task due to arbitrary human pose variations, background clutter, etc.
In this paper, we propose a novel tracklet processing method that cleaves and re-connects tracklets under crowding or long-term occlusion using a Siamese Bi-Gated Recurrent Unit (GRU).
Ranked #20 on Multi-Object Tracking on MOT16
In this letter, we address the problem of estimating the Gaussian noise level from the trained dictionaries in the update stage.
Color artifacts of demosaicked images are often found at contours due to interpolation across edges and cross-channel aliasing.
The multiscale dictionary is treated as the product of an oscillating dictionary and a tolerance dictionary.
Aerial images are often degraded by space-varying motion blur and simultaneous uneven illumination.
In this letter, we propose a novel image denoising method based on correlation-preserving sparse coding.