When deploying a robot to a new task, one often has to train it to detect novel objects, which is time-consuming and labor-intensive.
This is mainly because the correlation volume, the basis of pixel matching, is computed as the dot product of the convolutional features of the two images.
Ranked #7 on Optical Flow Estimation on KITTI 2015 (train)
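The dot-product correlation volume mentioned above can be illustrated with a minimal NumPy sketch. It assumes two feature maps of identical shape `(C, H, W)`. The function names `correlation_volume` and `best_match`, the random features, and the use of `einsum` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def correlation_volume(feat1, feat2):
    """All-pairs dot-product correlation between two feature maps.

    feat1, feat2: arrays of shape (C, H, W).
    Returns an array of shape (H, W, H, W) whose entry [i, j, k, l] is the
    dot product of feat1[:, i, j] and feat2[:, k, l].
    """
    if feat1.shape != feat2.shape:
        raise ValueError("feature maps must have the same shape")
    # Contract over the channel axis for every pair of pixel locations.
    return np.einsum("cij,ckl->ijkl", feat1, feat2)


def best_match(corr, i, j):
    """Return the (row, col) in image 2 with the highest correlation to pixel (i, j)."""
    width2 = corr.shape[3]
    flat_index = int(np.argmax(corr[i, j]))  # argmax over the flattened (H, W) slice
    return divmod(flat_index, width2)


# Hypothetical stand-ins for convolutional features (assumption: random values).
rng = np.random.default_rng(0)
f1 = rng.standard_normal((8, 4, 5))
f2 = rng.standard_normal((8, 4, 5))
corr = correlation_volume(f1, f2)

# Correlating unit-normalised features with themselves: each pixel matches itself.
f1_unit = f1 / np.linalg.norm(f1, axis=0, keepdims=True)
corr_self = correlation_volume(f1_unit, f1_unit)
```

Because each entry depends only on the two feature vectors involved, the quality of the matching is bounded by how discriminative those features are, which is the dependence the snippet above points to.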
The proposed method learns keypoints from camera images as the state representation, through a self-supervised autoencoder architecture.
Given an initial pose and the generated whole-body grasping pose as the starting and ending poses of the motion respectively, we design a novel contact-aware generative motion infilling module to generate a diverse set of grasp-oriented motions.
With the development of Edge Computing and Artificial Intelligence (AI) technologies, edge devices now generate data at unprecedented volume.
Gradient-based methods for two-player games produce rich dynamics that can solve challenging problems, yet can be difficult to stabilize and understand.
To address this problem, a novel end-to-end supervised classification method is proposed for HR SAR images by considering both spatial context and statistical features.
Modern solutions to the single image super-resolution (SISR) problem using deep neural networks aim not only at better accuracy but also at lighter, more computationally efficient models.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error in discretising the continuous dynamics.
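The role of integration error can be illustrated on a toy problem. The sketch below is not the paper's experiment: it assumes the bilinear game min over x, max over y of x*y, whose continuous gradient dynamics are dx/dt = -y, dy/dt = x. The exact flow is a rotation that keeps the distance to the equilibrium (0, 0) constant, so any growth in that distance comes from the discretisation. The step size, step count, and function names are assumptions chosen for illustration.

```python
import numpy as np


def field(z):
    """Gradient vector field of the toy game min_x max_y x*y (an assumption)."""
    x, y = z
    return np.array([-y, x])  # x descends on x*y, y ascends on x*y


def euler_step(z, h):
    """One explicit Euler step, i.e. simultaneous gradient descent-ascent."""
    return z + h * field(z)


def rk4_step(z, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = field(z)
    k2 = field(z + 0.5 * h * k1)
    k3 = field(z + 0.5 * h * k2)
    k4 = field(z + h * k3)
    return z + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(step, z0, h, n_steps):
    """Apply `step` repeatedly starting from z0 and return the final state."""
    z = np.asarray(z0, dtype=float)
    for _ in range(n_steps):
        z = step(z, h)
    return z


z0 = [1.0, 0.0]  # starts at distance 1 from the equilibrium
euler_norm = float(np.linalg.norm(integrate(euler_step, z0, 0.1, 500)))
rk4_norm = float(np.linalg.norm(integrate(rk4_step, z0, 0.1, 500)))
print(f"Euler distance: {euler_norm:.3f}, RK4 distance: {rk4_norm:.6f}")
```

In this toy setting each Euler step multiplies the distance to the equilibrium by sqrt(1 + h^2), so the iterates spiral outwards, while the higher-order integrator stays close to the unit circle at the same step size. Whether the same picture carries over to full GAN training is the hypothesis stated above, not something this sketch demonstrates.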
To address this problem, we first introduce a geometrically rich and diverse SPD neural architecture search space for an efficient SPD cell design.
We design a lightweight yet powerful backbone with dense connectivity to facilitate feature reuse throughout the whole network, together with the proposed Dual-Path module (DPM) to sufficiently aggregate multi-scale contexts.
This paper aims to extend the problem of Neural Architecture Search (NAS) from Single-Path and Multi-Path Search to automated Mixed-Path Search.
The proposed method is able to efficiently generalize the previously learned task by model fusion to solve the environment adaptation problem.
Training generative adversarial networks requires balancing of delicate adversarial dynamics.
Ranked #1 on Conditional Image Generation on ImageNet 128x128
CV-FCN employs a complex downsampling-then-upsampling scheme to extract dense features.
To the best of our knowledge, this is the first work to explore effective intra- and inter-modality fusion in 6D pose estimation.
Real-time multi-target path planning is a key issue in the field of autonomous driving.
Vegetation is the natural linkage connecting soil, atmosphere and water.
Based on the idea of memory writing as inference, as proposed in the Kanerva Machine, we show that a likelihood-based Lyapunov function emerges from maximising the variational lower-bound of a generative memory.
Humans spend a remarkable fraction of waking life engaged in acts of "mental time travel".
We propose DFNet and make two main contributions: dynamic loss weights and a residual fusion block (RFB).
At the same time, we propose a segmentation method for parking slots and lane markings that uses a highly fused convolutional network (HFCN) and is based on the PSV dataset.
We evaluate our model on three major segmentation datasets: CamVid, PASCAL VOC and ADE20K.