Part-aware panoptic segmentation is a computer vision problem that aims to provide a semantic understanding of the scene at multiple levels of granularity.
For CPP, the PartPQ of our proposed model with joint fusion surpasses the previous state-of-the-art by 1.6 and 4.7 percentage points for all areas and segments with parts, respectively.
Humans understand this concept at a young age and know that another person is still there, even when they are temporarily occluded.
In class-incremental semantic segmentation (CISS), deep learning architectures suffer from the critical problems of catastrophic forgetting and semantic background shift.
In recent years, deep neural networks have shown exceptional capabilities in addressing many computer vision tasks, including scene flow prediction.
The proposed RMS-FlowNet is a novel end-to-end learning-based architecture for accurate and efficient scene flow estimation that can operate on high-density point clouds.
This paper presents an iterative multi-scale coarse-to-fine refinement (iCFR) framework that bridges this gap: it can adopt any stereo matching network and make it faster, more efficient, and scalable while keeping comparable accuracy.
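Sketched in code, a generic coarse-to-fine refinement loop might look like the following. This is a minimal illustration of the general scheme, not the authors' iCFR implementation; `match_net` stands in for an arbitrary stereo matching network, and `warp_horizontal` is a hypothetical helper.

```python
import torch
import torch.nn.functional as F

def warp_horizontal(features, disparity):
    """Sample `features` (B, C, H, W) shifted left by `disparity` (B, 1, H, W)."""
    b, _, h, w = features.shape
    ys, xs = torch.meshgrid(torch.arange(h, dtype=features.dtype),
                            torch.arange(w, dtype=features.dtype),
                            indexing="ij")
    xs = xs.unsqueeze(0) - disparity[:, 0]            # apply horizontal shift
    grid = torch.stack((2.0 * xs / (w - 1) - 1.0,     # normalize x to [-1, 1]
                        (2.0 * ys / (h - 1) - 1.0).unsqueeze(0).expand_as(xs)),
                       dim=-1)
    return F.grid_sample(features, grid, align_corners=True)

def coarse_to_fine(left_pyramid, right_pyramid, match_net):
    """Refine disparity from the coarsest to the finest scale.

    The pyramids are lists of feature maps ordered coarse -> fine. At each
    finer level, the previous estimate is upsampled (disparity values scale
    with image width), used to pre-warp the right features, and only a
    residual correction is predicted.
    """
    disparity = None
    for f_left, f_right in zip(left_pyramid, right_pyramid):
        if disparity is None:
            disparity = match_net(f_left, f_right)
        else:
            scale = f_left.shape[-1] / disparity.shape[-1]
            disparity = scale * F.interpolate(disparity,
                                              size=f_left.shape[-2:],
                                              mode="bilinear",
                                              align_corners=False)
            f_right = warp_horizontal(f_right, disparity)
            disparity = disparity + match_net(f_left, f_right)
    return disparity
```

Because each level only predicts a residual on a pre-warped input, the per-level matching problem stays small, which is what makes the scheme fast and scalable.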
Contrary to the ongoing trend in automotive applications toward using more, and more diverse, sensors, this work tackles the complex scene flow problem with a monocular camera setup, i.e., using a single sensor.
In-the-wild human pose estimation has huge potential for various fields, ranging from animation and action recognition to intention recognition and prediction for autonomous driving.
Interpolation of sparse pixel information towards a dense target resolution finds application across many disciplines in computer vision.
In this paper, we present DeepLiDARFlow, a novel deep learning architecture which fuses high-level RGB and LiDAR features at multiple scales in a monocular setup to predict dense scene flow.
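As a rough sketch of what fusing the two modalities per scale can look like, consider the module below. The layer choices (1x1 convolutions on concatenated features) are illustrative assumptions, not DeepLiDARFlow's actual fusion blocks.

```python
import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    """Illustrative per-scale fusion of RGB and LiDAR feature pyramids."""

    def __init__(self, channels=(32, 64, 96)):
        super().__init__()
        # One 1x1 fusion convolution per pyramid level, mapping the
        # concatenated RGB + LiDAR features back to `c` channels.
        self.fuse = nn.ModuleList(
            nn.Sequential(nn.Conv2d(2 * c, c, kernel_size=1), nn.ReLU())
            for c in channels)

    def forward(self, rgb_pyramid, lidar_pyramid):
        # Both pyramids are lists of (B, c, H, W) tensors, matched per level.
        return [fuse(torch.cat((f_rgb, f_lidar), dim=1))
                for fuse, f_rgb, f_lidar
                in zip(self.fuse, rgb_pyramid, lidar_pyramid)]
```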
Thus, we present ResFPN -- a multi-resolution feature pyramid network with multiple residual skip connections, where, at any scale, we leverage the information from higher-resolution maps for stronger and better-localized features.
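The core idea can be sketched as follows. This is a simplified reading of "residual skip connections from higher-resolution maps"; the real ResFPN blocks are more involved.

```python
import torch.nn.functional as F

def resfpn_level(current, finer_maps):
    """Aggregate residuals from higher-resolution maps into one level.

    `current` is the feature map at this scale; `finer_maps` is a list of
    higher-resolution maps with the same channel count. Each finer map is
    resized to the current scale and added as a residual, injecting
    better-localized detail into the coarser representation.
    """
    out = current
    for fine in finer_maps:
        out = out + F.interpolate(fine, size=current.shape[-2:],
                                  mode="bilinear", align_corners=False)
    return out
```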
We propose a new approach called LiDAR-Flow to robustly estimate dense scene flow by fusing sparse LiDAR measurements with stereo images.
In the last few years, convolutional neural networks (CNNs) have demonstrated increasing success at learning many computer vision tasks including dense estimation problems such as optical flow and stereo matching.
Not only the tuning of hyperparameters, but also the gathering and selection of training data, the design of the loss function, and the construction of training schedules are important to get the most out of a model.
Our network has a very large receptive field and avoids striding layers to maintain spatial resolution.
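The sentence does not spell out the mechanism, but one standard way to achieve exactly this trade-off is stacking dilated convolutions; the sketch below is written under that assumption and is not taken from the paper.

```python
import torch.nn as nn

# Stacked dilated 3x3 convolutions: with dilations 1, 2, 4, 8 the
# receptive field grows to 31x31 pixels, yet every layer runs with
# stride 1 and matching padding, so spatial resolution is preserved.
context_net = nn.Sequential(*[
    nn.Sequential(nn.Conv2d(32, 32, kernel_size=3, padding=d, dilation=d),
                  nn.ReLU())
    for d in (1, 2, 4, 8)])
```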
State-of-the-art scene flow algorithms pursue the conflicting targets of accuracy, run time, and robustness.
Scene flow describes the 3D position as well as the 3D motion of each pixel in an image.
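Written out, one common parameterization (one convention among several; stated here as an assumption) assigns to every pixel $(x, y)$ its reconstructed 3D point and its displacement between the two time steps:

$$
s(x, y) = \left( X, Y, Z,\; \Delta X, \Delta Y, \Delta Z \right),
$$

where $(X, Y, Z)$ is the 3D position of the pixel and $(\Delta X, \Delta Y, \Delta Z)$ its 3D motion. In stereo benchmarks such as KITTI, the same information is often encoded equivalently as the two disparities $d_{t_0}, d_{t_1}$ plus the optical flow $(u, v)$.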
Thus, in this paper we propose FlowFields++, which combines the accurate matches of Flow Fields with a robust interpolation.
Scene flow is a description of real world motion in 3D that contains more information than optical flow.
While most scene flow methods use either variational optimization or a strong rigid motion assumption, we show for the first time that scene flow can also be estimated by dense interpolation of sparse matches.
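A minimal illustration of the interpolation step, using SciPy's generic `griddata` as a stand-in for the robust, edge-aware interpolation such methods actually rely on; the helper name is hypothetical.

```python
import numpy as np
from scipy.interpolate import griddata

def densify_matches(coords, values, height, width):
    """Interpolate sparse matched values to a dense per-pixel map.

    coords: (N, 2) array of (y, x) pixel positions of sparse matches.
    values: (N,) matched quantity at those positions (e.g. disparity).
    Returns a dense (height, width) map. Plain linear interpolation is
    used purely for illustration; it is not edge-preserving.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    dense = griddata(coords, values, (ys, xs), method="linear")
    # Linear interpolation leaves NaNs outside the convex hull of the
    # matches; fill those pixels with nearest-neighbor values.
    holes = np.isnan(dense)
    dense[holes] = griddata(coords, values, (ys, xs),
                            method="nearest")[holes]
    return dense
```

In practice, interpolation-based scene flow methods additionally filter outliers from the sparse matches first and weight the interpolation by image edges rather than plain Euclidean distance.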