Recent work on hyperparameters optimization (HPO) has shown the possibility of training certain hyperparameters together with regular parameters.
Motivated by this ability, we present a new self-supervised learning representation framework that can be directly deployed on a video stream of complex scenes with many moving objects.
On two large-scale real-world datasets, nuScenes and ATG4D, we showcase that our scene-occupancy predictions are more accurate and better calibrated than those from state-of-the-art motion forecasting methods, while also matching their performance in pedestrian motion forecasting metrics.
3D shape completion for real data is important but challenging, since partial point clouds acquired by real-world sensors are usually sparse, noisy and unaligned.
Deep neural nets typically perform end-to-end backpropagation to learn the weights, a procedure that creates synchronization constraints in the weight update step across layers and is not biologically plausible.
Obtaining precise instance segmentation masks is of high importance in many modern applications such as robotic manipulation and autonomous driving.
In this paper, we propose PolyTransform, a novel instance segmentation algorithm that produces precise, geometry-preserving masks by combining the strengths of prevailing segmentation approaches and modern polygon-based methods.
Ranked #1 on Instance Segmentation on Cityscapes test (using extra training data)
Particularly difficult is the prediction of human behavior.
Recent studies on catastrophic forgetting during sequential learning typically focus on fixing the accuracy of the predictions for a previously learned task.
In practice, it performs similarly to the Hungarian algorithm during inference.
More importantly, we introduce a parameter-free panoptic head which solves the panoptic segmentation via pixel-wise classification.
Ranked #3 on Panoptic Segmentation on KITTI Panoptic Segmentation
Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops.
We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks and hyperparameter optimization for fully connected networks.
Convolutional neural networks (CNNs) are inherently limited to model geometric transformations due to the fixed geometric structures in its building modules.
Ranked #182 on Object Detection on COCO test-dev
Yet, it is non-trivial to transfer the state-of-the-art image recognition networks to videos as per-frame evaluation is too slow and unaffordable.
Ranked #8 on Video Semantic Segmentation on Cityscapes val