Reliable feature correspondence between frames is a critical step in visual odometry (VO) and visual simultaneous localization and mapping (V-SLAM) algorithms.
We propose a novel end-to-end deep neural network that generates dynamic upsampling filters and a residual image, which are computed depending on the local spatio-temporal neighborhood of each pixel to avoid explicit motion compensation.
We reconstruct a set of non-linear factors that make an optimal approximation of the information on the trajectory accumulated by VIO.
We evaluate the model using a calibration dataset with several different lenses and compare the models using metrics relevant to visual odometry, i.e., reprojection error, as well as computation time for the projection and unprojection functions and their Jacobians.
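To make the reprojection-error metric concrete, here is a minimal sketch that assumes a plain pinhole model with intrinsics `fx`, `fy`, `cx`, `cy`. The camera models actually compared are not specified in the sentence above, so the function names and the pinhole choice are illustrative assumptions only.

```python
import math

def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D point (x, y, z) in the camera frame to a pixel (u, v)."""
    x, y, z = point
    if z <= 0.0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)

def unproject(pixel, fx, fy, cx, cy):
    """Map a pixel (u, v) to the unit-length bearing vector that projects onto it."""
    u, v = pixel
    mx = (u - cx) / fx
    my = (v - cy) / fy
    norm = math.sqrt(mx * mx + my * my + 1.0)
    return (mx / norm, my / norm, 1.0 / norm)

def mean_reprojection_error(points, observations, fx, fy, cx, cy):
    """Average Euclidean pixel distance between projected points and observed pixels."""
    total = 0.0
    for p, (u_obs, v_obs) in zip(points, observations):
        u, v = project(p, fx, fy, cx, cy)
        total += math.hypot(u - u_obs, v - v_obs)
    return total / len(points)
```

Under these assumptions, `unproject` followed by `project` returns the original pixel, which is a quick sanity check for any projection/unprojection pair.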
In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment.
Then the whole scene is decomposed into moving foreground and static background by comparing the estimated optical flow and rigid flow derived from the depth and ego-motion.
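The comparison between optical flow and rigid flow can be sketched as follows. This is a hedged illustration, not the method of the sentence above: it assumes a pinhole camera with intrinsics `(fx, fy, cx, cy)`, a relative pose `(R, t)` mapping points from the first frame to the second, and a hypothetical pixel threshold for labeling a pixel as moving.

```python
import math

def rigid_flow(u, v, depth, intr, R, t):
    """Flow at pixel (u, v) induced by camera motion alone, assuming a static scene."""
    fx, fy, cx, cy = intr
    # Back-project the pixel to a 3D point using its depth.
    X = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # Move the point into the second camera frame (assumes it stays in front, z > 0).
    Xn = [R[i][0] * X[0] + R[i][1] * X[1] + R[i][2] * X[2] + t[i] for i in range(3)]
    # Re-project and take the pixel displacement.
    u2 = fx * Xn[0] / Xn[2] + cx
    v2 = fy * Xn[1] / Xn[2] + cy
    return (u2 - u, v2 - v)

def moving_mask(flow, depth, intr, R, t, threshold=1.0):
    """Mark pixels whose optical flow deviates from the rigid flow by more than threshold."""
    mask = []
    for v, row in enumerate(flow):
        mask_row = []
        for u, (fu, fv) in enumerate(row):
            ru, rv = rigid_flow(u, v, depth[v][u], intr, R, t)
            mask_row.append(math.hypot(fu - ru, fv - rv) > threshold)
        mask.append(mask_row)
    return mask
```

Pixels flagged `True` are candidates for the moving foreground; the rest are consistent with a static background under the assumed depth and ego-motion. The fixed threshold is a simplification chosen for the sketch.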
New vision sensors, such as the Dynamic and Active-pixel Vision sensor (DAVIS), incorporate a conventional global-shutter camera and an event-based sensor in the same pixel array.
In this paper, we propose a state-of-the-art video denoising algorithm based on a convolutional neural network architecture.