In this paper, we suggest an approach towards integrating planning with sequence models based on the idea of iterative energy minimization, and illustrate how such a procedure leads to improved RL performance across different tasks.
The OOD score is then determined by combining the deviation from the input data to the ID pattern in both subspaces.
We propose a new 6-DoF grasp pose synthesis approach from 2D/2.5D input based on keypoints.
We present a parallelized optimization method based on fast Neural Radiance Fields (NeRF) for estimating the 6-DoF pose of a camera with respect to an object or scene.
This manuscript describes the specific geometry of the neural network used for zeroing barrier function synthesis, and shows how the network provides the necessary representation for splitting the state space into safe and unsafe regions.
We propose a single-stage, category-level 6-DoF pose estimation algorithm that simultaneously detects and tracks instances of objects within a known category.
Prior work on 6-DoF object pose estimation has largely focused on instance-level processing, in which a textured CAD model is available for each object being detected.
The cost-efficiency of visual(-inertial) SLAM (VSLAM) is a critical characteristic of resource-limited applications.
Visual-inertial SLAM is essential for robot navigation in GPS-denied environments, e.g., indoor or underground.
Each primitive shape is designed with parametrized grasp families, permitting the pipeline to identify multiple grasp candidates per shape primitive region.
Unfortunately, the top performing affordance recognition methods use object category priors to boost the accuracy of affordance detection and segmentation.
As a first pass in this direction, we mount a wireless, monocular color camera on the head of a robotic snake.
This paper describes an enhancement to co-visibility local map building by incorporating a strong appearance prior, which leads to a more compact local map and latency reduction in downstream data association.
The ad-hoc creation of these benchmarks does not necessarily illuminate the particular weak points of a SLAM algorithm when performance is evaluated.
This paper tackles a problem in line-assisted VO/VSLAM: accurately solving the least squares pose optimization with unreliable 3D line input.
By defining the learning problem to be classification with null hypothesis competition instead of regression, the deep neural network with RGB-D image input predicts multiple grasp candidates for a single object or multiple objects, in a single shot.
A codeword for bag-of-words models is generated by packaging the learned descriptor and mask, with a masked Hamming distance defined to measure the distance between two codewords.
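To make the masked Hamming distance concrete, here is a minimal sketch under one plausible definition: count the differing descriptor bits only at positions that both masks mark as valid. The exact definition used in the paper is not given here, so the function name `masked_hamming`, the integer bit-string encoding, and the choice of intersecting the two masks are all assumptions made for illustration.

```python
def masked_hamming(desc_a: int, mask_a: int, desc_b: int, mask_b: int) -> int:
    """Hamming distance restricted to bits that both masks mark as valid.

    Descriptors and masks are encoded as Python ints used as bit strings
    (an assumption for this sketch; a real system would likely use packed bytes).
    """
    valid = mask_a & mask_b          # bits trusted by both codewords
    diff = desc_a ^ desc_b           # bits where the descriptors disagree
    return bin(diff & valid).count("1")


# Two 4-bit codewords: (descriptor, mask)
a = (0b1100, 0b1111)
b = (0b1010, 0b0011)
distance = masked_hamming(a[0], a[1], b[0], b[1])
```

With all-ones masks this reduces to the ordinary Hamming distance, which is one reason the intersection form is a natural choice for a sketch.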
This paper presents a practical and theoretically well-founded approach to improving the speed of kernel manifold learning algorithms that rely on spectral decomposition.
Efficient computation strategies for the observability indices are described based on incremental singular value decomposition (SVD) and greedy selection for the temporal and instantaneous observability indices, respectively.
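The greedy selection step can be sketched as follows. This is an illustration only: it assumes each candidate contributes a small measurement block, and it scores a subset by the smallest singular value of the stacked blocks. Both the scoring rule and the names `greedy_select` and `blocks` are hypothetical and are not taken from the paper.

```python
import numpy as np


def greedy_select(blocks, k):
    """Greedily pick k blocks, each time adding the one that maximizes
    the smallest singular value of the stacked matrix (assumed score)."""
    chosen = []
    remaining = list(range(len(blocks)))
    for _ in range(k):
        best_idx, best_score = None, -np.inf
        for i in remaining:
            stacked = np.vstack([blocks[j] for j in chosen + [i]])
            # Singular values only; the smallest one is the score.
            score = np.linalg.svd(stacked, compute_uv=False).min()
            if score > best_score:
                best_idx, best_score = i, score
        chosen.append(best_idx)
        remaining.remove(best_idx)
    return chosen


# Three hypothetical 2x2 measurement blocks.
blocks = [
    0.1 * np.eye(2),
    2.0 * np.eye(2),
    np.array([[1.0, 0.0], [0.0, 0.0]]),
]
selected = greedy_select(blocks, 2)
```

Each greedy step recomputes a full SVD for clarity; an incremental update would avoid that cost but is omitted to keep the sketch short.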