Search Results for author: Stefan Leutenegger

Found 46 papers, 10 papers with code

DynamicGlue: Epipolar and Time-Informed Data Association in Dynamic Environments using Graph Neural Networks

no code implementations 17 Mar 2024 Theresa Huber, Simon Schaefer, Stefan Leutenegger

The assumption of a static environment is common in many geometric computer vision tasks like SLAM but limits their applicability in highly dynamic scenes.

FuncGrasp: Learning Object-Centric Neural Grasp Functions from Single Annotated Example Object

no code implementations 8 Feb 2024 Hanzhi Chen, Binbin Xu, Stefan Leutenegger

We present FuncGrasp, a framework that can infer dense yet reliable grasp configurations for unseen objects using one annotated object and single-view RGB-D observation via categorical priors.

Object

NeRF-VO: Real-Time Sparse Visual Odometry with Neural Radiance Fields

no code implementations 20 Dec 2023 Jens Naumann, Binbin Xu, Stefan Leutenegger, Xingxing Zuo

We introduce a novel monocular visual odometry (VO) system, NeRF-VO, that integrates learning-based sparse visual odometry for low-latency camera tracking and a neural radiance scene representation for sophisticated dense reconstruction and novel view synthesis.

Depth Estimation Depth Prediction +3

Dynamic LiDAR Re-simulation using Compositional Neural Fields

no code implementations 8 Dec 2023 Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger, Or Litany, Konrad Schindler, Shengyu Huang

We introduce DyNFL, a novel neural field-based approach for high-fidelity re-simulation of LiDAR scans in dynamic driving scenes.

DiffCAD: Weakly-Supervised Probabilistic CAD Model Retrieval and Alignment from an RGB Image

no code implementations 30 Nov 2023 Daoyi Gao, Dávid Rozenberszki, Stefan Leutenegger, Angela Dai

We formulate this as a conditional generative task, leveraging diffusion to learn implicit probabilistic models capturing the shape, pose, and scale of CAD objects in an image.

Retrieval

Anthropomorphic Grasping with Neural Object Shape Completion

no code implementations 4 Nov 2023 Diego Hidalgo-Carvajal, Hanzhi Chen, Gemma C. Bettelani, Jaesug Jung, Melissa Zavaglia, Laura Busse, Abdeldjallil Naceri, Stefan Leutenegger, Sami Haddadin

The progressive prevalence of robots in human-suited environments has given rise to a myriad of object manipulation techniques, in which dexterity plays a paramount role.

Object

AP$n$P: A Less-constrained P$n$P Solver for Pose Estimation with Unknown Anisotropic Scaling or Focal Lengths

1 code implementation 15 Oct 2023 Jiaxin Wei, Stefan Leutenegger, Laurent Kneip

Experimental results on both simulated and real datasets demonstrate the effectiveness of AP$n$P as a more flexible and practical solution to camera pose estimation.

Pose Estimation

Accurate and Interactive Visual-Inertial Sensor Calibration with Next-Best-View and Next-Best-Trajectory Suggestion

1 code implementation 25 Sep 2023 Christopher L. Choi, Binbin Xu, Stefan Leutenegger

Visual-Inertial (VI) sensors are popular in robotics, self-driving vehicles, and augmented and virtual reality applications.

GloPro: Globally-Consistent Uncertainty-Aware 3D Human Pose Estimation & Tracking in the Wild

no code implementations 19 Sep 2023 Simon Schaefer, Dorian F. Henning, Stefan Leutenegger

An accurate and uncertainty-aware 3D human body pose estimation is key to enabling truly safe but efficient human-robot interactions.

3D Human Pose Estimation

BodySLAM++: Fast and Tightly-Coupled Visual-Inertial Camera and Human Motion Tracking

no code implementations 3 Sep 2023 Dorian F. Henning, Christopher Choi, Simon Schaefer, Stefan Leutenegger

Robust, fast, and accurate estimation of the human state (6D pose and posture) remains a challenging problem.

Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning

no code implementations 20 Jun 2023 Anna Penzkofer, Simon Schaefer, Florian Strohm, Mihai Bâce, Stefan Leutenegger, Andreas Bulling

We show that intentions of human players, i.e., the precursors of goal-oriented decisions, can be robustly predicted from eye gaze even for the long-horizon, sparse-reward task of Montezuma's Revenge, one of the most challenging RL tasks in the Atari2600 game suite.

Hierarchical Reinforcement Learning Montezuma's Revenge +2

Incremental Dense Reconstruction from Monocular Video with Guided Sparse Feature Volume Fusion

no code implementations 24 May 2023 Xingxing Zuo, Nan Yang, Nathaniel Merrill, Binbin Xu, Stefan Leutenegger

Incrementally recovering 3D dense structures from monocular videos is of paramount importance since it enables various robotics and AR applications.

Event-based Non-Rigid Reconstruction from Contours

no code implementations 12 Oct 2022 Yuxuan Xue, Haolong Li, Stefan Leutenegger, Jörg Stückler

Visual reconstruction of fast non-rigid object deformations over time is a challenge for conventional frame-based cameras.

Learning to Complete Object Shapes for Object-level Mapping in Dynamic Scenes

no code implementations 9 Aug 2022 Binbin Xu, Andrew J. Davison, Stefan Leutenegger

In this paper, we propose a novel object-level mapping system that can simultaneously segment, track, and reconstruct objects in dynamic scenes.

Instance Segmentation Object +2

Visual-Inertial Multi-Instance Dynamic SLAM with Object-level Relocalisation

1 code implementation 8 Aug 2022 Yifei Ren, Binbin Xu, Christopher L. Choi, Stefan Leutenegger

In this paper, we present a tightly-coupled visual-inertial object-level multi-instance dynamic SLAM system.

3D Reconstruction Object +1

Towards the Probabilistic Fusion of Learned Priors into Standard Pipelines for 3D Reconstruction

no code implementations 27 Jul 2022 Tristan Laidlow, Jan Czarnowski, Andrea Nicastro, Ronald Clark, Stefan Leutenegger

While systems that pass the output of traditional multi-view stereo approaches to a network for regularisation or refinement currently seem to get the best results, it may be preferable to treat deep neural networks as separate components whose results can be probabilistically fused into geometry-based systems.

3D Reconstruction

DeepFusion: Real-Time Dense 3D Reconstruction for Monocular SLAM using Single-View Depth and Gradient Predictions

no code implementations 25 Jul 2022 Tristan Laidlow, Jan Czarnowski, Stefan Leutenegger

While the keypoint-based maps created by sparse monocular simultaneous localisation and mapping (SLAM) systems are useful for camera tracking, dense 3D reconstructions may be desired for many robotic tasks.

3D Reconstruction

Dense RGB-D-Inertial SLAM with Map Deformations

no code implementations 22 Jul 2022 Tristan Laidlow, Michael Bloesch, Wenbin Li, Stefan Leutenegger

While dense visual SLAM methods are capable of estimating dense reconstructions of the environment, they suffer from a lack of robustness in their tracking step, especially when the optimisation is poorly initialised.

3D Reconstruction

BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking

no code implementations 4 May 2022 Dorian F. Henning, Tristan Laidlow, Stefan Leutenegger

Through a series of experiments on video sequences of human motion captured by a moving monocular camera, we demonstrate that BodySLAM improves estimates of all human body parameters and camera poses when compared to estimating these separately.

OKVIS2: Realtime Scalable Visual-Inertial SLAM with Loop Closure

no code implementations 18 Feb 2022 Stefan Leutenegger

Robust and accurate state estimation remains a challenge in robotics and in Augmented and Virtual Reality (AR/VR), even as Visual-Inertial Simultaneous Localisation and Mapping (VI-SLAM) is getting commoditised.

SIMstack: A Generative Shape and Instance Model for Unordered Object Stacks

no code implementations ICCV 2021 Zoe Landgraf, Raluca Scona, Tristan Laidlow, Stephen James, Stefan Leutenegger, Andrew J. Davison

At test time, our model can generate 3D shape and instance segmentation from a single depth view, probabilistically sampling proposals for the occluded region from the learned latent space.

Instance Segmentation Segmentation +2

In-Place Scene Labelling and Understanding with Implicit Scene Representation

no code implementations ICCV 2021 Shuaifeng Zhi, Tristan Laidlow, Stefan Leutenegger, Andrew J. Davison

Semantic labelling is highly correlated with geometry and radiance reconstruction, as scene entities with similar shape and appearance are more likely to come from similar classes.

Denoising Super-Resolution

Deep Probabilistic Feature-metric Tracking

1 code implementation 31 Aug 2020 Binbin Xu, Andrew J. Davison, Stefan Leutenegger

Dense image alignment from RGB-D images remains a critical issue for real-world applications, especially under challenging lighting conditions and in a wide baseline setting.

Object Tracking

Bundle Adjustment on a Graph Processor

1 code implementation CVPR 2020 Joseph Ortiz, Mark Pupilli, Stefan Leutenegger, Andrew J. Davison

Graph processors such as Graphcore's Intelligence Processing Unit (IPU) are part of the major new wave of novel computer architecture for AI, and have a general design with massively parallel computation, distributed on-chip memory and very high inter-core communication bandwidth which allows breakthrough performance for message passing algorithms on arbitrary graphs.

Comparing View-Based and Map-Based Semantic Labelling in Real-Time SLAM

no code implementations 24 Feb 2020 Zoe Landgraf, Fabian Falck, Michael Bloesch, Stefan Leutenegger, Andrew Davison

Generally capable Spatial AI systems must build persistent scene representations where geometric models are combined with meaningful semantic labels.

Towards Bounding-Box Free Panoptic Segmentation

no code implementations 18 Feb 2020 Ujwal Bonde, Pablo F. Alcantarilla, Stefan Leutenegger

Our approach is distinct from previous works in panoptic segmentation that rely on a combination of a semantic segmentation network with a computationally costly instance segmentation network based on bounding box proposals, such as Mask R-CNN, to guide the prediction of instance labels using a Mixture-of-Expert (MoE) approach.

Instance Segmentation Panoptic Segmentation +1

Event-based Vision: A Survey

1 code implementation 17 Apr 2019 Guillermo Gallego, Tobi Delbruck, Garrick Orchard, Chiara Bartolozzi, Brian Taba, Andrea Censi, Stefan Leutenegger, Andrew Davison, Joerg Conradt, Kostas Daniilidis, Davide Scaramuzza

Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur.

Event-based vision

SceneCode: Monocular Dense Semantic Reconstruction using Learned Encoded Scene Representations

no code implementations CVPR 2019 Shuaifeng Zhi, Michael Bloesch, Stefan Leutenegger, Andrew J. Davison

Systems which incrementally create 3D semantic maps from image sequences must store and update representations of both geometry and semantic entities.

X-Section: Cross-Section Prediction for Enhanced RGBD Fusion

no code implementations 3 Mar 2019 Andrea Nicastro, Ronald Clark, Stefan Leutenegger

Detailed 3D reconstruction is an important challenge with application to robotics, augmented and virtual reality, which has seen impressive progress throughout the past years.

3D Reconstruction Object

MID-Fusion: Octree-based Object-Level Multi-Instance Dynamic SLAM

1 code implementation 19 Dec 2018 Binbin Xu, Wenbin Li, Dimos Tzoumanikas, Michael Bloesch, Andrew Davison, Stefan Leutenegger

It can provide robust camera tracking in dynamic environments and at the same time, continuously estimate geometric, semantic, and motion properties for arbitrary objects in the scene.

Instance Segmentation Object +3

LS-Net: Learning to Solve Nonlinear Least Squares for Monocular Stereo

no code implementations ECCV 2018 Ronald Clark, Michael Bloesch, Jan Czarnowski, Stefan Leutenegger, Andrew J. Davison

In this paper, we propose LS-Net, a neural nonlinear least squares optimization algorithm which learns to effectively optimize these cost functions even in the presence of adversities.

InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset

no code implementations 3 Sep 2018 Wenbin Li, Sajad Saeedi, John McCormac, Ronald Clark, Dimos Tzoumanikas, Qing Ye, Yuzhong Huang, Rui Tang, Stefan Leutenegger

Datasets have gained an enormous amount of popularity in the computer vision community, from training and evaluation of Deep Learning-based methods to benchmarking Simultaneous Localization and Mapping (SLAM).

Benchmarking Simultaneous Localization and Mapping

Learning to Solve Nonlinear Least Squares for Monocular Stereo

no code implementations ECCV 2018 Ronald Clark, Michael Bloesch, Jan Czarnowski, Stefan Leutenegger, Andrew J. Davison

In this paper, we propose a neural nonlinear least squares optimization algorithm which learns to effectively optimize these cost functions even in the presence of adversities.

Fusion++: Volumetric Object-Level SLAM

no code implementations 25 Aug 2018 John McCormac, Ronald Clark, Michael Bloesch, Andrew J. Davison, Stefan Leutenegger

Reconstructed objects are stored in an optimisable 6DoF pose graph which is our only persistent map representation.

Loop Closure Detection Object

CodeSLAM — Learning a Compact, Optimisable Representation for Dense Visual SLAM

1 code implementation CVPR 2018 Michael Bloesch, Jan Czarnowski, Ronald Clark, Stefan Leutenegger, Andrew J. Davison

Our approach is suitable for use in a keyframe-based monocular dense SLAM system: While each keyframe with a code can produce a depth map, the code can be optimised efficiently jointly with pose variables and together with the codes of overlapping keyframes to attain global consistency.

CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM

3 code implementations 3 Apr 2018 Michael Bloesch, Jan Czarnowski, Ronald Clark, Stefan Leutenegger, Andrew J. Davison

Our approach is suitable for use in a keyframe-based monocular dense SLAM system: While each keyframe with a code can produce a depth map, the code can be optimised efficiently jointly with pose variables and together with the codes of overlapping keyframes to attain global consistency.

SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-Training on Indoor Segmentation?

no code implementations ICCV 2017 John McCormac, Ankur Handa, Stefan Leutenegger, Andrew J. Davison

We compare the semantic segmentation performance of network weights produced from pre-training on RGB images from our dataset against generic VGG-16 ImageNet weights.

16k Instance Segmentation +7

Semantic Texture for Robust Dense Tracking

no code implementations 29 Aug 2017 Jan Czarnowski, Stefan Leutenegger, Andrew Davison

We argue that robust dense SLAM systems can make valuable use of the layers of features coming from a standard CNN as a pyramid of 'semantic texture' which is suitable for dense alignment while being much more robust to nuisance factors such as lighting than raw RGB values.

SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth

1 code implementation 15 Dec 2016 John McCormac, Ankur Handa, Stefan Leutenegger, Andrew J. Davison

We introduce SceneNet RGB-D, expanding the previous work of SceneNet to enable large scale photorealistic rendering of indoor scene trajectories.

3D Reconstruction Depth Estimation +7

SemanticFusion: Dense 3D Semantic Mapping with Convolutional Neural Networks

no code implementations 16 Sep 2016 John McCormac, Ankur Handa, Andrew Davison, Stefan Leutenegger

This not only produces a useful semantic 3D map, but we also show on the NYUv2 dataset that fusing multiple predictions leads to an improvement even in the 2D semantic labelling over baseline single frame predictions.

Deep Learning a Grasp Function for Grasping under Gripper Pose Uncertainty

no code implementations 7 Aug 2016 Edward Johns, Stefan Leutenegger, Andrew J. Davison

With this, it is possible to achieve grasping robust to the gripper's pose uncertainty, by smoothing the grasp function with the pose uncertainty function.

Simultaneous Optical Flow and Intensity Estimation From an Event Camera

no code implementations CVPR 2016 Patrick Bardow, Andrew J. Davison, Stefan Leutenegger

In a series of examples, we demonstrate the successful operation of our framework, including in situations where conventional cameras heavily suffer from dynamic range limitations or motion blur.

Optical Flow Estimation

Place Recognition with Event-based Cameras and a Neural Implementation of SeqSLAM

no code implementations 18 May 2015 Michael Milford, Hanme Kim, Michael Mangan, Stefan Leutenegger, Tom Stone, Barbara Webb, Andrew Davison

Event-based cameras offer much potential to the fields of robotics and computer vision, in part due to their large dynamic range and extremely high "frame rates".
