no code implementations • 8 Jan 2025 • Boyang Sun, Hanzhi Chen, Stefan Leutenegger, Cesar Cadena, Marc Pollefeys, Hermann Blum
Exploration of unknown environments is crucial for autonomous robots; it allows them to actively reason and decide on what new data to acquire for tasks such as mapping, object discovery, and environmental assessment.
no code implementations • 16 Sep 2024 • Joshua Knights, Sebastián Barbas Laina, Peyman Moghadam, Stefan Leutenegger
This paper proposes SOLVR, a unified pipeline for learning-based LiDAR-Visual re-localisation which performs place recognition and 6-DoF registration across sensor modalities.
1 code implementation • 22 Aug 2024 • Jiaxin Wei, Stefan Leutenegger
In this way, we simultaneously generate a compact 3D Gaussian map with fewer artifacts and a volumetric map on the fly.
no code implementations • 1 Aug 2024 • Philipp Schoch, Fan Yang, Yuntao Ma, Stefan Leutenegger, Marco Hutter, Quentin Leboutet
Current visual navigation systems often treat the environment as static, lacking the ability to adaptively interact with obstacles.
no code implementations • 17 Mar 2024 • Theresa Huber, Simon Schaefer, Stefan Leutenegger
The assumption of a static environment is common in many geometric computer vision tasks like SLAM but limits their applicability in highly dynamic scenes.
no code implementations • 8 Feb 2024 • Hanzhi Chen, Binbin Xu, Stefan Leutenegger
We present FuncGrasp, a framework that can infer dense yet reliable grasp configurations for unseen objects using one annotated object and single-view RGB-D observation via categorical priors.
no code implementations • 20 Dec 2023 • Jens Naumann, Binbin Xu, Stefan Leutenegger, Xingxing Zuo
We introduce a novel monocular visual odometry (VO) system, NeRF-VO, that integrates learning-based sparse visual odometry for low-latency camera tracking and a neural radiance scene representation for fine-detailed dense reconstruction and novel view synthesis.
no code implementations • CVPR 2024 • Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger, Or Litany, Konrad Schindler, Shengyu Huang
We introduce DyNFL, a novel neural field-based approach for high-fidelity re-simulation of LiDAR scans in dynamic driving scenes.
no code implementations • 30 Nov 2023 • Daoyi Gao, Dávid Rozenberszki, Stefan Leutenegger, Angela Dai
We formulate this as a conditional generative task, leveraging diffusion to learn implicit probabilistic models capturing the shape, pose, and scale of CAD objects in an image.
no code implementations • 4 Nov 2023 • Diego Hidalgo-Carvajal, Hanzhi Chen, Gemma C. Bettelani, Jaesug Jung, Melissa Zavaglia, Laura Busse, Abdeldjallil Naceri, Stefan Leutenegger, Sami Haddadin
The progressive prevalence of robots in human-suited environments has given rise to a myriad of object manipulation techniques, in which dexterity plays a paramount role.
2 code implementations • 15 Oct 2023 • Jiaxin Wei, Stefan Leutenegger, Laurent Kneip
Perspective-$n$-Point (P$n$P) stands as a fundamental algorithm for pose estimation in various applications.
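For readers unfamiliar with the P$n$P setting, here is a minimal Direct Linear Transform (DLT) solver in numpy on synthetic, noise-free correspondences. The pose, intrinsics, and points are all made up for the sketch, and this is the textbook DLT baseline, not the learning-based method proposed in the entry above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed ground-truth pose for this synthetic example: a small
# rotation about z and a translation placing points in front of the camera.
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
t_true = np.array([0.1, -0.2, 4.0])
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])  # pinhole intrinsics

# n >= 6 non-coplanar 3D points and their exact pixel projections.
X = rng.uniform(-1, 1, size=(8, 3))
cam = (R_true @ X.T).T + t_true
uv = (K @ cam.T).T
uv = uv[:, :2] / uv[:, 2:3]

def dlt_pnp(X, uv, K):
    """Direct Linear Transform: recover [R|t] from 2D-3D correspondences."""
    # Work in normalised camera coordinates: y ~ [R|t] X_homogeneous.
    y = (np.linalg.inv(K) @ np.column_stack([uv, np.ones(len(uv))]).T).T
    A = []
    for Xw, yi in zip(X, y):
        Xh = np.append(Xw, 1.0)
        # Each correspondence contributes two linear constraints on the
        # 12 entries of the 3x4 pose matrix M.
        A.append(np.concatenate([Xh, np.zeros(4), -yi[0] * Xh]))
        A.append(np.concatenate([np.zeros(4), Xh, -yi[1] * Xh]))
    _, _, Vt = np.linalg.svd(np.asarray(A))
    M = Vt[-1].reshape(3, 4)               # least-squares solution, up to scale
    # Project the left 3x3 block onto SO(3) and recover the scale.
    U, s, Vt2 = np.linalg.svd(M[:, :3])
    R = U @ Vt2
    if np.linalg.det(R) < 0:               # resolve the global sign ambiguity
        R, M = -R, -M
    t = M[:, 3] / s.mean()
    return R, t

R_est, t_est = dlt_pnp(X, uv, K)
print(np.allclose(R_est, R_true, atol=1e-5),
      np.allclose(t_est, t_true, atol=1e-5))  # both True on this noise-free data
```

In practice one would refine the DLT estimate by minimising reprojection error, and use robust variants when correspondences contain outliers.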
1 code implementation • 25 Sep 2023 • Christopher L. Choi, Binbin Xu, Stefan Leutenegger
Visual-Inertial (VI) sensors are popular in robotics, self-driving vehicles, and augmented and virtual reality applications.
no code implementations • 19 Sep 2023 • Simon Schaefer, Dorian F. Henning, Stefan Leutenegger
An accurate and uncertainty-aware 3D human body pose estimation is key to enabling truly safe but efficient human-robot interactions.
no code implementations • 3 Sep 2023 • Dorian F. Henning, Christopher Choi, Simon Schaefer, Stefan Leutenegger
Robust, fast, and accurate human state (6D pose and posture) estimation remains a challenging problem.
no code implementations • 20 Jun 2023 • Anna Penzkofer, Simon Schaefer, Florian Strohm, Mihai Bâce, Stefan Leutenegger, Andreas Bulling
We show that the intentions of human players, i.e. the precursors of goal-oriented decisions, can be robustly predicted from eye gaze, even for the long-horizon sparse-reward task of Montezuma's Revenge, one of the most challenging RL tasks in the Atari 2600 game suite.
no code implementations • 14 Jun 2023 • Yingye Xin, Xingxing Zuo, Dongyue Lu, Stefan Leutenegger
The sparse depth from VIO is first completed by a single-view depth completion network.
no code implementations • 24 May 2023 • Xingxing Zuo, Nan Yang, Nathaniel Merrill, Binbin Xu, Stefan Leutenegger
Incrementally recovering 3D dense structures from monocular videos is of paramount importance since it enables various robotics and AR applications.
no code implementations • 12 Oct 2022 • Yuxuan Xue, Haolong Li, Stefan Leutenegger, Jörg Stückler
Visual reconstruction of fast non-rigid object deformations over time is a challenge for conventional frame-based cameras.
no code implementations • 9 Aug 2022 • Binbin Xu, Andrew J. Davison, Stefan Leutenegger
In this paper, we propose a novel object-level mapping system that can simultaneously segment, track, and reconstruct objects in dynamic scenes.
1 code implementation • 8 Aug 2022 • Yifei Ren, Binbin Xu, Christopher L. Choi, Stefan Leutenegger
In this paper, we present a tightly-coupled visual-inertial object-level multi-instance dynamic SLAM system.
no code implementations • 27 Jul 2022 • Tristan Laidlow, Jan Czarnowski, Andrea Nicastro, Ronald Clark, Stefan Leutenegger
While systems that pass the output of traditional multi-view stereo approaches to a network for regularisation or refinement currently seem to get the best results, it may be preferable to treat deep neural networks as separate components whose results can be probabilistically fused into geometry-based systems.
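As a minimal illustration of probabilistic fusion (my own sketch, not the paper's formulation): if both a multi-view stereo estimate and a network prediction of a pixel's depth are treated as independent Gaussians, the maximum-likelihood fused estimate is the inverse-variance-weighted mean. All numbers below are invented for the example.

```python
def fuse(mu_a, var_a, mu_b, var_b):
    """Fuse two independent Gaussian estimates of the same quantity
    via inverse-variance weighting (the ML/MAP combination)."""
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    mu = (w_a * mu_a + w_b * mu_b) / (w_a + w_b)
    var = 1.0 / (w_a + w_b)          # fused variance is always smaller
    return mu, var

# Example: stereo says 2.0 m (confident), the network says 2.6 m (uncertain).
mu, var = fuse(2.0, 0.01, 2.6, 0.09)
print(mu, var)   # fused depth sits close to the confident estimate: 2.06 m
```

The fused variance is smaller than either input variance, which is what makes folding network predictions into a geometry-based system attractive even when the network alone is noisy.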
no code implementations • 25 Jul 2022 • Tristan Laidlow, Jan Czarnowski, Stefan Leutenegger
While the keypoint-based maps created by sparse monocular simultaneous localisation and mapping (SLAM) systems are useful for camera tracking, dense 3D reconstructions may be desired for many robotic tasks.
no code implementations • 22 Jul 2022 • Tristan Laidlow, Michael Bloesch, Wenbin Li, Stefan Leutenegger
While dense visual SLAM methods are capable of estimating dense reconstructions of the environment, they suffer from a lack of robustness in their tracking step, especially when the optimisation is poorly initialised.
no code implementations • 4 May 2022 • Dorian F. Henning, Tristan Laidlow, Stefan Leutenegger
Through a series of experiments on video sequences of human motion captured by a moving monocular camera, we demonstrate that BodySLAM improves estimates of all human body parameters and camera poses when compared to estimating these separately.
no code implementations • 18 Feb 2022 • Stefan Leutenegger
Robust and accurate state estimation remains a challenge in robotics and Augmented and Virtual Reality (AR/VR), even as Visual-Inertial Simultaneous Localisation and Mapping (VI-SLAM) is becoming commoditised.
no code implementations • ICCV 2021 • Zoe Landgraf, Raluca Scona, Tristan Laidlow, Stephen James, Stefan Leutenegger, Andrew J. Davison
At test time, our model can generate 3D shape and instance segmentation from a single depth view, probabilistically sampling proposals for the occluded region from the learned latent space.
no code implementations • ICCV 2021 • Shuaifeng Zhi, Tristan Laidlow, Stefan Leutenegger, Andrew J. Davison
Semantic labelling is highly correlated with geometry and radiance reconstruction, as scene entities with similar shape and appearance are more likely to come from similar classes.
1 code implementation • 31 Aug 2020 • Binbin Xu, Andrew J. Davison, Stefan Leutenegger
Dense image alignment from RGB-D images remains a critical issue for real-world applications, especially under challenging lighting conditions and in a wide baseline setting.
1 code implementation • CVPR 2020 • Joseph Ortiz, Mark Pupilli, Stefan Leutenegger, Andrew J. Davison
Graph processors such as Graphcore's Intelligence Processing Unit (IPU) are part of a major new wave of computer architectures for AI. Their general design combines massively parallel computation, distributed on-chip memory, and very high inter-core communication bandwidth, which enables breakthrough performance for message-passing algorithms on arbitrary graphs.
no code implementations • 24 Feb 2020 • Zoe Landgraf, Fabian Falck, Michael Bloesch, Stefan Leutenegger, Andrew Davison
Generally capable Spatial AI systems must build persistent scene representations where geometric models are combined with meaningful semantic labels.
no code implementations • 18 Feb 2020 • Ujwal Bonde, Pablo F. Alcantarilla, Stefan Leutenegger
Our approach is distinct from previous works in panoptic segmentation that rely on a combination of a semantic segmentation network with a computationally costly instance segmentation network based on bounding box proposals, such as Mask R-CNN, to guide the prediction of instance labels using a Mixture-of-Expert (MoE) approach.
1 code implementation • 17 Apr 2019 • Guillermo Gallego, Tobi Delbruck, Garrick Orchard, Chiara Bartolozzi, Brian Taba, Andrea Censi, Stefan Leutenegger, Andrew Davison, Joerg Conradt, Kostas Daniilidis, Davide Scaramuzza
Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur.
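To put the quoted dynamic-range figures in linear terms: image sensors conventionally express dynamic range as 20·log10 of the intensity ratio, so the comparison above works out to roughly ten million to one versus a thousand to one.

```python
def db_to_ratio(db):
    """Convert sensor dynamic range in decibels to a linear
    intensity ratio, using the 20*log10 convention for image sensors."""
    return 10 ** (db / 20)

print(db_to_ratio(140))  # 1e7  -> brightest/darkest ~ 10,000,000 : 1 (event camera)
print(db_to_ratio(60))   # 1e3  -> ~ 1,000 : 1 (conventional camera)
```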
no code implementations • CVPR 2019 • Shuaifeng Zhi, Michael Bloesch, Stefan Leutenegger, Andrew J. Davison
Systems which incrementally create 3D semantic maps from image sequences must store and update representations of both geometry and semantic entities.
no code implementations • 3 Mar 2019 • Andrea Nicastro, Ronald Clark, Stefan Leutenegger
Detailed 3D reconstruction is an important challenge with applications in robotics and augmented and virtual reality, and it has seen impressive progress in recent years.
1 code implementation • 19 Dec 2018 • Binbin Xu, Wenbin Li, Dimos Tzoumanikas, Michael Bloesch, Andrew Davison, Stefan Leutenegger
It can provide robust camera tracking in dynamic environments and at the same time, continuously estimate geometric, semantic, and motion properties for arbitrary objects in the scene.
no code implementations • ECCV 2018 • Ronald Clark, Michael Bloesch, Jan Czarnowski, Stefan Leutenegger, Andrew J. Davison
In this paper, we propose LS-Net, a neural nonlinear least squares optimization algorithm which learns to effectively optimize these cost functions even in the presence of adversities.
no code implementations • 3 Sep 2018 • Wenbin Li, Sajad Saeedi, John McCormac, Ronald Clark, Dimos Tzoumanikas, Qing Ye, Yuzhong Huang, Rui Tang, Stefan Leutenegger
Datasets have gained an enormous amount of popularity in the computer vision community, from training and evaluation of Deep Learning-based methods to benchmarking Simultaneous Localization and Mapping (SLAM).
no code implementations • 25 Aug 2018 • John McCormac, Ronald Clark, Michael Bloesch, Andrew J. Davison, Stefan Leutenegger
Reconstructed objects are stored in an optimisable 6DoF pose graph which is our only persistent map representation.
no code implementations • 27 Jul 2018 • Mickey Li, Noyan Songur, Pavel Orlov, Stefan Leutenegger, A. Aldo Faisal
Incorporating the physical environment is essential for a complete understanding of human behavior in unconstrained every-day tasks.
1 code implementation • CVPR 2018 • Michael Bloesch, Jan Czarnowski, Ronald Clark, Stefan Leutenegger, Andrew J. Davison
Our approach is suitable for use in a keyframe-based monocular dense SLAM system: While each keyframe with a code can produce a depth map, the code can be optimised efficiently jointly with pose variables and together with the codes of overlapping keyframes to attain global consistency.
no code implementations • ICCV 2017 • John McCormac, Ankur Handa, Stefan Leutenegger, Andrew J. Davison
We compare the semantic segmentation performance of network weights produced from pre-training on RGB images from our dataset against generic VGG-16 ImageNet weights.
no code implementations • 29 Aug 2017 • Jan Czarnowski, Stefan Leutenegger, Andrew Davison
We argue that robust dense SLAM systems can make valuable use of the layers of features coming from a standard CNN as a pyramid of "semantic texture" which is suitable for dense alignment while being much more robust to nuisance factors such as lighting than raw RGB values.
1 code implementation • 15 Dec 2016 • John McCormac, Ankur Handa, Stefan Leutenegger, Andrew J. Davison
We introduce SceneNet RGB-D, expanding the previous work of SceneNet to enable large scale photorealistic rendering of indoor scene trajectories.
no code implementations • 16 Sep 2016 • John McCormac, Ankur Handa, Andrew Davison, Stefan Leutenegger
This not only produces a useful semantic 3D map, but we also show on the NYUv2 dataset that fusing multiple predictions leads to an improvement even in the 2D semantic labelling over baseline single frame predictions.
no code implementations • 7 Aug 2016 • Edward Johns, Stefan Leutenegger, Andrew J. Davison
With this, it is possible to achieve grasping robust to the gripper's pose uncertainty, by smoothing the grasp function with the pose uncertainty function.
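The smoothing idea above can be sketched in one dimension (an invented toy setup, not the paper's network or data): convolving the grasp-quality function with the pose-uncertainty density gives the expected quality under pose noise, which shifts the optimum away from narrow, fragile grasps toward broad, tolerant ones.

```python
import numpy as np

# Hypothetical grasp quality over a single pose parameter (approach angle):
# a narrow high-quality grasp at 0.5 rad next to a broad, slightly
# lower-quality one at -1.0 rad.
angles = np.linspace(-np.pi, np.pi, 361)
quality = (1.0 * np.exp(-((angles - 0.5) ** 2) / (2 * 0.02 ** 2))
           + 0.8 * np.exp(-((angles + 1.0) ** 2) / (2 * 0.3 ** 2)))

# Pose uncertainty modelled as a zero-mean Gaussian over the same
# parameter (sigma = 0.15 rad is an assumed value for the sketch).
sigma = 0.15
kernel = np.exp(-(angles ** 2) / (2 * sigma ** 2))
kernel /= kernel.sum()

# Expected quality under pose noise = grasp function (*) uncertainty density.
expected = np.convolve(quality, kernel, mode="same")

best_raw = angles[np.argmax(quality)]       # ~0.5: the narrow, fragile peak
best_robust = angles[np.argmax(expected)]   # ~-1.0: the broad, tolerant peak
print(best_raw, best_robust)
```

The narrow peak is taller, but once smoothed by pose uncertainty its expected quality collapses, so the robust choice flips to the broad grasp.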
no code implementations • CVPR 2016 • Patrick Bardow, Andrew J. Davison, Stefan Leutenegger
In a series of examples, we demonstrate the successful operation of our framework, including in situations where conventional cameras heavily suffer from dynamic range limitations or motion blur.
no code implementations • CVPR 2016 • Edward Johns, Stefan Leutenegger, Andrew J. Davison
A multi-view image sequence provides a much richer capacity for object recognition than from a single image.
no code implementations • 18 May 2015 • Michael Milford, Hanme Kim, Michael Mangan, Stefan Leutenegger, Tom Stone, Barbara Webb, Andrew Davison
Event-based cameras offer much potential to the fields of robotics and computer vision, in part due to their large dynamic range and extremely high "frame rates".