no code implementations • 12 May 2025 • Laszlo Szilagyi, Francis Engelmann, Jeannette Bohg
To address this, we introduce SLAG, a multi-GPU framework for language-augmented Gaussian splatting that enhances the speed and scalability of embedding large scenes.
no code implementations • 4 Apr 2025 • Kai Lascheit, Daniel Barath, Marc Pollefeys, Leonidas Guibas, Francis Engelmann
Additionally, we demonstrate that the fitted human mesh can refine body part labels, leading to improved segmentation.
no code implementations • 24 Mar 2025 • Chenyangguang Zhang, Alexandros Delitzas, Fangjinhua Wang, Ruida Zhang, Xiangyang Ji, Marc Pollefeys, Francis Engelmann
We introduce the task of predicting functional 3D scene graphs for real-world indoor environments from posed RGB-D images.
1 code implementation • 17 Oct 2024 • Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum
Further, we push forward the state-of-the-art performance on the ScanNet and ScanNet200 datasets with prevalent 3D semantic segmentation models, demonstrating the efficacy of our generated dataset.
Ranked #3 on Semantic Segmentation on ScanNet (using extra training data)
no code implementations • 27 Sep 2024 • Ayca Takmaz, Alexandros Delitzas, Robert W. Sumner, Francis Engelmann, Johanna Wald, Federico Tombari
Existing methods for open-vocabulary 3D instance segmentation primarily focus on identifying object-level instances but struggle with finer-grained scene entities such as object parts, or regions described by generic attributes.
no code implementations • 29 Aug 2024 • Mathias Vogel, Keisuke Tateno, Marc Pollefeys, Federico Tombari, Marie-Julie Rakotosaona, Francis Engelmann
In this work, we tackle the task of point cloud denoising through a novel framework that adapts Diffusion Schrödinger bridges to point clouds.
no code implementations • 29 Jul 2024 • Yuanwen Yue, Anurag Das, Francis Engelmann, Siyu Tang, Jan Eric Lenssen
In this work, we show that fine-tuning on 3D-aware data improves the quality of emerging semantic features.
no code implementations • 30 May 2024 • Gonca Yilmaz, Songyou Peng, Marc Pollefeys, Francis Engelmann, Hermann Blum
However, this flexibility comes with a trade-off: fully-supervised closed-set methods still outperform OVS methods on base classes, that is, on classes on which they have been explicitly trained.
Ranked #8 on Open Vocabulary Semantic Segmentation on ADE20K-150
3D Instance Segmentation
3D Open-Vocabulary Instance Segmentation
no code implementations • 18 Apr 2024 • Oliver Lemke, Zuria Bauer, René Zurbrügg, Marc Pollefeys, Francis Engelmann, Hermann Blum
This allows for accurate detection directly in 3D scenes, object- and environment-aware grasp prediction, as well as robust and repeatable robotic manipulation.
no code implementations • 4 Apr 2024 • Francis Engelmann, Fabian Manhardt, Michael Niemeyer, Keisuke Tateno, Marc Pollefeys, Federico Tombari
Our OpenNeRF further leverages NeRF's ability to render novel views and extract open-set VLM features from areas that are not well observed in the initial posed images.
no code implementations • 30 Mar 2024 • Yang Miao, Francis Engelmann, Olga Vysotska, Federico Tombari, Marc Pollefeys, Dániel Béla Baráth
We introduce a novel problem, i.e., the localization of an input image within a multi-modal reference map represented by a database of 3D scene graphs.
no code implementations • 23 Feb 2024 • Francis Engelmann, Ayca Takmaz, Jonas Schult, Elisabetta Fedele, Johanna Wald, Songyou Peng, Xi Wang, Or Litany, Siyu Tang, Federico Tombari, Marc Pollefeys, Leonidas Guibas, Hongbo Tian, Chunjie Wang, Xiaosheng Yan, Bingwen Wang, Xuanyang Zhang, Xiao Liu, Phuc Nguyen, Khoi Nguyen, Anh Tran, Cuong Pham, Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao, Lei Zhu, Joan Lasenby
This report provides an overview of the challenge hosted at the OpenSUN3D Workshop on Open-Vocabulary 3D Scene Understanding held in conjunction with ICCV 2023.
1 code implementation • 18 Jan 2024 • René Zurbrügg, Yifan Liu, Francis Engelmann, Suryansh Kumar, Marco Hutter, Vaishakh Patil, Fisher Yu
Executing a successful grasp in a cluttered environment requires multiple levels of scene understanding: First, the robot needs to analyze the geometric properties of individual objects to find feasible grasps.
no code implementations • CVPR 2024 • Alexandros Delitzas, Ayca Takmaz, Federico Tombari, Robert Sumner, Marc Pollefeys, Francis Engelmann
Existing 3D scene understanding methods are heavily focused on 3D semantic and instance segmentation.
no code implementations • 28 Dec 2023 • Rui Huang, Songyou Peng, Ayca Takmaz, Federico Tombari, Marc Pollefeys, Shiji Song, Gao Huang, Francis Engelmann
Therefore, we explore the use of image segmentation foundation models to automatically generate training labels for 3D segmentation.
no code implementations • 29 Nov 2023 • Silvan Weder, Francis Engelmann, Johannes L. Schönberger, Akihito Seki, Marc Pollefeys, Martin R. Oswald
Using these main contributions, our method can enable scenarios with real-time constraints and can scale to arbitrary scene sizes by processing and updating the scene only in a local region defined by the new measurement.
1 code implementation • 20 Nov 2023 • Silvan Weder, Hermann Blum, Francis Engelmann, Marc Pollefeys
Semantic annotations are indispensable to train or evaluate perception models, yet very costly to acquire.
Ranked #1 on Semantic Segmentation on Replica
1 code implementation • 1 Jun 2023 • Yuanwen Yue, Sabarinath Mahadevan, Jonas Schult, Francis Engelmann, Bastian Leibe, Konrad Schindler, Theodora Kontogianni
In an iterative process, the model assigns each data point to an object (or the background), while the user corrects errors in the resulting segmentation and feeds them back into the model.
no code implementations • ICCV 2023 • Ayça Takmaz, Jonas Schult, Irem Kaftan, Mertcan Akçay, Bastian Leibe, Robert Sumner, Francis Engelmann, Siyu Tang
We address this challenge and propose a framework for generating training data of synthetic humans interacting with real 3D scenes.
1 code implementation • CVPR 2023 • Yuanwen Yue, Theodora Kontogianni, Konrad Schindler, Francis Engelmann
Instead, we formulate floorplan reconstruction as a single-stage structured prediction task: find a variable-size set of polygons, which in turn are variable-length sequences of ordered vertices.
1 code implementation • 6 Oct 2022 • Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe
Modern 3D semantic instance segmentation approaches predominantly rely on specialized voting mechanisms followed by carefully designed geometric clustering techniques.
Ranked #2 on 3D Instance Segmentation on STPLS3D
3D Instance Segmentation
3D Semantic Instance Segmentation
1 code implementation • 29 Sep 2022 • Lars Kreuzberg, Idil Esen Zulfikar, Sabarinath Mahadevan, Francis Engelmann, Bastian Leibe
Our voting-based tracklet generation method followed by geometric feature-based aggregation generates significantly improved panoptic LiDAR segmentation quality when compared to modeling the entire 4D volume using Gaussian probability distributions.
Ranked #4 on 4D Panoptic Segmentation on SemanticKITTI
no code implementations • 2 Jun 2022 • Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll
Indeed, we show that it is possible to train dense segmentation models using only bounding box labels.
3D Instance Segmentation
3D Semantic Instance Segmentation
4 code implementations • 5 Oct 2021 • Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann
Since scene context helps reasoning about object semantics, current works focus on models with large capacity and receptive fields that can fully capture the global context of an input 3D scene.
Ranked #26 on Semantic Segmentation on ScanNet
no code implementations • CVPR 2021 • Francis Engelmann, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
1 code implementation • CVPR 2020 • Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Niessner
We show that grouping proposals improves over NMS and outperforms previous state-of-the-art methods on the tasks of 3D object detection and semantic instance segmentation on the ScanNetV2 benchmark and the S3DIS dataset.
1 code implementation • 2 May 2020 • Francis Engelmann, Jörg Stückler, Bastian Leibe
In this paper, we propose to use 3D shape and motion priors to regularize the estimation of the trajectory and the shape of vehicles in sequences of stereo images.
1 code implementation • CVPR 2020 • Jonas Schult, Francis Engelmann, Theodora Kontogianni, Bastian Leibe
That is, the convolutional kernel weights are mapped to the local surface of a given mesh.
1 code implementation • 30 Mar 2020 • Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner
We show that grouping proposals improves over NMS and outperforms previous state-of-the-art methods on the tasks of 3D object detection and semantic instance segmentation on the ScanNetV2 benchmark and the S3DIS dataset.
Ranked #1 on 3D Semantic Instance Segmentation on ScanNetV2
1 code implementation • 28 Jul 2019 • Francis Engelmann, Theodora Kontogianni, Bastian Leibe
In a thorough ablation study, we show that the receptive field size is directly related to the performance of 3D point cloud processing tasks, including semantic segmentation and object classification.
Ranked #51 on Semantic Segmentation on S3DIS Area5
no code implementations • 3 Apr 2019 • Cathrin Elich, Francis Engelmann, Theodora Kontogianni, Bastian Leibe
A lot of progress was made in the field of object classification and semantic segmentation.
Ranked #4 on 3D Semantic Instance Segmentation on ScanNetV2
3D Instance Segmentation
3D Semantic Instance Segmentation
no code implementations • 2 Oct 2018 • Francis Engelmann, Theodora Kontogianni, Jonas Schult, Bastian Leibe
In this paper, we present a deep learning architecture which addresses the problem of 3D semantic segmentation of unstructured point clouds.
1 code implementation • 5 Feb 2018 • Francis Engelmann, Theodora Kontogianni, Alexander Hermans, Bastian Leibe
The recently proposed PointNet architecture presents an interesting step ahead in that it can operate on unstructured point clouds, achieving encouraging segmentation results.
no code implementations • 7 Feb 2017 • Anton Kasyanov, Francis Engelmann, Jörg Stückler, Bastian Leibe
Our visual-inertial SLAM system is based on a real-time capable visual-inertial odometry method that provides locally consistent trajectory and map estimates.