Search Results for author: Benjamin Busam

Found 69 papers, 18 papers with code

EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion

1 code implementation 2 May 2024 Guangyao Zhai, Evin Pınar Örnek, Dave Zhenyu Chen, Ruotong Liao, Yan Di, Nassir Navab, Federico Tombari, Benjamin Busam

The scheme ensures that the denoising processes are influenced by a holistic understanding of the scene graph, facilitating the generation of globally coherent scenes.

3D Object Retrieval Denoising +2

FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos

no code implementations 18 Mar 2024 Florian Philipp Stilz, Mert Asim Karaoglu, Felix Tristram, Nassir Navab, Benjamin Busam, Alexander Ladikos

However, the setup has been restricted to a static endoscope, limited deformation, or required an external tracking device to retrieve camera pose information of the endoscopic camera.

Neural Rendering Novel View Synthesis

Deformable 3D Gaussian Splatting for Animatable Human Avatars

no code implementations 22 Dec 2023 HyunJun Jung, Nikolas Brasch, Jifei Song, Eduardo Perez-Pellitero, Yiren Zhou, Zhihao LI, Nassir Navab, Benjamin Busam

ParDy-Human introduces parameter-driven dynamics into 3D Gaussian Splatting where 3D Gaussians are deformed by a human pose model to animate the avatar.

Novel View Synthesis

Reality's Canvas, Language's Brush: Crafting 3D Avatars from Monocular Video

no code implementations 8 Dec 2023 Yuchen Rao, Eduardo Perez Pellitero, Benjamin Busam, Yiren Zhou, Jifei Song

A pose-conditioned deformable NeRF is optimized to volumetrically represent a human subject in canonical T-pose.

Image Generation

S2P3: Self-Supervised Polarimetric Pose Prediction

no code implementations 2 Dec 2023 Patrick Ruhkamp, Daoyi Gao, Nassir Navab, Benjamin Busam

The novel training paradigm comprises 1) a physical model to extract geometric information of polarized light, 2) a teacher-student knowledge distillation scheme and 3) a self-supervised loss formulation through differentiable rendering and an invertible physical constraint.

Knowledge Distillation Pose Prediction

RaDialog: A Large Vision-Language Model for Radiology Report Generation and Conversational Assistance

1 code implementation 30 Nov 2023 Chantal Pellegrini, Ege Özsoy, Benjamin Busam, Nassir Navab, Matthias Keicher

Conversational AI tools that can generate and discuss clinically correct radiology reports for a given medical image have the potential to transform radiology.

Language Modelling Large Language Model

SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation

1 code implementation CVPR 2024 Yamei Chen, Yan Di, Guangyao Zhai, Fabian Manhardt, Chenyangguang Zhang, Ruida Zhang, Federico Tombari, Nassir Navab, Benjamin Busam

Leveraging the advantage of DINOv2 in providing SE(3)-consistent semantic features, we hierarchically extract two types of SE(3)-invariant geometric features to further encapsulate local-to-global object-specific information.

Object Pose Estimation

SG-Bot: Object Rearrangement via Coarse-to-Fine Robotic Imagination on Scene Graphs

no code implementations 21 Sep 2023 Guangyao Zhai, Xiaoni Cai, Dianye Huang, Yan Di, Fabian Manhardt, Federico Tombari, Nassir Navab, Benjamin Busam

In this paper, we present SG-Bot, a novel rearrangement framework that utilizes a coarse-to-fine scheme with a scene graph as the scene representation.

RIDE: Self-Supervised Learning of Rotation-Equivariant Keypoint Detection and Invariant Description for Endoscopy

no code implementations 18 Sep 2023 Mert Asim Karaoglu, Viktoria Markova, Nassir Navab, Benjamin Busam, Alexander Ladikos

While most classical methods achieve rotation-equivariant detection and invariant description by design, many learning-based approaches learn to be robust only up to a certain degree.

Keypoint Detection Self-Supervised Learning

On the Localization of Ultrasound Image Slices within Point Distribution Models

1 code implementation 1 Sep 2023 Lennart Bastian, Vincent Bürgin, Ha Young Kim, Alexander Baumann, Benjamin Busam, Mahdi Saleh, Nassir Navab

We demonstrate that our multi-modal registration framework can localize images on the 3D surface topology of a patient-specific organ and the mean shape of an SSM.

3D Reconstruction 3D Shape Representation +2

3D Adversarial Augmentations for Robust Out-of-Domain Predictions

no code implementations 29 Aug 2023 Alexander Lehner, Stefano Gasperini, Alvaro Marcos-Ramiro, Michael Schmidt, Nassir Navab, Benjamin Busam, Federico Tombari

We conduct extensive experiments across a variety of scenarios on data from KITTI, Waymo, and CrashD for 3D object detection, and on data from SemanticKITTI, Waymo, and nuScenes for 3D semantic segmentation.

3D Object Detection 3D Semantic Segmentation +2

Polarimetric Information for Multi-Modal 6D Pose Estimation of Photometrically Challenging Objects with Limited Data

no code implementations 21 Aug 2023 Patrick Ruhkamp, Daoyi Gao, HyunJun Jung, Nassir Navab, Benjamin Busam

6D pose estimation pipelines that rely on RGB-only or RGB-D data show limitations for photometrically challenging objects with, e.g., textureless surfaces, reflections or transparency.

6D Pose Estimation

Multi-Modal Dataset Acquisition for Photometrically Challenging Object

no code implementations 21 Aug 2023 HyunJun Jung, Patrick Ruhkamp, Nassir Navab, Benjamin Busam

This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.


CCD-3DR: Consistent Conditioning in Diffusion for Single-Image 3D Reconstruction

no code implementations 15 Aug 2023 Yan Di, Chenyangguang Zhang, Pengyuan Wang, Guangyao Zhai, Ruida Zhang, Fabian Manhardt, Benjamin Busam, Xiangyang Ji, Federico Tombari

However, such strategies fail to consistently align the denoised point cloud with the given image, leading to unstable conditioning and inferior performance.

3D Reconstruction

DisguisOR: Holistic Face Anonymization for the Operating Room

1 code implementation 26 Jul 2023 Lennart Bastian, Tony Danjun Wang, Tobias Czempiel, Benjamin Busam, Nassir Navab

Methods: RGB and depth images from multiple cameras are fused into a 3D point cloud representation of the scene.

Face Anonymization

S3M: Scalable Statistical Shape Modeling through Unsupervised Correspondences

1 code implementation 15 Apr 2023 Lennart Bastian, Alexander Baumann, Emily Hoppe, Vincent Bürgin, Ha Young Kim, Mahdi Saleh, Benjamin Busam, Nassir Navab

Statistical shape models (SSMs) are an established way to represent the anatomy of a population with various clinically relevant applications.


Location-Free Scene Graph Generation

no code implementations 20 Mar 2023 Ege Özsoy, Felix Holm, Tobias Czempiel, Nassir Navab, Benjamin Busam

Although using significantly fewer labels during training, we achieve 74.12% of the location-supervised SOTA performance on Visual Genome and even outperform the best method on 4D-OR.

Graph Generation Scene Graph Generation

Rotation-Invariant Transformer for Point Cloud Matching

1 code implementation CVPR 2023 Hao Yu, Zheng Qin, Ji Hou, Mahdi Saleh, Dongsheng Li, Benjamin Busam, Slobodan Ilic

To this end, we introduce RoITr, a Rotation-Invariant Transformer to cope with the pose variations in the point cloud matching task.

Data Augmentation Decoder

Ultra-NeRF: Neural Radiance Fields for Ultrasound Imaging

1 code implementation 25 Jan 2023 Magdalena Wysocki, Mohammad Farid Azampour, Christine Eilers, Benjamin Busam, Mehrdad Salehi, Nassir Navab

In our work, we discuss direction-dependent changes in the scene and show that a physics-inspired rendering improves the fidelity of US image synthesis.

Image Generation Neural Rendering

TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation

no code implementations CVPR 2023 Hanzhi Chen, Fabian Manhardt, Nassir Navab, Benjamin Busam

In this paper, we introduce neural texture learning for 6D object pose estimation from synthetic data and a few unlabelled real images.

6D Pose Estimation using RGB

OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object Detection

no code implementations 2 Nov 2022 Yongzhi Su, Yan Di, Fabian Manhardt, Guangyao Zhai, Jason Rambach, Benjamin Busam, Didier Stricker, Federico Tombari

Despite monocular 3D object detection having recently made a significant leap forward thanks to the use of pre-trained depth estimators for pseudo-LiDAR recovery, such two-stage methods typically suffer from overfitting and are incapable of explicitly encapsulating the geometric relation between depth and object bounding box.

Monocular 3D Object Detection Object +1

RIGA: Rotation-Invariant and Globally-Aware Descriptors for Point Cloud Registration

1 code implementation 27 Sep 2022 Hao Yu, Ji Hou, Zheng Qin, Mahdi Saleh, Ivan Shugurov, Kai Wang, Benjamin Busam, Slobodan Ilic

More specifically, 3D structures of the whole frame are first represented by our global PPF signatures, from which structural descriptors are learned to help geometric descriptors sense the 3D world beyond local regions.

Point Cloud Registration

Segmenting Known Objects and Unseen Unknowns without Prior Knowledge

no code implementations ICCV 2023 Stefano Gasperini, Alvaro Marcos-Ramiro, Michael Schmidt, Nassir Navab, Benjamin Busam, Federico Tombari

By doing so, for the first time in panoptic segmentation with unknown objects, our U3HS is trained without unknown categories, reducing assumptions and leaving the settings as unconstrained as in real-life scenarios.

Panoptic Segmentation Scene Understanding +1

Disentangling 3D Attributes from a Single 2D Image: Human Pose, Shape and Garment

no code implementations 5 Aug 2022 Xue Hu, Xinghui Li, Benjamin Busam, Yiren Zhou, Ales Leonardis, Shanxin Yuan

Specifically, we focus on human appearance and learn implicit pose, shape and garment representations of dressed humans from RGB images.

3D Reconstruction Decoder +1

CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning

no code implementations 31 Jul 2022 Mahdi Saleh, Yige Wang, Nassir Navab, Benjamin Busam, Federico Tombari

The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods while requiring significantly fewer computations.

Scene Segmentation Segmentation

DA$^2$ Dataset: Toward Dexterity-Aware Dual-Arm Grasping

no code implementations 31 Jul 2022 Guangyao Zhai, Yu Zheng, Ziwei Xu, Xin Kong, Yong liu, Benjamin Busam, Yi Ren, Nassir Navab, Zhengyou Zhang

In this paper, we introduce DA$^2$, the first large-scale dual-arm dexterity-aware dataset for the generation of optimal bimanual grasping pairs for arbitrary large objects.

BFS-Net: Weakly Supervised Cell Instance Segmentation from Bright-Field Microscopy Z-Stacks

no code implementations 9 Jun 2022 Shervin Dehghani, Benjamin Busam, Nassir Navab, Ali Nasseri

Despite its broad availability, volumetric information acquisition from Bright-Field Microscopy (BFM) is inherently difficult due to the projective nature of the acquisition process.

Instance Segmentation Semantic Segmentation

OSOP: A Multi-Stage One Shot Object Pose Estimation Framework

no code implementations CVPR 2022 Ivan Shugurov, Fu Li, Benjamin Busam, Slobodan Ilic

We present a novel one-shot method for object detection and 6 DoF pose estimation, that does not require training on target objects.

Object object-detection +2

CroMo: Cross-Modal Learning for Monocular Depth Estimation

no code implementations CVPR 2022 Yannick Verdié, Jifei Song, Barnabé Mas, Benjamin Busam, Aleš Leonardis, Steven McDonagh

Learning-based depth estimation has witnessed recent progress in multiple directions; from self-supervision using monocular video to supervised methods offering highest accuracy.

Monocular Depth Estimation

Know your sensORs -- A Modality Study For Surgical Action Classification

no code implementations 16 Mar 2022 Lennart Bastian, Tobias Czempiel, Christian Heiliger, Konrad Karcz, Ulrich Eck, Benjamin Busam, Nassir Navab

Existing datasets from operating room (OR) cameras are thus far limited in size or modalities acquired, leaving it unclear which sensor modalities are best suited for tasks such as recognizing surgical action from videos.

Action Classification Action Recognition +1

Wild ToFu: Improving Range and Quality of Indirect Time-of-Flight Depth with RGB Fusion in Challenging Environments

no code implementations 7 Dec 2021 HyunJun Jung, Nikolas Brasch, Ales Leonardis, Nassir Navab, Benjamin Busam

Indirect Time-of-Flight (I-ToF) imaging is a widespread way of depth estimation for mobile devices due to its small size and affordable price.

Depth Estimation Depth Prediction

R4Dyn: Exploring Radar for Self-Supervised Monocular Depth Estimation of Dynamic Scenes

no code implementations 10 Aug 2021 Stefano Gasperini, Patrick Koch, Vinzenz Dallabetta, Nassir Navab, Benjamin Busam, Federico Tombari

While self-supervised monocular depth estimation in driving scenarios has achieved comparable performance to supervised approaches, violations of the static world assumption can still lead to erroneous depth predictions of traffic participants, posing a potential safety issue.

Autonomous Vehicles Monocular Depth Estimation

OperA: Attention-Regularized Transformers for Surgical Phase Recognition

no code implementations 5 Mar 2021 Tobias Czempiel, Magdalini Paschali, Daniel Ostler, Seong Tae Kim, Benjamin Busam, Nassir Navab

In this paper we introduce OperA, a transformer-based model that accurately predicts surgical phases from long video sequences.

Surgical phase recognition

I Like to Move It: 6D Pose Estimation as an Action Decision Process

no code implementations 26 Sep 2020 Benjamin Busam, Hyun Jun Jung, Nassir Navab

We change this paradigm and reformulate the problem as an action decision process where an initial pose is updated in incremental discrete steps that sequentially move a virtual 3D rendering towards the correct solution.

6D Pose Estimation Object +3

DynaMiTe: A Dynamic Local Motion Model with Temporal Constraints for Robust Real-Time Feature Matching

no code implementations 31 Jul 2020 Patrick Ruhkamp, Ruiqi Gong, Nassir Navab, Benjamin Busam

Feature based visual odometry and SLAM methods require accurate and fast correspondence matching between consecutive image frames for precise camera pose estimation in real-time.

Descriptive Pose Estimation +1

HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning

1 code implementation 12 May 2020 Axel Barroso-Laguna, Yannick Verdie, Benjamin Busam, Krystian Mikolajczyk

Local feature extraction remains an active research area due to the advances in fields such as SLAM, 3D reconstructions, or AR applications.

3D Reconstruction

A Multi-Hypothesis Approach to Color Constancy

1 code implementation CVPR 2020 Daniel Hernandez-Juarez, Sarah Parisot, Benjamin Busam, Ales Leonardis, Gregory Slabaugh, Steven McDonagh

Firstly, we select a set of candidate scene illuminants in a data-driven fashion and apply them to a target image to generate a set of corrected images.

Color Constancy

SteReFo: Efficient Image Refocusing with Stereo Vision

no code implementations 29 Sep 2019 Benjamin Busam, Matthieu Hog, Steven McDonagh, Gregory Slabaugh

Whether to attract viewer attention to a particular object, give the impression of depth or simply reproduce human-like scene perception, shallow depth of field images are used extensively by professional and amateur photographers alike.

Depth Estimation

Generic Primitive Detection in Point Clouds Using Novel Minimal Quadric Fits

no code implementations 4 Jan 2019 Tolga Birdal, Benjamin Busam, Nassir Navab, Slobodan Ilic, Peter Sturm

Based upon the idea of aligning the quadric gradients with the surface normals, our first formulation is exact and requires as few as four oriented points.

Explaining the Ambiguity of Object Detection and 6D Pose From Visual Data

no code implementations ICCV 2019 Fabian Manhardt, Diego Martin Arroyo, Christian Rupprecht, Benjamin Busam, Tolga Birdal, Nassir Navab, Federico Tombari

For each object instance we predict multiple pose and class outcomes to estimate the specific pose distribution generated by symmetries and repetitive textures.

3D Object Detection Object +3

A Minimalist Approach to Type-Agnostic Detection of Quadrics in Point Clouds

no code implementations CVPR 2018 Tolga Birdal, Benjamin Busam, Nassir Navab, Slobodan Ilic, Peter Sturm

As opposed to state-of-the-art, where a tailored algorithm treats each primitive type separately, we propose to encapsulate all types in a single robust detection procedure.

Scene Understanding

Camera Pose Filtering with Local Regression Geodesics on the Riemannian Manifold of Dual Quaternions

no code implementations 24 Apr 2017 Benjamin Busam, Tolga Birdal, Nassir Navab

Time-varying, smooth trajectory estimation is of great interest to the vision community for accurate and well behaving 3D systems.

Pose Tracking regression +1
