Conversational AI tools that can generate and discuss clinically correct radiology reports for a given medical image have the potential to transform radiology.
no code implementations • 20 Nov 2023 • Mayar Lotfy, Anna Alperovich, Tommaso Giannantonio, Bjorn Barz, Xiaohan Zhang, Felix Holm, Nassir Navab, Felix Boehm, Carolin Schwamborn, Thomas K. Hoffmann, Patrick J. Schuler
Despite the limited dataset, the GNN-based model significantly outperforms context-agnostic approaches, accurately distinguishing between healthy and tumor tissues, even in images from previously unseen patients.
These geometric features are then point-aligned with DINOv2 features to establish a consistent object representation under SE(3) transformations, facilitating the mapping from camera space to the pre-defined canonical space, thus further enhancing pose estimation.
Robotic ophthalmic surgery is an emerging technology to facilitate high-precision interventions such as retina penetration in subretinal injection and removal of floating tissues in retinal detachment depending on the input imaging modalities such as microscopy and intraoperative OCT (iOCT).
Creating high-quality view synthesis is essential for immersive applications but continues to be problematic, particularly in indoor environments and for real-time deployment.
We take advantage of the outer part of the masked area, as it has a direct correlation with the context of the scene.
To date, endovascular surgeries are performed using the gold standard of fluoroscopy, which uses ionising radiation to visualise catheters and vasculature.
Surgical videos captured from microscopic or endoscopic imaging devices are rich but complex sources of information, depicting different tools and anatomical structures utilized over an extended period of time.
In this paper, we present SG-Bot, a novel rearrangement framework that utilizes a coarse-to-fine scheme with a scene graph as the scene representation.
While most classical methods achieve rotation-equivariant detection and invariant description by design, many learning-based approaches learn to be robust only up to a certain degree.
Dynamic reconstruction with neural radiance fields (NeRF) requires accurate camera poses.
In this work, we propose the first precise hand-object reconstruction method in hyperbolic space, namely Dynamic Hyperbolic Attention Network (DHANet), which leverages intrinsic properties of hyperbolic space to learn representative features.
Light-sheet fluorescence microscopy (LSFM), a planar illumination technique that enables high-resolution imaging of samples, experiences defocused image quality caused by light scattering when photons propagate through thick tissues.
We demonstrate that our multi-modal registration framework can localize images on the 3D surface topology of a patient-specific organ and the mean shape of an SSM.
We conduct extensive experiments across a variety of scenarios on data from KITTI, Waymo, and CrashD for 3D object detection, and on data from SemanticKITTI, Waymo, and nuScenes for 3D semantic segmentation.
Here, we propose a rehearsal-based continual learning approach for class incremental and domain incremental scenarios in white blood cell classification.
To that end, we train multiple MIL models using different levels of sex imbalance in the training set and excluding certain age groups.
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
6D pose estimation pipelines that rely on RGB-only or RGB-D data show limitations for photometrically challenging objects with, e.g., textureless surfaces, reflections or transparency.
Therefore, we propose to give the confidence maps as additional information to the networks.
While state-of-the-art monocular depth estimation approaches achieve impressive results in ideal settings, they are highly unreliable under challenging illumination and weather conditions, such as at nighttime or in the presence of rain.
Leveraging this, we introduce DISBELIEVE, a local model poisoning attack that creates malicious parameters or gradients such that their distance to benign clients' parameters or gradients, respectively, is low while their adverse effect on the global model's performance is high.
Accurate catheter tracking is crucial during minimally invasive endovascular procedures (MIEP), and electromagnetic (EM) tracking is a widely used technology that serves this purpose.
The recovery of morphologically accurate anatomical images from deformed ones is challenging in ultrasound (US) image acquisition, but crucial to accurate and consistent diagnosis, particularly in the emerging field of computer-assisted diagnosis.
Anatomical segmentation of organs in ultrasound images is essential to many clinical applications, particularly for diagnosis and monitoring.
SurgVLP constructs a new contrastive learning objective to align video clip embeddings with the corresponding multiple text embeddings by bringing them together within a joint latent space.
Methods: RGB and depth images from multiple cameras are fused into a 3D point cloud representation of the scene.
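A minimal sketch of the back-projection step behind this kind of fusion, assuming a pinhole camera model with known intrinsics and a known camera-to-world rigid transform per camera. All names here (`backproject`, `to_world`, `fuse`, and the dictionary keys) are illustrative assumptions and are not taken from the paper.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with metric depth to a 3D point in the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def to_world(point, rotation, translation):
    """Apply p_world = R * p_cam + t, with R given as a 3x3 nested list."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def fuse(cameras):
    """Merge depth pixels from several calibrated cameras into one point list.

    Each camera is a dict (hypothetical layout) with keys:
      'pixels'     -> list of (u, v, depth) samples
      'intrinsics' -> (fx, fy, cx, cy)
      'R', 't'     -> camera-to-world rotation and translation
    """
    cloud = []
    for cam in cameras:
        fx, fy, cx, cy = cam["intrinsics"]
        for u, v, d in cam["pixels"]:
            if d <= 0:
                continue  # skip invalid or missing depth readings
            p_cam = backproject(u, v, d, fx, fy, cx, cy)
            cloud.append(to_world(p_cam, cam["R"], cam["t"]))
    return cloud
```

Colour values could be carried alongside each point in the same loop; they are omitted here to keep the sketch short.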
Our method is several orders of magnitude faster than local patch-based metrics and can be directly applied in clinical settings by replacing the similarity measure with the proposed one.
However, there is limited research on automating structured reporting, and no public benchmark is available for evaluating and comparing different methods.
Ranked #1 on Structured Report Generation on Rad-ReStruct
The results demonstrated that the proposed advanced framework can robustly work on a variety of seen and unseen phantoms as well as in-vivo human carotid data.
To address this challenge, a graph-based non-rigid registration is proposed to enable transferring planned paths from the atlas to the current setup by explicitly considering subcutaneous bone surface features instead of the skin surface.
To validate the proposed robotic US system for imaging arteries, experiments are carried out on volunteers' carotid and radial arteries.
This work intends to, first, propose a robust inpainting model to learn the details of healthy anatomies and reconstruct high-resolution images by preserving anatomical constraints.
Dynamic positron emission tomography imaging (dPET) provides temporally resolved images of a tracer enabling a quantitative measure of physiological processes.
The experimental results demonstrate that the proposed approach with the re-identification process can significantly improve the accuracy and robustness of the segmentation results (Dice score: from 0.54 to 0.86; intersection over union: from 0.47 to 0.78).
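For reference, the two overlap metrics reported here can be computed as in the sketch below. It assumes binary masks given as flat sequences of 0/1 values; the function names `dice_score` and `iou_score` and the convention of returning 1.0 for two empty masks are illustrative choices, not details from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A n B| / (|A| + |B|) for flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly (assumed convention).
    return 1.0 if total == 0 else 2.0 * intersection / total


def iou_score(pred, target):
    """Intersection over union |A n B| / |A u B| for flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - intersection
    return 1.0 if union == 0 else intersection / union
```

The two metrics are monotonically related (Dice = 2·IoU / (1 + IoU)), which is why they tend to rise together in results such as the ones above.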
Autonomous ultrasound (US) scanning has attracted increased attention, and it has been seen as a potential solution to overcome the limitations of conventional US examinations, such as inter-operator variations.
A particular focus in computer-assisted surgery is to replace marker-based tracking systems for instrument localization with pure image-based 6DoF pose estimation.
Although the preservation of shape continuity and physiological anatomy is a natural assumption in the segmentation of medical images, it is often neglected by deep learning methods that mostly aim for the statistical modeling of input data as pixels rather than interconnected structures.
Although purely transformer-based architectures showed promising performance in many computer vision tasks, many hybrid models consisting of CNN and transformer blocks are introduced to fit more specialized tasks.
To address this limitation, we propose a novel guidance approach for the sampling process in the diffusion model that leverages bounding box and segmentation map information at inference time without additional training data.
Statistical shape models (SSMs) are an established way to represent the anatomy of a population with various clinically relevant applications.
In this work, we investigate the need for endoscopy domain-specific pretraining based on downstream objectives.
1 code implementation • • HyunJun Jung, Patrick Ruhkamp, Guangyao Zhai, Nikolas Brasch, Yitong Li, Yannick Verdie, Jifei Song, Yiren Zhou, Anil Armagan, Slobodan Ilic, Ales Leonardis, Nassir Navab, Benjamin Busam
Learning-based methods to solve dense 3D vision problems typically train on 3D sensor data.
The extraction of structured clinical information from free-text radiology reports in the form of radiology graphs has been demonstrated to be a valuable approach for evaluating the clinical correctness of report-generation methods.
The holistic representation of surgical scenes as semantic scene graphs (SGG), where entities are represented as nodes and relations between them as edges, is a promising direction for fine-grained semantic OR understanding.
Ranked #2 on Scene Graph Generation on 4D-OR
Automated diagnosis prediction from medical images is a valuable resource to support clinical decision-making.
We validate the generalizability of the proposed domain-independent segmentation approach on several datasets with varying parameters and machines.
no code implementations • 21 Mar 2023 • Matthias Keicher, Matan Atad, David Schinz, Alexandra S. Gersing, Sarah C. Foreman, Sophia S. Goller, Juergen Weissinger, Jon Rischewski, Anna-Sophia Dietrich, Benedikt Wiestler, Jan S. Kirschke, Nassir Navab
We then regress the severity of the fracture as a function of the distance to this hyperplane, calibrating the results to the Genant scale.
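The distance-to-hyperplane quantity can be sketched as follows, assuming a linear decision boundary w·x + b = 0 in some feature space. The names `signed_distance`, `w`, and `b` are hypothetical; how the distance is calibrated to the Genant scale is not described in this snippet and is deliberately left out.

```python
import math


def signed_distance(x, w, b):
    """Signed Euclidean distance from point x to the hyperplane w.x + b = 0.

    Positive values lie on the side the normal vector w points to.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("w must be a non-zero vector")
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
```

A severity estimate would then be some monotone function of this distance, fitted on labelled cases.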
Although using significantly fewer labels during training, we achieve 74.12% of the location-supervised SOTA performance on Visual Genome and even outperform the best method on 4D-OR.
Ranked #1 on Scene Graph Generation on 4D-OR
Image synthesis driven by computer graphics has recently achieved remarkable realism, yet synthetic image data generated this way reveals a significant domain gap with respect to real-world data.
Explainability is a key requirement for computer-aided diagnosis systems in clinical decision-making.
Querying the expert to annotate regions of interest in a WSI guides the formation of high-attention regions for MIL.
Recent MIL approaches produce highly informative bag level representations by utilizing the transformer architecture's ability to model the dependencies between instances.
Reliable multi-agent trajectory prediction is crucial for the safe planning and control of autonomous systems.
2 code implementations • 13 Feb 2023 • Chinedu Innocent Nwoye, Tong Yu, Saurav Sharma, Aditya Murali, Deepak Alapatt, Armine Vardazaryan, Kun Yuan, Jonas Hajek, Wolfgang Reiter, Amine Yamlahi, Finn-Henri Smidt, Xiaoyang Zou, Guoyan Zheng, Bruno Oliveira, Helena R. Torres, Satoshi Kondo, Satoshi Kasai, Felix Holm, Ege Özsoy, Shuangchun Gui, Han Li, Sista Raviteja, Rachana Sathish, Pranav Poudel, Binod Bhattarai, Ziheng Wang, Guo Rui, Melanie Schellenberg, João L. Vilaça, Tobias Czempiel, Zhenkun Wang, Debdoot Sheet, Shrawan Kumar Thapa, Max Berniker, Patrick Godau, Pedro Morais, Sudarshan Regmi, Thuy Nuong Tran, Jaime Fonseca, Jan-Hinrich Nölke, Estevão Lima, Eduard Vazquez, Lena Maier-Hein, Nassir Navab, Pietro Mascagni, Barbara Seeliger, Cristians Gonzalez, Didier Mutter, Nicolas Padoy
This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection.
Ranked #1 on Action Triplet Detection on CholecT50 (Challenge)
However, the various types of breast tissue, such as glandular, fat, and lesions, differ in sound speed.
Kinematic data of a colonoscope and the colon, including positions and directions of their centerlines, are obtained using electromagnetic and depth sensors.
This in turn enables our method to employ a one-stage upsampling paradigm without the need for coarse and fine reconstruction.
In our work, we discuss direction-dependent changes in the scene and show that a physics-inspired rendering improves the fidelity of US image synthesis.
In this work, we propose a framework for autonomous robotic navigation for subretinal injection, based on intelligent real-time processing of iOCT volumes.
In this paper, we introduce neural texture learning for 6D object pose estimation from synthetic data and a few unlabelled real images.
We introduce a zero-shot split for Tabletop Objects Dataset (TOD-Z) to enable this study and present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments.
no code implementations • 20 Dec 2022 • HyunJun Jung, Shun-Cheng Wu, Patrick Ruhkamp, Guangyao Zhai, Hannah Schieber, Giulia Rizzoli, Pengyuan Wang, Hongcheng Zhao, Lorenzo Garattoni, Sven Meier, Daniel Roth, Nassir Navab, Benjamin Busam
Estimating the 6D pose of objects is a major 3D computer vision problem.
Graph representation of objects and their relations in a scene, known as a scene graph, provides a precise and discernible interface to manipulate a scene by modifying the nodes or the edges in the graph.
In contrast to previously proposed fully convolutional models, the proposed model implements residual Squeeze and Excitation modules in the generator architecture.
The purpose of this work is to investigate the hypothesis that we can predict image quality based on its latent representation in the GAN's bottleneck.
6-DoF robotic grasping is a long-standing but unsolved problem.
By doing so, for the first time in panoptic segmentation with unknown objects, our U3HS is trained without unknown categories, reducing assumptions and leaving the settings as unconstrained as in real-life scenarios.
Robotic ultrasound (US) imaging aims at overcoming some of the limitations of free-hand US examinations, e.g., difficulty in guaranteeing intra- and inter-operator repeatability.
To enable this, we make use of realistic ultrasound simulation techniques that allow for instantiation of several independent speckle realizations that represent the exact same tissue, thus allowing for the application of image reconstruction techniques that work with pairs of differently corrupted data.
In this paper, we introduce DA$^2$, the first large-scale dual-arm dexterity-aware dataset for the generation of optimal bimanual grasping pairs for arbitrary large objects.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods while requiring significantly fewer computations.
Various morphological and functional parameters of peripheral nerves and their vascular supply are indicative of pathological changes due to injury or disease.
By considering the consistency information with the diversity in the consistency-based embedding scheme, the proposed method could select more informative samples for labeling in the semi-supervised learning setting.
We find that our proposed pre-training methods help in modeling the data at a patient and population level and improve performance in different fine-tuning tasks on all datasets.
Abdominal aortic aneurysm (AAA) is a vascular disease in which a section of the aorta enlarges, weakening its walls and potentially rupturing the vessel.
Deep learning models used in medical image analysis are prone to raising reliability concerns due to their black-box nature.
Inpainting has recently been proposed as a successful deep learning technique for unsupervised medical image model discovery.
Although most medical centers conduct similar medical imaging tasks, their differences, such as specializations, number of patients, and devices, lead to distinctive data distributions.
Here, we propose a cross-domain adapted autoencoder to extract features in an unsupervised manner on three different datasets of single white blood cells scanned from peripheral blood smears.
Light-sheet fluorescence microscopy (LSFM) is a cutting-edge volumetric imaging technique that allows for three-dimensional imaging of mesoscopic samples with decoupled illumination and detection paths.
To this end, we propose a multi-task method based on U-Net that takes T1-weighted MR images as an input to generate synthetic FDG-PET images and classifies the dementia progression of the patient into cognitive normal (CN), cognitive impairment (MCI), and AD.
We validate TriMix on eight benchmark datasets consisting of natural and medical images, improving on the second-best models by 2.71% and 0.41% for the two data types, respectively.
We propose a novel method to automatically calibrate tracked ultrasound probes.
Despite its broad availability, volumetric information acquisition from Bright-Field Microscopy (BFM) is inherently difficult due to the projective nature of the acquisition process.
Object pose estimation is crucial for robotic applications and augmented reality.
no code implementations • 16 May 2022 • Bailiang Jian, Mohammad Farid Azampour, Francesca De Benetti, Johannes Oberreuter, Christina Bukas, Alexandra S. Gersing, Sarah C. Foreman, Anna-Sophia Dietrich, Jon Rischewski, Jan S. Kirschke, Nassir Navab, Thomas Wendler
We specifically design these losses to depend only on the CT label maps, since automatic vertebra segmentation gives more accurate results in CT than in MRI.
The results demonstrate that the proposed approach can effectively and accurately navigate the probe towards the longitudinal view of vessels.
Sophisticated machine learning (ML) techniques yield promising results, especially in medical applications, where they have been investigated for different tasks to enhance the decision-making process.
Automated segmentation of retinal optical coherence tomography (OCT) images has become an important recent direction in machine learning for medical applications.
Ranked #1 on Retinal OCT Layer Segmentation on Duke SD-OCT (using extra training data)
One challenging property lurking in medical datasets is the imbalanced data distribution, where the frequency of the samples between the different classes is not balanced.
In this work, we propose Graph-in-Graph (GiG), a neural network architecture for protein classification and brain imaging applications that exploits the graph representation of the input data samples and their latent relation.
1 code implementation • 30 Mar 2022 • Paul Engstler, Matthias Keicher, David Schinz, Kristina Mach, Alexandra S. Gersing, Sarah C. Foreman, Sophia S. Goller, Juergen Weissinger, Jon Rischewski, Anna-Sophia Dietrich, Benedikt Wiestler, Jan S. Kirschke, Ashkan Khakzar, Nassir Navab
Do black-box neural network models learn clinically relevant features for fracture diagnosis?
The automation of chest X-ray reporting has garnered significant interest due to the time-consuming nature of the task.
We show that training the agent against the prediction model can significantly improve the semantic features extracted for downstream classification tasks.
We test our method on two medical datasets of patient records, TADPOLE and MIMIC-III, including imaging and non-imaging features and different prediction tasks.
Ranked #1 on Length-of-Stay prediction on MIMIC-III
In this work, we propose a novel data augmentation method for clinical audio datasets based on a conditional Wasserstein Generative Adversarial Network with Gradient Penalty (cWGAN-GP), operating on log-mel spectrograms.
Towards this goal, for the first time, we propose using semantic scene graphs (SSG) to describe and summarize the surgical scene.
Ranked #3 on Scene Graph Generation on 4D-OR
For this purpose, longitudinal self-supervision schemes are explored on clinical longitudinal COVID-19 CT scans.
Dense methods also improved pose estimation in the presence of occlusion.
Algorithmic surgical workflow recognition is an ongoing research field and can be divided into laparoscopic (Internal) and operating room (External) analysis.
Existing datasets from OR cameras are thus far limited in size or modalities acquired, leaving it unclear which sensor modalities are best suited for tasks such as recognizing surgical action from videos.
While 6D object pose estimation has recently made a huge leap forward, most methods can still only handle a single or a handful of different objects, which limits their applications.
Ranked #1 on 6D Pose Estimation on LineMOD (Mean ADD-S metric)
There have been numerous recently proposed methods for monocular depth prediction (MDP) coupled with the equally rapid evolution of benchmarking tools.
The video action segmentation task is regularly explored under weaker forms of supervision, such as transcript supervision, where a list of actions is easier to obtain than dense frame-wise labels.
We propose a novel variational Bayesian formulation for diffeomorphic non-rigid registration of medical images, which learns in an unsupervised way a data-specific similarity metric.
Despite training only on a standard dataset, such as KITTI, augmenting with our vector fields significantly improves the generalization to differently shaped objects and scenes.
Indirect Time-of-Flight (I-ToF) imaging is a widespread way of depth estimation for mobile devices due to its small size and affordable price.
We first present a small sequence of RGB-D images displaying a human-object interaction.
With the advent of deep learning, estimating depth from a single RGB image has recently received a lot of attention, being capable of empowering many different applications ranging from path planning for robotics to computational cinematography.
For this purpose, we present a platform for autonomous trocar docking that combines computer vision and a robotic setup.
We propose MIGS (Meta Image Generation from Scene Graphs), a meta-learning based approach for few-shot image generation from graphs that enables adapting the model to different scenes and increases the image quality by training on diverse sets of tasks.
A novel temporal attention mechanism further processes the local geometric information in a global context across consecutive images.
Accurate and reliable localization is a fundamental requirement for autonomous vehicles to use map information in higher-level tasks such as navigation or planning.
Estimating the uncertainty of a neural network plays a fundamental role in safety-critical settings.
We propose a method to identify features with predictive information in the input domain.
Existing automatic and interactive segmentation models for medical images only use data from a single time point (static).
The results of our experiments show that the proposed method improves the network's performance on real images by a considerable margin and can be employed in 3D reconstruction pipelines.
In this work, we present MetaMedSeg, a gradient-based meta-learning algorithm that redefines the meta-learning task for the volumetric medical data with the goal to capture the variety between the slices.
Sickle cell disease (SCD) is a severe genetic hemoglobin disorder that results in premature destruction of red blood cells.
Scene graphs are representations of a scene, composed of objects (nodes) and inter-object relationships (edges), proven to be particularly suited for this task, as they allow for semantic control on the generated content.
Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (e.g., the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem.
Ranked #1 on 6D Pose Estimation using RGB on Occlusion LineMOD
Scene graphs, composed of nodes as objects and directed-edges as relationships among objects, offer an alternative representation of a scene that is more semantically grounded than images.
no code implementations • 10 Aug 2021 • Markus Krönke, Christine Eilers, Desislava Dimova, Melanie Köhler, Gabriel Buschner, Lilit Mirzojan, Lemonia Konstantinidou, Marcus R. Makowski, James Nagarajah, Nassir Navab, Wolfgang Weber, Thomas Wendler
Conclusion: Tracked 3D ultrasound combined with a CNN segmentation significantly reduces interobserver variability in thyroid volumetry and increases the accuracy of the measurements with shorter acquisition times.
While self-supervised monocular depth estimation in driving scenarios has achieved comparable performance to supervised approaches, violations of the static world assumption can still lead to erroneous depth predictions of traffic participants, posing a potential safety issue.
no code implementations • 29 Jul 2021 • Matthias Keicher, Hendrik Burwinkel, David Bani-Harouni, Magdalini Paschali, Tobias Czempiel, Egon Burian, Marcus R. Makowski, Rickmer Braren, Nassir Navab, Thomas Wendler
Specifically, we introduce a multimodal similarity metric to build a population graph for clustering patients and an image-based end-to-end Graph Attention Network to process this graph and predict the COVID-19 patient outcomes: admission to ICU, need for ventilation and mortality.
Derived regions are consistent across different images and coincide with human-defined semantic classes on some datasets.
In this work, we introduce Deep Direct Volume Rendering (DeepDVR), a generalization of DVR that allows for the integration of deep neural networks into the DVR algorithm.
We then use MSSG to introduce a dynamically generated graphical user interface tool for surgical procedure analysis which could be used for many applications including process optimization, OR design and automatic report generation.
Medical Ultrasound (US), despite its wide use, is characterized by artifacts and operator dependency.
The soft pseudo-labels are then used to train a deep student network for disease prediction of unseen test data for which the graph modality is unavailable.
We present our findings using publicly available chest pathologies (CheXpert, NIH ChestX-ray8) and COVID-19 datasets (BrixIA, and COVID-19 chest X-ray segmentation dataset).
Neural networks have demonstrated remarkable performance in classification and regression tasks on chest X-rays.
Is critical input information encoded in specific sparse pathways within the neural network?
The main novelty lies in the interpretable attention module (IAM), which directly operates on multi-modal features.
Scene graphs are a compact and explicit representation successfully used in a variety of 2D scene understanding tasks.
Ranked #1 on 3D Object Classification on 3R-Scan
Disentangled representations can be useful in many downstream tasks, help to make deep learning models more interpretable, and allow for control over features of synthetically generated images that can be useful in training other models that require a large number of labelled or unlabelled data.
Hereditary hemolytic anemias are genetic disorders that affect the shape and density of red blood cells.
The method leverages the availability of labelled data in a different domain.
Chest computed tomography (CT) has played an essential diagnostic role in assessing patients with COVID-19 by showing disease-specific image features such as ground-glass opacity and consolidation.
1 code implementation • 12 Mar 2021 • Christina Bukas, Bailiang Jian, Luis F. Rodriguez Venegas, Francesca De Benetti, Sebastian Ruehling, Anjany Sekuboyina, Jens Gempt, Jan S. Kirschke, Marie Piraud, Johannes Oberreuter, Nassir Navab, Thomas Wendler
The framework uses the patient CT scan and the fractured vertebra label to build a virtual healthy spine using a high-level approach.
no code implementations • 10 Mar 2021 • Florian Kofler, Ivan Ezhov, Fabian Isensee, Fabian Balsiger, Christoph Berger, Maximilian Koerner, Beatrice Demiray, Julia Rackerseder, Johannes Paetzold, Hongwei Li, Suprosanna Shit, Richard McKinley, Marie Piraud, Spyridon Bakas, Claus Zimmer, Nassir Navab, Jan Kirschke, Benedikt Wiestler, Bjoern Menze
It is often unclear how to optimize abstract metrics, such as human expert perception, in convolutional neural network (CNN) training.
In this paper we introduce OperA, a transformer-based model that accurately predicts surgical phases from long video sequences.
With few annotated data, FedPerl is on par with a state-of-the-art method in skin lesion classification in the standard setup while outperforming SSFLs and the baselines by 1.8% and 15.8%, respectively.
This is accomplished by associating a graph-based neural network to each class, which is responsible for weighting the class samples and changing the importance of each sample for the classifier.
For the former we contributed our own dataset composed of five indoor scenes where it is unavoidable to capture images corresponding to views that are hard to uniquely identify.
In this context, we proposed a segmentation refinement method based on uncertainty analysis and graph convolutional networks.
In this work, we empirically show that two approaches for handling the gradient information, namely positive aggregation, and positive propagation, break these methods.
First, a neural network is trained once to detect a set of anatomical landmarks on simulated X-rays.
To address these issues, recently, unsupervised deep anomaly detection methods that train the model on large-sized normal scans and detect abnormal scans by calculating reconstruction error have been reported.
no code implementations • 30 Oct 2020 • Lena Maier-Hein, Matthias Eisenmann, Duygu Sarikaya, Keno März, Toby Collins, Anand Malpani, Johannes Fallert, Hubertus Feussner, Stamatia Giannarou, Pietro Mascagni, Hirenkumar Nakawala, Adrian Park, Carla Pugh, Danail Stoyanov, Swaroop S. Vedula, Kevin Cleary, Gabor Fichtinger, Germain Forestier, Bernard Gibaud, Teodor Grantcharov, Makoto Hashizume, Doreen Heckmann-Nötzel, Hannes G. Kenngott, Ron Kikinis, Lars Mündermann, Nassir Navab, Sinan Onogur, Raphael Sznitman, Russell H. Taylor, Minu D. Tizabi, Martin Wagner, Gregory D. Hager, Thomas Neumuth, Nicolas Padoy, Justin Collins, Ines Gockel, Jan Goedeke, Daniel A. Hashimoto, Luc Joyeux, Kyle Lam, Daniel R. Leff, Amin Madani, Hani J. Marcus, Ozanan Meireles, Alexander Seitel, Dogu Teber, Frank Ückert, Beat P. Müller-Stich, Pierre Jannin, Stefanie Speidel
We further complement this technical perspective with (4) a review of currently available SDS products and the translational progress from academia and (5) a roadmap for faster clinical translation and exploitation of the full potential of SDS, based on an international multi-round Delphi process.
Panoptic segmentation has recently unified semantic and instance segmentation, previously addressed separately, thus taking a step further towards creating more comprehensive and efficient perception systems.
We propose a framework that ameliorates this issue by performing scene reconstruction and semantic scene completion jointly in an incremental and real-time manner, based on an input sequence of depth maps.
3D point clouds are a rich source of information that enjoys growing popularity in the vision community.
This work proposes an RGB-D SLAM system specifically designed for structured environments and aimed at improved tracking and mapping accuracy by relying on geometric features that are extracted from the surroundings.