Search Results for author: Didier Mutter

Found 37 papers, 18 papers with code

fine-CLIP: Enhancing Zero-Shot Fine-Grained Surgical Action Recognition with Vision-Language Models

no code implementations25 Mar 2025 Saurav Sharma, Didier Mutter, Nicolas Padoy

While vision-language models like CLIP have advanced zero-shot surgical phase recognition, they struggle with fine-grained surgical activities, especially action triplets.

Action Recognition Surgical phase recognition +2

Learning from Synchronization: Self-Supervised Uncalibrated Multi-View Person Association in Challenging Scenes

1 code implementation17 Mar 2025 Keqi Chen, Vinkle Srivastav, Didier Mutter, Nicolas Padoy

Specifically, we propose a self-supervised learning framework, consisting of an encoder-decoder model and a self-supervised pretext task, cross-view image synchronization, which aims to distinguish whether two images from different views are captured at the same time.

Person Re-Identification Self-Supervised Learning
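
A minimal sketch of the cross-view synchronization pretext task described above: a shared encoder embeds two frames from different cameras and a small head predicts whether they were captured at the same time. All names and sizes here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SyncClassifier(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        # Shared image encoder (kept tiny here; any CNN backbone would do).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # Binary head over the concatenated pair of view embeddings.
        self.head = nn.Linear(2 * feat_dim, 1)

    def forward(self, view_a, view_b):
        za, zb = self.encoder(view_a), self.encoder(view_b)
        return self.head(torch.cat([za, zb], dim=-1)).squeeze(-1)

model = SyncClassifier()
a = torch.randn(4, 3, 224, 224)          # frames from camera A
b = torch.randn(4, 3, 224, 224)          # frames from camera B
labels = torch.tensor([1., 0., 1., 0.])  # 1 = same timestamp, 0 = time-shifted pair
loss = nn.functional.binary_cross_entropy_with_logits(model(a, b), labels)
loss.backward()
```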

S4M: Segment Anything with 4 Extreme Points

no code implementations7 Mar 2025 Adrien Meyer, Lorenzo Arboit, Giuseppe Massimiani, Francesco Brucchi, Luca Emanuele Amodio, Didier Mutter, Nicolas Padoy

To address this, we introduce S4M (Segment Anything with 4 Extreme Points), which augments SAM by leveraging extreme points -- the top-, bottom-, left-, and right-most points of an instance -- as prompts.

Image Segmentation Instance Segmentation +1
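
For clarity, here is an illustrative sketch (not the S4M code) of deriving the four extreme points of a binary instance mask; a SAM-style promptable model could then consume them as point prompts.

```python
import numpy as np

def extreme_points(mask: np.ndarray) -> np.ndarray:
    """Return (x, y) coordinates of the left-, right-, top- and bottom-most
    foreground pixels of a boolean instance mask."""
    ys, xs = np.nonzero(mask)
    left = (xs.min(), ys[xs.argmin()])
    right = (xs.max(), ys[xs.argmax()])
    top = (xs[ys.argmin()], ys.min())
    bottom = (xs[ys.argmax()], ys.max())
    return np.array([left, right, top, bottom])

mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 3:7] = True
print(extreme_points(mask))  # four (x, y) point prompts for the instance
```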

Multi-view Video-Pose Pretraining for Operating Room Surgical Activity Recognition

no code implementations19 Feb 2025 Idris Hamoud, Vinkle Srivastav, Muhammad Abdullah Jamal, Didier Mutter, Omid Mohareri, Nicolas Padoy

We highlight the benefits of our approach for surgical activity recognition in both multi-view and single-view settings, showcasing its practical applicability in complex surgical environments.

Activity Recognition

When do they StOP?: A First Step Towards Automatically Identifying Team Communication in the Operating Room

1 code implementation12 Feb 2025 Keqi Chen, Lilien Schewski, Vinkle Srivastav, Joël Lavanchy, Didier Mutter, Guido Beldi, Sandra Keller, Nicolas Padoy

Conclusion: We investigate the Team Time-Out and the StOP?-protocol in the OR by presenting the first OR dataset with temporal annotations of group activity protocols and introducing a novel group activity detection approach that outperforms existing approaches.

Action Detection Activity Detection +1

UltraSam: A Foundation Model for Ultrasound using Large Open-Access Segmentation Datasets

1 code implementation25 Nov 2024 Adrien Meyer, Aditya Murali, Didier Mutter, Nicolas Padoy

Methods: We compile US-43d, a large-scale collection of 43 open-access ultrasound datasets with over 280,000 images and segmentation masks for more than 50 anatomical structures.

Segmentation

The Endoscapes Dataset for Surgical Scene Segmentation, Object Detection, and Critical View of Safety Assessment: Official Splits and Benchmark

1 code implementation19 Dec 2023 Aditya Murali, Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Guido Costamagna, Didier Mutter, Jacques Marescaux, Bernard Dallemagne, Nicolas Padoy

This technical report provides a detailed overview of Endoscapes, a dataset of laparoscopic cholecystectomy (LC) videos with highly intricate annotations targeted at automated assessment of the Critical View of Safety (CVS).

Anatomy Instance Segmentation +4

Challenges in Multi-centric Generalization: Phase and Step Recognition in Roux-en-Y Gastric Bypass Surgery

1 code implementation18 Dec 2023 Joel L. Lavanchy, Sanat Ramesh, Diego Dall'Alba, Cristians Gonzalez, Paolo Fiorini, Beat Muller-Stich, Philipp C. Nett, Jacques Marescaux, Didier Mutter, Nicolas Padoy

The use of multi-centric training data (experiments 6 and 7) improves the generalization capabilities of the models, bringing them beyond the level of independent mono-centric training and validation (experiments 1 and 2).

Activity Recognition

Encoding Surgical Videos as Latent Spatiotemporal Graphs for Object and Anatomy-Driven Reasoning

1 code implementation11 Dec 2023 Aditya Murali, Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Didier Mutter, Nicolas Padoy

Recently, spatiotemporal graphs have emerged as a concise and elegant manner of representing video clips in an object-centric fashion, and have been shown to be useful for downstream tasks such as action recognition.

Action Recognition Anatomy +3
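
A minimal sketch of what such an object-centric spatiotemporal graph could look like (an assumed structure for illustration, not the authors' implementation): nodes are per-frame object detections, spatial edges link objects co-occurring in a frame, and temporal edges link the same track across consecutive frames.

```python
from dataclasses import dataclass, field

@dataclass
class STGraph:
    nodes: list = field(default_factory=list)   # (frame_idx, track_id, feature)
    edges: list = field(default_factory=list)   # (node_i, node_j, kind)

    def add_frame(self, frame_idx, detections):
        start = len(self.nodes)
        for track_id, feat in detections:
            self.nodes.append((frame_idx, track_id, feat))
        # Spatial edges between all objects detected in this frame.
        for i in range(start, len(self.nodes)):
            for j in range(i + 1, len(self.nodes)):
                self.edges.append((i, j, "spatial"))
        # Temporal edges to the same track in the previous frame.
        for i in range(start, len(self.nodes)):
            for j in range(start):
                if self.nodes[j][0] == frame_idx - 1 and self.nodes[j][1] == self.nodes[i][1]:
                    self.edges.append((j, i, "temporal"))

g = STGraph()
g.add_frame(0, [("tool", [0.1]), ("gallbladder", [0.4])])
g.add_frame(1, [("tool", [0.2])])
print(len(g.nodes), len(g.edges))  # 3 nodes; 1 spatial + 1 temporal edge
```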

TRUSTED: The Paired 3D Transabdominal Ultrasound and CT Human Data for Kidney Segmentation and Registration Research

no code implementations19 Oct 2023 William Ndzimbong, Cyril Fourniol, Loic Themyr, Nicolas Thome, Yvonne Keeza, Beniot Sauer, Pierre-Thierry Piechaud, Arnaud Mejean, Jacques Marescaux, Daniel George, Didier Mutter, Alexandre Hostettler, Toby Collins

To validate the dataset's utility, 5 competitive Deep Learning models for automatic kidney segmentation were benchmarked, yielding average DICE scores from 83.2% to 89.1% for CT, and 61.9% to 79.4% for US images.

Image Registration Image Segmentation +2
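
For reference, the Dice score used to benchmark these segmentation models is the standard overlap measure; the snippet below shows the formula, not code from the paper.

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2|P ∩ G| / (|P| + |G|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return float(2.0 * inter / (pred.sum() + gt.sum() + eps))

p = np.array([[1, 1, 0], [0, 1, 0]])
g = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice(p, g), 3))  # 0.667
```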

Surgical Action Triplet Detection by Mixed Supervised Learning of Instrument-Tissue Interactions

1 code implementation18 Jul 2023 Saurav Sharma, Chinedu Innocent Nwoye, Didier Mutter, Nicolas Padoy

We analyze how the amount of instrument spatial annotations affects triplet detection and observe that accurate instrument localization does not guarantee better triplet detection due to the risk of erroneous associations with the verbs and targets.

Action Triplet Detection Triplet

Weakly Supervised Temporal Convolutional Networks for Fine-grained Surgical Activity Recognition

no code implementations21 Feb 2023 Sanat Ramesh, Diego Dall'Alba, Cristians Gonzalez, Tong Yu, Pietro Mascagni, Didier Mutter, Jacques Marescaux, Paolo Fiorini, Nicolas Padoy

In this work, we propose to use coarser and easier-to-annotate activity labels, namely phases, as weak supervision to learn step recognition with fewer step annotated videos.

Activity Recognition

Preserving Privacy in Surgical Video Analysis Using Artificial Intelligence: A Deep Learning Classifier to Identify Out-of-Body Scenes in Endoscopic Videos

no code implementations17 Jan 2023 Joël L. Lavanchy, Armine Vardazaryan, Pietro Mascagni, AI4SafeChole Consortium, Didier Mutter, Nicolas Padoy

Results: The internal dataset consisting of 356,267 images from 48 videos and the two multicentric test datasets consisting of 54,385 and 58,349 images from 10 and 20 videos, respectively, were annotated.

Latent Graph Representations for Critical View of Safety Assessment

1 code implementation8 Dec 2022 Aditya Murali, Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Didier Mutter, Nicolas Padoy

Assessing the critical view of safety in laparoscopic cholecystectomy requires accurate identification and localization of key anatomical structures, reasoning about their geometric relationships to one another, and determining the quality of their exposure.

Anatomy Graph Neural Network +3

Rendezvous in Time: An Attention-based Temporal Fusion approach for Surgical Triplet Recognition

1 code implementation30 Nov 2022 Saurav Sharma, Chinedu Innocent Nwoye, Didier Mutter, Nicolas Padoy

Focusing more on the verbs, our RiT explores the connectedness of current and past frames to learn temporal attention-based features for enhanced triplet recognition.

Action Triplet Recognition Triplet
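
A hedged sketch of attention-based temporal fusion in the spirit described above (names and sizes are illustrative assumptions, not the RiT architecture): the current frame feature queries a window of past frame features to produce a temporally enriched embedding for triplet classification.

```python
import torch
import torch.nn as nn

class TemporalFusion(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, current, past):
        # current: (B, 1, D) query; past: (B, T, D) keys/values
        fused, _ = self.attn(query=current, key=past, value=past)
        return fused.squeeze(1)  # (B, D) temporally fused feature

fusion = TemporalFusion()
current = torch.randn(2, 1, 256)    # feature of the current frame
past = torch.randn(2, 8, 256)       # features of the 8 preceding frames
print(fusion(current, past).shape)  # torch.Size([2, 256])
```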

Live Laparoscopic Video Retrieval with Compressed Uncertainty

no code implementations8 Mar 2022 Tong Yu, Pietro Mascagni, Juan Verde, Jacques Marescaux, Didier Mutter, Nicolas Padoy

Searching through large volumes of medical data to retrieve relevant information is a challenging yet crucial task for clinical care.

Retrieval Video Retrieval

Rendezvous: Attention Mechanisms for the Recognition of Surgical Action Triplets in Endoscopic Videos

8 code implementations7 Sep 2021 Chinedu Innocent Nwoye, Tong Yu, Cristians Gonzalez, Barbara Seeliger, Pietro Mascagni, Didier Mutter, Jacques Marescaux, Nicolas Padoy

To achieve this task, we introduce our new model, the Rendezvous (RDV), which recognizes triplets directly from surgical videos by leveraging attention at two different levels.

Action Triplet Recognition Triplet

Multi-Task Temporal Convolutional Networks for Joint Recognition of Surgical Phases and Steps in Gastric Bypass Procedures

no code implementations24 Feb 2021 Sanat Ramesh, Diego Dall'Alba, Cristians Gonzalez, Tong Yu, Pietro Mascagni, Didier Mutter, Jacques Marescaux, Paolo Fiorini, Nicolas Padoy

Conclusion: In this work, we present a multi-task multi-stage temporal convolutional network for surgical activity recognition, which shows improved results compared to single-task models on the Bypass40 gastric bypass dataset with multi-level annotations.

Activity Recognition
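
A minimal sketch of the multi-task idea (an assumed layout, not the paper's exact multi-stage architecture): one shared dilated temporal-convolution trunk over per-frame features, with separate classification heads for phases and steps; class counts are illustrative.

```python
import torch
import torch.nn as nn

class MultiTaskTCN(nn.Module):
    def __init__(self, feat_dim=2048, hidden=64, n_phases=11, n_steps=44):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv1d(feat_dim, hidden, kernel_size=1),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
        )
        self.phase_head = nn.Conv1d(hidden, n_phases, kernel_size=1)
        self.step_head = nn.Conv1d(hidden, n_steps, kernel_size=1)

    def forward(self, feats):            # feats: (B, feat_dim, T)
        h = self.trunk(feats)
        return self.phase_head(h), self.step_head(h)

model = MultiTaskTCN()
feats = torch.randn(1, 2048, 300)        # 300 frames of visual features
phases, steps = model(feats)
print(phases.shape, steps.shape)         # (1, 11, 300) and (1, 44, 300)
```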

Weakly Supervised Convolutional LSTM Approach for Tool Tracking in Laparoscopic Videos

1 code implementation4 Dec 2018 Chinedu Innocent Nwoye, Didier Mutter, Jacques Marescaux, Nicolas Padoy

Results: We build a baseline tracker on top of the CNN model and demonstrate that our approach based on the ConvLSTM outperforms the baseline in tool presence detection, spatial localization, and motion tracking by over 5.0%, 13.9%, and 12.6%, respectively.

Instrument Recognition Surgical tool detection +2

Learning from a tiny dataset of manual annotations: a teacher/student approach for surgical phase recognition

1 code implementation30 Nov 2018 Tong Yu, Didier Mutter, Jacques Marescaux, Nicolas Padoy

Vision algorithms capable of interpreting scenes from a real-time video stream are necessary for computer-assisted surgery systems to achieve context-aware behavior.

Online surgical phase recognition

Future-State Predicting LSTM for Early Surgery Type Recognition

no code implementations28 Nov 2018 Siddharth Kannan, Gaurav Yengera, Didier Mutter, Jacques Marescaux, Nicolas Padoy

This work presents a novel approach for the early recognition of the type of a laparoscopic surgery from its video.

Vocal Bursts Type Prediction

Weakly-Supervised Learning for Tool Localization in Laparoscopic Videos

1 code implementation14 Jun 2018 Armine Vardazaryan, Didier Mutter, Jacques Marescaux, Nicolas Padoy

We propose a deep architecture, trained solely on image level annotations, that can be used for both tool presence detection and localization in surgical videos.

Surgical tool detection Weakly-supervised Learning
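
To illustrate the general weakly supervised idea (architecture details are assumptions, not taken from the paper): a fully convolutional network outputs a spatial map per tool; spatial max-pooling turns the maps into image-level presence logits for training, while the maps themselves provide a localization signal.

```python
import torch
import torch.nn as nn

class WeakToolLocalizer(nn.Module):
    def __init__(self, n_tools=7):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Conv2d(64, n_tools, kernel_size=1)  # per-tool maps

    def forward(self, images):
        maps = self.classifier(self.backbone(images))  # (B, n_tools, H', W')
        logits = maps.amax(dim=(2, 3))                 # image-level presence scores
        return logits, maps

model = WeakToolLocalizer()
images = torch.randn(2, 3, 240, 432)
logits, maps = model(images)
targets = torch.randint(0, 2, (2, 7)).float()          # image-level labels only
loss = nn.functional.binary_cross_entropy_with_logits(logits, targets)
loss.backward()
```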

Less is More: Surgical Phase Recognition with Less Annotations through Self-Supervised Pre-training of CNN-LSTM Networks

no code implementations22 May 2018 Gaurav Yengera, Didier Mutter, Jacques Marescaux, Nicolas Padoy

In this work, we propose a new self-supervised pre-training approach based on the prediction of remaining surgery duration (RSD) from laparoscopic videos.

Management Surgical phase recognition
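
A minimal sketch of RSD-style pre-training as described (illustrative, not the authors' exact pipeline): a CNN-LSTM regresses the remaining surgery duration for every frame, a supervisory signal that requires no manual annotation.

```python
import torch
import torch.nn as nn

class RSDRegressor(nn.Module):
    def __init__(self, feat_dim=512, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=4, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)    # remaining minutes per frame

    def forward(self, frames):              # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out).squeeze(-1)   # (B, T) RSD predictions

model = RSDRegressor()
frames = torch.randn(1, 10, 3, 112, 112)
rsd_targets = torch.linspace(45.0, 36.0, 10).unsqueeze(0)  # minutes remaining
loss = nn.functional.smooth_l1_loss(model(frames), rsd_targets)
loss.backward()
```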

RSDNet: Learning to Predict Remaining Surgery Duration from Laparoscopic Videos Without Manual Annotations

1 code implementation9 Feb 2018 Andru Putra Twinanda, Gaurav Yengera, Didier Mutter, Jacques Marescaux, Nicolas Padoy

In this paper, we propose a deep learning pipeline, referred to as RSDNet, which automatically estimates the remaining surgery duration (RSD) intraoperatively by using only visual information from laparoscopic videos.

Single- and Multi-Task Architectures for Tool Presence Detection Challenge at M2CAI 2016

no code implementations27 Oct 2016 Andru P. Twinanda, Didier Mutter, Jacques Marescaux, Michel de Mathelin, Nicolas Padoy

The tool presence detection challenge at M2CAI 2016 consists of identifying the presence/absence of seven surgical tools in the images of cholecystectomy videos.

Single- and Multi-Task Architectures for Surgical Workflow Challenge at M2CAI 2016

no code implementations27 Oct 2016 Andru P. Twinanda, Didier Mutter, Jacques Marescaux, Michel de Mathelin, Nicolas Padoy

On top of these architectures we propose to use two different approaches to enforce the temporal constraints of the surgical workflow: (1) HMM-based and (2) LSTM-based pipelines.

Surgical phase recognition
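
A sketch of the LSTM-based option for enforcing the temporal constraints of the surgical workflow (the HMM variant is analogous); layer sizes and class counts are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class PhaseLSTM(nn.Module):
    def __init__(self, feat_dim=2048, hidden=128, n_phases=8):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_phases)

    def forward(self, frame_feats):          # (B, T, feat_dim) per-frame CNN features
        out, _ = self.lstm(frame_feats)
        return self.head(out)                # per-frame phase logits, (B, T, n_phases)

model = PhaseLSTM()
feats = torch.randn(1, 500, 2048)            # one video, 500 frames of features
print(model(feats).argmax(dim=-1).shape)     # temporally contextualized phase labels
```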
