1 code implementation • 1 Jul 2022 • Sanat Ramesh, Vinkle Srivastav, Deepak Alapatt, Tong Yu, Aditya Murali, Luca Sestini, Chinedu Innocent Nwoye, Idris Hamoud, Antoine Fleurentin, Georgios Exarchakis, Alexandros Karargyris, Nicolas Padoy
Correct transfer of these methods to surgery, as described and conducted in this work, yields substantial performance gains over generic uses of SSL - up to 7% on phase recognition and 20% on tool presence detection - and outperforms state-of-the-art semi-supervised phase recognition approaches by up to 14%.
We also develop a metrics library, ivtmetrics, for model evaluation on surgical triplets.
Ranked #1 on Action Triplet Recognition on CholecT45
5 code implementations • 10 Apr 2022 • Chinedu Innocent Nwoye, Deepak Alapatt, Tong Yu, Armine Vardazaryan, Fangfang Xia, Zixuan Zhao, Tong Xia, Fucang Jia, Yuxuan Yang, Hao Wang, Derong Yu, Guoyan Zheng, Xiaotian Duan, Neil Getty, Ricardo Sanchez-Matilla, Maria Robu, Li Zhang, Huabin Chen, Jiacheng Wang, Liansheng Wang, Bokai Zhang, Beerend Gerats, Sista Raviteja, Rachana Sathish, Rong Tao, Satoshi Kondo, Winnie Pang, Hongliang Ren, Julian Ronald Abbing, Mohammad Hasan Sarhan, Sebastian Bodenstedt, Nithya Bhasker, Bruno Oliveira, Helena R. Torres, Li Ling, Finn Gaida, Tobias Czempiel, João L. Vilaça, Pedro Morais, Jaime Fonseca, Ruby Mae Egging, Inge Nicole Wijma, Chen Qian, GuiBin Bian, Zhen Li, Velmurugan Balasubramanian, Debdoot Sheet, Imanol Luengo, Yuanbo Zhu, Shuai Ding, Jakob-Anton Aschenbrenner, Nicolas Elini van der Kar, Mengya Xu, Mobarakol Islam, Lalithkumar Seenivasan, Alexander Jenke, Danail Stoyanov, Didier Mutter, Pietro Mascagni, Barbara Seeliger, Cristians Gonzalez, Nicolas Padoy
In this paper, we present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
Ranked #1 on Action Triplet Recognition on CholecT50 (Challenge) (using extra training data)
With these constraints in mind, we propose FedCy, a federated semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos, thereby improving performance on the task of surgical phase recognition.
Searching through large volumes of medical data to retrieve relevant information is a challenging yet crucial task for clinical care.
We then use the obtained instrument masks as pseudo-labels in order to train a per-frame segmentation model; to this aim, we develop a learning-from-noisy-labels architecture, designed to extract a clean supervision signal from these pseudo-labels, leveraging their peculiar noise properties.
no code implementations • 27 Dec 2021 • Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Didier Mutter, Jacques Marescaux, Guido Costamagna, Bernard Dallemagne, Nicolas Padoy
A major obstacle to building models for effective semantic segmentation, and particularly video semantic segmentation, is a lack of large and well annotated datasets.
no code implementations • 21 Oct 2021 • Imanol Luengo, Maria Grammatikopoulou, Rahim Mohammadi, Chris Walsh, Chinedu Innocent Nwoye, Deepak Alapatt, Nicolas Padoy, Zhen-Liang Ni, Chen-Chen Fan, Gui-Bin Bian, Zeng-Guang Hou, Heonjin Ha, Jiacheng Wang, Haojie Wang, Dong Guo, Lu Wang, Guotai Wang, Mobarakol Islam, Bharat Giddwani, Ren Hongliang, Theodoros Pissas, Claudio Ravasio, Martin Huber, Jeremy Birch, Joan M. Nunez Do Rio, Lyndon Da Cruz, Christos Bergeles, Hongyu Chen, Fucang Jia, Nikhil KumarTomar, Debesh Jha, Michael A. Riegler, Pal Halvorsen, Sophia Bano, Uddhav Vaghela, Jianyuan Hong, Haili Ye, Feihong Huang, Da-Han Wang, Danail Stoyanov
In 2020, we released pixel-wise semantic annotations for anatomy and instruments for 4670 images sampled from 25 videos of the CATARACTS training set.
no code implementations • 30 Sep 2021 • Martin Wagner, Beat-Peter Müller-Stich, Anna Kisilenko, Duc Tran, Patrick Heger, Lars Mündermann, David M Lubotsky, Benjamin Müller, Tornike Davitashvili, Manuela Capek, Annika Reinke, Tong Yu, Armine Vardazaryan, Chinedu Innocent Nwoye, Nicolas Padoy, Xinyang Liu, Eung-Joo Lee, Constantin Disch, Hans Meine, Tong Xia, Fucang Jia, Satoshi Kondo, Wolfgang Reiter, Yueming Jin, Yonghao Long, Meirui Jiang, Qi Dou, Pheng Ann Heng, Isabell Twick, Kadir Kirtac, Enes Hosgor, Jon Lindström Bolmgren, Michael Stenzel, Björn von Siemens, Hannes G. Kenngott, Felix Nickel, Moritz von Frankenberg, Franziska Mathis-Ullrich, Lena Maier-Hein, Stefanie Speidel, Sebastian Bodenstedt
PURPOSE: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems.
1 code implementation • 29 Sep 2021 • Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, Srini Bala, Daniel J. Beutel, Victor Bittorf, Akshay Chaudhari, Alexander Chowdhury, Cody Coleman, Bala Desinghu, Gregory Diamos, Debo Dutta, Diane Feddema, Grigori Fursin, Junyi Guo, Xinyuan Huang, David Kanter, Satyananda Kashyap, Nicholas Lane, Indranil Mallick, Pietro Mascagni, Virendra Mehta, Vivek Natarajan, Nikola Nikolov, Nicolas Padoy, Gennady Pekhimenko, Vijay Janapa Reddi, G Anthony Reina, Pablo Ribalta, Jacob Rosenthal, Abhishek Singh, Jayaraman J. Thiagarajan, Anna Wuest, Maria Xenochristou, Daguang Xu, Poonam Yadav, Michael Rosenthal, Massimo Loda, Jason M. Johnson, Peter Mattson
Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience.
To achieve this task, we introduce our new model, the Rendezvous (RDV), which recognizes triplets directly from surgical videos by leveraging attention at two different levels.
Ranked #1 on Action Triplet Recognition on CholecT50
Second, to address the domain shift and the lack of annotations, we propose a novel unsupervised domain adaptation method, called AdaptOR, to adapt a model from an in-the-wild labeled source domain to a statistically different unlabeled target domain.
Minimally invasive image-guided surgery heavily relies on vision.
no code implementations • 6 Apr 2021 • Pietro Mascagni, Maria Rita Rodriguez-Luna, Takeshi Urade, Emanuele Felli, Patrick Pessaux, Didier Mutter, Jacques Marescaux, Guido Costamagna, Bernard Dallemagne, Nicolas Padoy
The primary endpoint was to compare the rate of CVS achievement between LCs performed in the year before and the year after the 5-second rule.
3-D pose estimation of instruments is a crucial step towards automatic scene understanding in robotic minimally invasive surgery.
Conclusion: In this work, we present a multi-task multi-stage temporal convolutional network for surgical activity recognition, which shows improved results compared to single-task models on the Bypass40 gastric bypass dataset with multi-level annotations.
no code implementations • 30 Oct 2020 • Lena Maier-Hein, Matthias Eisenmann, Duygu Sarikaya, Keno März, Toby Collins, Anand Malpani, Johannes Fallert, Hubertus Feussner, Stamatia Giannarou, Pietro Mascagni, Hirenkumar Nakawala, Adrian Park, Carla Pugh, Danail Stoyanov, Swaroop S. Vedula, Kevin Cleary, Gabor Fichtinger, Germain Forestier, Bernard Gibaud, Teodor Grantcharov, Makoto Hashizume, Doreen Heckmann-Nötzel, Hannes G. Kenngott, Ron Kikinis, Lars Mündermann, Nassir Navab, Sinan Onogur, Raphael Sznitman, Russell H. Taylor, Minu D. Tizabi, Martin Wagner, Gregory D. Hager, Thomas Neumuth, Nicolas Padoy, Justin Collins, Ines Gockel, Jan Goedeke, Daniel A. Hashimoto, Luc Joyeux, Kyle Lam, Daniel R. Leff, Amin Madani, Hani J. Marcus, Ozanan Meireles, Alexander Seitel, Dogu Teber, Frank Ückert, Beat P. Müller-Stich, Pierre Jannin, Stefanie Speidel
We further complement this technical perspective with (4) a review of currently available SDS products and the translational progress from academia and (5) a roadmap for faster clinical translation and exploitation of the full potential of SDS, based on an international multi-round Delphi process.
Deep neural networks power most recent successes of artificial intelligence, spanning from self-driving cars to computer aided diagnosis in radiology and pathology.
Human pose estimation (HPE) is a key building block for developing AI-based context-aware systems inside the operating room (OR).
2D/3D human pose estimation is needed to develop novel intelligent tools for the operating room that can analyze and support the clinical activities.
Recognition of surgical activity is an essential component to develop context-aware decision support for the operating room.
Ranked #1 on Action Triplet Recognition on CholecT40
Data-driven computational approaches have evolved to enable the extraction of information from medical images with a reliability, accuracy, and speed that are already transforming their interpretation and exploitation in clinical practice.
Results: We build a baseline tracker on top of the CNN model and demonstrate that our approach based on the ConvLSTM outperforms the baseline in tool presence detection, spatial localization, and motion tracking by over 5.0%, 13.9%, and 12.6%, respectively.
Ranked #1 on Surgical tool detection on Cholec80
Vision algorithms capable of interpreting scenes from a real-time video stream are necessary for computer-assisted surgery systems to achieve context-aware behavior.
Methods: We compare six state-of-the-art face detectors on clinical data using Multi-View Operating Room Faces (MVOR-Faces), a dataset of operating room images capturing real surgical activities.
This work presents a novel approach for the early recognition of the type of a laparoscopic surgery from its video.
In this paper, we present the dataset, its annotations, as well as baseline results from several recent person detection and 2D/3D pose estimation methods.
We propose a deep architecture, trained solely on image level annotations, that can be used for both tool presence detection and localization in surgical videos.
Ranked #3 on Surgical tool detection on Cholec80
In this work, we propose a new self-supervised pre-training approach based on the prediction of remaining surgery duration (RSD) from laparoscopic videos.
As 2D poses are collected at test time using an SV pose detector, which might generate inaccurate detections, we model its characteristics and incorporate this information during training.
In this paper, we propose a deep learning pipeline, referred to as RSDNet, which automatically estimates the remaining surgery duration (RSD) intraoperatively by using only visual information from laparoscopic videos.
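The excerpt does not describe how RSDNet's regression targets are defined; purely as an illustration, per-frame remaining-duration labels can be derived from a video's total length. The helper name and sampling convention below are assumptions, not the authors' code:

```python
def rsd_labels(num_frames, fps=1.0):
    """For each frame t of a video sampled at `fps` frames per second,
    the remaining surgery duration is the time from t to the last frame."""
    total = (num_frames - 1) / fps              # total duration in seconds
    return [total - t / fps for t in range(num_frames)]

labels = rsd_labels(5, fps=1.0)
# the first frame has the full duration remaining, the last frame has 0
```

A regression network such as RSDNet would then be trained to predict these values from the visual frames alone.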
In this paper, we propose an approach for multi-view 3D human pose estimation from RGB-D images and demonstrate the benefits of using the additional depth channel for pose refinement beyond its use for the generation of improved features.
On top of these architectures, we propose two different approaches to enforce the temporal constraints of the surgical workflow: (1) an HMM-based and (2) an LSTM-based pipeline.
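The excerpt gives no implementation details for either pipeline; as an illustration only, the HMM-style idea of smoothing per-frame phase scores under workflow constraints can be sketched as a Viterbi decode over a transition matrix that forbids backward phase jumps. All names and numbers below are hypothetical:

```python
import math

def viterbi(emission_logp, trans_logp, init_logp):
    """Most likely phase sequence given per-frame log-probabilities and a
    transition matrix encoding the surgical workflow's allowed orderings."""
    n_states = len(init_logp)
    # dp[s] = best log-probability of any path ending in state s
    dp = [init_logp[s] + emission_logp[0][s] for s in range(n_states)]
    back = []
    for t in range(1, len(emission_logp)):
        prev, dp, ptr = dp, [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: prev[p] + trans_logp[p][s])
            ptr.append(best_prev)
            dp.append(prev[best_prev] + trans_logp[best_prev][s]
                      + emission_logp[t][s])
        back.append(ptr)
    # backtrack from the best final state
    path = [max(range(n_states), key=lambda s: dp[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

NEG = -1e9  # stands in for log(0): a forbidden backward transition
# two phases that may only advance, never revert
trans = [[math.log(0.9), math.log(0.1)],
         [NEG,           0.0]]
emis = [[math.log(0.6), math.log(0.4)],
        [math.log(0.4), math.log(0.6)],   # noisy middle frame favors phase 1
        [math.log(0.7), math.log(0.3)]]
init = [0.0, NEG]
print(viterbi(emis, trans, init))  # → [0, 0, 0]: the noisy frame is smoothed
```

The LSTM-based pipeline would instead learn such temporal dependencies directly from the per-frame features rather than from a hand-specified transition matrix.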
The tool presence detection challenge at M2CAI 2016 consists of identifying the presence/absence of seven surgical tools in the images of cholecystectomy videos.
Proposed methods for the operating room (OR) rely either on foreground estimation using a multi-camera system, which is a challenge in real ORs due to color similarities and frequent illumination changes, or on wearable sensors or markers, which are invasive and therefore difficult to introduce in the room.
In the literature, two types of features are typically used to perform this task: visual features and tool usage signals.
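The simplest way to use both feature types together, when tool usage signals are available, is early fusion by concatenation; this sketch is only illustrative of that idea, and the function name and shapes are assumptions:

```python
def fuse(visual_feat, tool_signals):
    """Concatenate a per-frame visual feature vector with binary
    tool-usage signals into one input vector for a phase classifier."""
    return list(visual_feat) + [float(b) for b in tool_signals]

x = fuse([0.2, 0.7], [1, 0, 0, 1])
# x has len(visual_feat) + len(tool_signals) entries
```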
Ranked #4 on Surgical tool detection on Cholec80