no code implementations • 1 Dec 2024 • Diandian Guo, Cong Cao, Fangfang Yuan, Yanbing Liu, Guangjie Zeng, Xiaoyan Yu, Hao Peng, Philip S. Yu
Experimental results demonstrate the superiority of MICL on benchmark datasets, along with the analyses showcasing MICL's advancement in mitigating the effect of spurious correlation.
no code implementations • 2 Nov 2024 • Diandian Guo, Cong Cao, Fangfang Yuan, Dakui Wang, Wei Ma, Yanbing Liu, Jianhui Fu
The experiments show that our approach outperforms existing methods on popular datasets, providing preliminary evidence for the analogical reasoning capability of MLLM.
1 code implementation • 20 Aug 2024 • Diandian Guo, Weixin Si, Zhixi Li, Jialun Pei, Pheng-Ann Heng
Pringle maneuver (PM) in laparoscopic liver resection aims to reduce blood loss and provide a clear surgical view by intermittently blocking blood inflow of the liver, whereas prolonged PM may cause ischemic injury.
no code implementations • 14 Apr 2024 • Diandian Guo, Manxi Lin, Jialun Pei, He Tang, Yueming Jin, Pheng-Ann Heng
A comprehensive understanding of surgical scenes allows for monitoring of the surgical process, reducing the occurrence of accidents and enhancing efficiency for medical professionals.
1 code implementation • 22 Feb 2024 • Jialun Pei, Diandian Guo, Jingyang Zhang, Manxi Lin, Yueming Jin, Pheng-Ann Heng
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S^2Former-OR, aimed to complementally leverage multi-view 2D scenes and 3D point clouds for SGG in an end-to-end manner.
1 code implementation • CVPR 2024 • Diandian Guo, Deng-Ping Fan, Tongyu Lu, Christos Sakaridis, Luc van Gool
The estimation of implicit cross-frame correspondences and the high computational cost have long been major challenges in video semantic segmentation (VSS) for driving scenes.
no code implementations • 23 Jun 2023 • George Eskandar, Shuai Zhang, Mohamed Abdelsamad, Mark Youssef, Diandian Guo, Bin Yang
Data efficiency, or the ability to generalize from a few labeled data, remains a major challenge in deep learning.
1 code implementation • 16 May 2023 • George Eskandar, Diandian Guo, Karim Guirguis, Bin Yang
Second, in contrast to previous works which employ one discriminator that overfits the target domain semantic distribution, we employ a discriminator for the whole image and multiscale discriminators on the image patches.