no code implementations • 23 Nov 2024 • Ming Hu, Kun Yuan, Yaling Shen, Feilong Tang, Xiaohao Xu, Lin Zhou, Wei Li, Ying Chen, Zhongxing Xu, Zelin Peng, Siyuan Yan, Vinkle Srivastav, Diping Song, Tianbin Li, Danli Shi, Jin Ye, Nicolas Padoy, Nassir Navab, Junjun He, ZongYuan Ge
Surgical practice involves complex visual interpretation, procedural skills, and advanced medical knowledge, making surgical vision-language pretraining (VLP) particularly challenging due to this complexity and the limited availability of annotated data.
1 code implementation • 19 Oct 2024 • Siyuan Yan, Zhen Yu, Clare Primiero, Cristina Vico-Alonso, Zhonghua Wang, Litao Yang, Philipp Tschandl, Ming Hu, Gin Tan, Vincent Tang, Aik Beng Ng, David Powell, Paul Bonnington, Simon See, Monika Janda, Victoria Mar, Harald Kittler, H. Peter Soyer, ZongYuan Ge
Here, we introduce PanDerm, a multimodal dermatology foundation model pretrained through self-supervised learning on a dataset of over 2 million real-world images of skin diseases, sourced from 11 clinical institutions across 4 imaging modalities.
1 code implementation • 2 Oct 2024 • Lie Ju, Siyuan Yan, Yukun Zhou, Yang Nan, Xiaodan Xing, Peibo Duan, ZongYuan Ge
We hope this codebase serves as a comprehensive and reproducible benchmark, encouraging further advancements in long-tailed medical image learning.
1 code implementation • 11 Jun 2024 • Ming Hu, Peng Xia, Lin Wang, Siyuan Yan, Feilong Tang, Zhongxing Xu, Yimin Luo, Kaimin Song, Jurgen Leitner, Xuelian Cheng, Jun Cheng, Chi Liu, Kaijing Zhou, ZongYuan Ge
With approximately 285 hours of surgical videos, OphNet is about 20 times larger than the largest existing surgical workflow analysis benchmark.
no code implementations • 18 May 2024 • Ming Hu, Siyuan Yan, Peng Xia, Feilong Tang, Wenxue Li, Peibo Duan, Lin Zhang, ZongYuan Ge
In this paper, we propose a test-time image adaptation method to enhance the accuracy of the model on test data by simultaneously updating and predicting test images.
no code implementations • 4 May 2024 • Siyuan Yan, Cheng Luo, Zhen Yu, ZongYuan Ge
To address this, we propose a plug-and-play feature synthesis method called LDFS (Language-Guided Diverse Feature Synthesis) to synthesize new domain features and improve existing CLIP fine-tuning strategies.
2 code implementations • 5 Jan 2024 • Siyuan Yan, Chi Liu, Zhen Yu, Lie Ju, Dwarikanath Mahapatra, Brigid Betz-Stablein, Victoria Mar, Monika Janda, Peter Soyer, ZongYuan Ge
To address these challenges, we propose a novel DG framework for medical image classification without relying on domain labels, called Prompt-driven Latent Domain Generalization (PLDG).
1 code implementation • 4 Apr 2023 • Siyuan Yan, Chi Liu, Zhen Yu, Lie Ju, Dwarikanath Mahapatra, Victoria Mar, Monika Janda, Peter Soyer, ZongYuan Ge
Concretely, EPVT leverages a set of domain prompts, each of which acts as a domain expert to capture domain-specific knowledge, and a shared prompt for general knowledge over the entire dataset.
1 code implementation • 2 Mar 2023 • Siyuan Yan, Jing Zhang, Nick Barnes
To effectively model the two types of uncertainty, we introduce a Bayesian generative model to simultaneously estimate the posterior distribution of model parameters and its predictions.
no code implementations • CVPR 2023 • Siyuan Yan, Zhen Yu, Xuelin Zhang, Dwarikanath Mahapatra, Shekhar S. Chandra, Monika Janda, Peter Soyer, ZongYuan Ge
We introduce a human-in-the-loop framework into the model training process so that users can observe and correct the model's decision logic when confounding behaviors occur.
no code implementations • 12 Aug 2019 • Dawei Li, Yan Cao, Guoliang Shi, Xin Cai, Yang Chen, Sifan Wang, Siyuan Yan
The proposed method can also facilitate automatic trait estimation for each individual leaf (such as leaf area, length, and width), and has the potential to become a highly effective tool for plant research and agricultural engineering.
no code implementations • 2 Jul 2019 • Dawei Li, Siyuan Yan, Xin Cai, Yan Cao, Sifan Wang
In this paper, we present an integrated filter which comprises a weighted local guided image filter and a weighted spatiotemporal tree filter.