no code implementations • 20 May 2025 • Wenhui Zhu, Xuanzhao Dong, Xin Li, Peijie Qiu, Xiwen Chen, Abolfazl Razi, Aris Sotiras, Yi Su, Yalin Wang
Recently, reinforcement learning (RL)-based tuning has shifted the trajectory of Multimodal Large Language Models (MLLMs), particularly following the introduction of Group Relative Policy Optimization (GRPO).
1 code implementation • 14 May 2025 • Xiwen Chen, Wenhui Zhu, Peijie Qiu, Xuanzhao Dong, Hao Wang, Haiyu Wu, Huayu Li, Aristeidis Sotiras, Yalin Wang, Abolfazl Razi
Our method integrates seamlessly with both GRPO and its variant DR. GRPO, resulting in DRA-GRPO and DGA-DR. GRPO.
no code implementations • 9 May 2025 • Xiwen Chen, Wenhui Zhu, Peijie Qiu, Hao Wang, Huayu Li, Zihan Li, Yalin Wang, Aristeidis Sotiras, Abolfazl Razi
However, there is broad consensus that time series data often suffer from domain shifts between training and test sets, which dramatically degrades classification performance.
1 code implementation • 30 Apr 2025 • Xuanzhao Dong, Wenhui Zhu, Hao Wang, Xiwen Chen, Peijie Qiu, Rui Yin, Yi Su, Yalin Wang
Medical question answering (QA) is a reasoning-intensive task that remains challenging for large language models (LLMs) due to hallucinations and outdated domain knowledge.
1 code implementation • 21 Apr 2025 • Wenhui Zhu, Peijie Qiu, Xiwen Chen, Zhangsihao Yang, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang
Due to the gigapixel resolution of WSI, applications of MIL in WSI typically necessitate a two-stage training scheme: first, extract features from the pre-trained backbone and then perform MIL aggregation.
no code implementations • 1 Apr 2025 • Yujian Xiong, Xuanzhao Dong, Sebastian Waz, Wenhui Zhu, Negar Mallak, Zhong-Lin Lu, Yalin Wang
Ultra-high-field (7 Tesla) BOLD fMRI offers exceptional detail in both spatial and temporal domains, along with robust signal-to-noise characteristics, making it a powerful modality for studying visual information processing in the brain.
1 code implementation • 11 Mar 2025 • Xiwen Chen, Wenhui Zhu, Peijie Qiu, Hao Wang, Huayu Li, Haiyu Wu, Aristeidis Sotiras, Yalin Wang, Abolfazl Razi
Vision-language models (VLMs) such as CLIP demonstrate strong performance but struggle when adapted to downstream tasks.
no code implementations • 20 Feb 2025 • Wenhui Zhu, Xuanzhao Dong, Xin Li, Yujian Xiong, Xiwen Chen, Peijie Qiu, Vamsi Krishna Vasa, Zhangsihao Yang, Yi Su, Oana Dumitrascu, Yalin Wang
To this end, we propose a novel comprehensive benchmark, EyeBench, to provide insights that align enhancement models with clinical needs, offering a foundation for future work to improve the clinical relevance and applicability of generative models for fundus image enhancement.
no code implementations • 6 Jan 2025 • Xiwen Chen, Peijie Qiu, Wenhui Zhu, Huayu Li, Hao Wang, Aristeidis Sotiras, Yalin Wang, Abolfazl Razi
Since its introduction, the transformer has shifted the development trajectory away from traditional models (e.g., RNN, MLP) in time series forecasting, which is attributed to its ability to capture global dependencies within temporal tokens.
no code implementations • 29 Dec 2024 • Peijie Qiu, Wenhui Zhu, Sayantan Kumar, Xiwen Chen, Xiaotong Sun, Jin Yang, Abolfazl Razi, Yalin Wang, Aristeidis Sotiras
Previous attempts at multimodal VAEs approach this mainly through the lens of experts, aggregating unimodal inference distributions with a product of experts (PoE), a mixture of experts (MoE), or a combination of both.
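As a rough illustration of the two aggregation schemes named above, the following sketch fuses one-dimensional Gaussian unimodal posteriors. It is a minimal sketch under stated assumptions, not the method of this paper: the function names, the restriction to scalar Gaussians, and the uniform mixture weights are all hypothetical choices made for illustration.

```python
# Hypothetical sketch: fusing 1-D Gaussian "experts" (one per modality).
# Names and the scalar-Gaussian setting are illustrative assumptions.

def product_of_experts(means, variances):
    """Product of Gaussians: precisions add, means are precision-weighted."""
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(m * p for m, p in zip(means, precisions)) / total_precision
    return mean, 1.0 / total_precision

def mixture_of_experts_moments(means, variances):
    """Mean and variance of a uniform mixture of Gaussians.

    The mixture itself is not Gaussian; only its first two moments
    are returned here.
    """
    k = len(means)
    mean = sum(means) / k
    second_moment = sum(v + m * m for m, v in zip(means, variances)) / k
    return mean, second_moment - mean * mean

# Two modalities that disagree: PoE sharpens, MoE spreads out.
poe_mean, poe_var = product_of_experts([0.0, 2.0], [1.0, 1.0])
moe_mean, moe_var = mixture_of_experts_moments([0.0, 2.0], [1.0, 1.0])
```

With two unit-variance experts centered at 0 and 2, both schemes agree on the mean, but the product yields a narrower distribution than either expert while the mixture yields a wider one.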
1 code implementation • 23 Dec 2024 • Yue Deng, Yan Yu, Weiyu Ma, ZiRui Wang, Wenhui Zhu, Jian Zhao, Yin Zhang
SMAC-HARD supports customizable opponent strategies, randomization of adversarial policies, and interfaces for MARL self-play, enabling agents to generalize to varying opponent behaviors and improve model stability.
no code implementations • 18 Dec 2024 • Wenchao Xu, Jinyu Chen, Peirong Zheng, Xiaoquan Yi, Tianyi Tian, Wenhui Zhu, Quan Wan, Haozhao Wang, Yunfeng Fan, Qinliang Su, Xuemin Shen
Foundation model (FM) powered agent services are regarded as a promising solution to develop intelligent and personalized applications for advancing toward Artificial General Intelligence (AGI).
1 code implementation • 3 Dec 2024 • Hao Wang, Wenhui Zhu, Xuanzhao Dong, Yanxi Chen, Xin Li, Peijie Qiu, Xiwen Chen, Vamsi Krishna Vasa, Yujian Xiong, Oana M. Dumitrascu, Abolfazl Razi, Yalin Wang
In this work, we propose Many-MobileNet, an efficient model fusion strategy for retinal disease classification using a lightweight CNN architecture.
1 code implementation • 3 Nov 2024 • Xuanzhao Dong, Wenhui Zhu, Xin Li, Guoxin Sun, Yi Su, Oana M. Dumitrascu, Yalin Wang
Retinal fundus photography enhancement is important for diagnosing and monitoring retinal diseases.
1 code implementation • 19 Oct 2024 • Xin Li, Wenhui Zhu, Xuanzhao Dong, Oana M. Dumitrascu, Yalin Wang
The rise of Vision Transformer (ViT) has effectively compensated for this deficiency of CNNs and promoted the application of ViT-based U-networks in medical image segmentation.
1 code implementation • 13 Oct 2024 • Vamsi Krishna Vasa, Wenhui Zhu, Xiwen Chen, Peijie Qiu, Xuanzhao Dong, Yalin Wang
In particular, deep neural networks based on a U-shaped architecture (UNet) with skip connections have been adopted for several medical imaging tasks, including organ segmentation.
1 code implementation • 17 Sep 2024 • Xuanzhao Dong, Vamsi Krishna Vasa, Wenhui Zhu, Peijie Qiu, Xiwen Chen, Yi Su, Yujian Xiong, Zhangsihao Yang, Yanxi Chen, Yalin Wang
In this work, we leverage the SB framework to propose an image-to-image translation pipeline for retinal image enhancement.
no code implementations • 12 Sep 2024 • Vamsi Krishna Vasa, Peijie Qiu, Wenhui Zhu, Yujian Xiong, Oana Dumitrascu, Yalin Wang
Retinal fundus photography offers a non-invasive way to diagnose and monitor a variety of retinal diseases, but is prone to inherent quality glitches arising from systemic imperfections or operator/patient-related factors.
1 code implementation • 2 Sep 2024 • Zhangsihao Yang, Mengyi Shan, Mohammad Farazi, Wenhui Zhu, Yanxi Chen, Xuanzhao Dong, Yalin Wang
The human video generation task has gained significant attention with the advancement of deep generative models.
1 code implementation • 17 Jul 2024 • Hao Wang, Wenhui Zhu, Jiayou Qin, Xin Li, Oana Dumitrascu, Xiwen Chen, Peijie Qiu, Abolfazl Razi
Retinal image analysis, particularly detecting the geometrical features of branching points, plays an essential role in diagnosing eye diseases.
1 code implementation • 4 Jul 2024 • Wenhui Zhu, Xiwen Chen, Peijie Qiu, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang
Second, we propose two mechanisms to enforce the diversity among the global vectors to be more descriptive of the entire bag: (i) positive instance alignment and (ii) a novel, efficient, and theoretically guaranteed diversification learning paradigm.
1 code implementation • 21 Jun 2024 • Wenhui Zhu, Xiwen Chen, Peijie Qiu, Mohammad Farazi, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang
Although numerous follow-up studies have also been dedicated to improving the performance of standard UNet, few have conducted in-depth analyses of the underlying interest pattern of UNet in medical image segmentation.
Ranked #17 on Medical Image Segmentation on Synapse multi-organ CT
3 code implementations • 6 May 2024 • Xiwen Chen, Peijie Qiu, Wenhui Zhu, Huayu Li, Hao Wang, Aristeidis Sotiras, Yalin Wang, Abolfazl Razi
Deep neural networks, including transformers and convolutional neural networks, have significantly improved multivariate time series classification (MTSC).
no code implementations • 5 May 2024 • Xiwen Chen, Wenhui Zhu, Peijie Qiu, Abolfazl Razi
We theoretically demonstrate the convergence of the MA framework, which has a complexity similar to that of reconstruction under known forward model parameters.
no code implementations • 7 Apr 2024 • Yujian Xiong, Wenhui Zhu, Zhong-Lin Lu, Yalin Wang
The reconstruction of human visual inputs from brain activity, particularly through functional Magnetic Resonance Imaging (fMRI), holds promising avenues for unraveling the mechanisms of the human visual system.
1 code implementation • 6 Mar 2024 • Hao Wang, Sayed Pedram Haeri Boroujeni, Xiwen Chen, Ashish Bastola, Huayu Li, Wenhui Zhu, Abolfazl Razi
Specifically, the fusion of Perlin noise in this work significantly improved the quality of synthesized images.
1 code implementation • 31 Oct 2023 • Peijie Qiu, Pan Xiao, Wenhui Zhu, Yalin Wang, Aristeidis Sotiras
Typical MIL methods include a feature embedding part, which embeds the instances into features via a pre-trained feature extractor, and an MIL aggregator that combines instance embeddings into predictions.
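The two-part structure described above (instance embedding followed by bag-level aggregation) can be sketched as follows. This is a minimal sketch, not the implementation from this paper: `embed` is a hypothetical stand-in for a frozen pre-trained feature extractor, and mean pooling with a logistic scorer is just one simple aggregator chosen for illustration.

```python
import math

# Hypothetical sketch of a two-part MIL pipeline. All names are
# illustrative assumptions, not taken from the listed paper.

def embed(instance):
    """Stand-in for a frozen pre-trained extractor: instance -> feature vector."""
    return [sum(instance) / len(instance), max(instance)]

def mean_pool(embeddings):
    """Permutation-invariant aggregation: average each feature over the bag."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]

def predict_bag(bag, weights, bias):
    """Embed every instance, pool to one bag vector, then score the bag."""
    embeddings = [embed(instance) for instance in bag]
    bag_vector = mean_pool(embeddings)
    logit = sum(w * f for w, f in zip(weights, bag_vector)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # probability the bag is positive

# A bag of two instances; only the bag-level label would be supervised.
probability = predict_bag([[0.0, 2.0], [2.0, 4.0]], weights=[1.0, -1.0], bias=1.0)
```

Because the extractor is fixed, only `weights` and `bias` would be learned in this sketch, which mirrors the split between a pre-trained embedding stage and a trainable aggregator.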
1 code implementation • 19 Aug 2023 • Wenhui Zhu, Peijie Qiu, Xiwen Chen, Oana M. Dumitrascu, Yalin Wang
Multiple instance learning (MIL) is a weakly supervised learning approach that seeks to assign binary class labels to collections of instances known as bags.
Multiple Instance Learning • Weakly Supervised Classification
2 code implementations • 2 Jun 2023 • Wenhui Zhu, Peijie Qiu, Xiwen Chen, Xin Li, Natasha Lepore, Oana M. Dumitrascu, Yalin Wang
Over the past few decades, convolutional neural networks (CNNs) have been at the forefront of the detection and tracking of various retinal diseases (RD).
no code implementations • 31 Mar 2023 • Jianfeng Wu, Yi Su, Yanxi Chen, Wenhui Zhu, Eric M. Reiman, Richard J. Caselli, Kewei Chen, Paul M. Thompson, Junwen Wang, Yalin Wang
Objective: To build a surface-based model to 1) detect differences between APOE subgroups in patterns of tau deposition and hippocampal atrophy, and 2) use the extracted surface-based features to predict cognitive decline.
no code implementations • 8 Feb 2023 • Mohammad Farazi, Zhangsihao Yang, Wenhui Zhu, Peijie Qiu, Yalin Wang
Our results show the superiority of our LBO-based convolution layer and adapted pooling over the conventionally used unitary cortical thickness, graph Laplacian, and point cloud representation.
3 code implementations • 6 Feb 2023 • Wenhui Zhu, Peijie Qiu, Oana M. Dumitrascu, Jacob M. Sobczak, Mohammad Farazi, Zhangsihao Yang, Keshav Nandakumar, Yalin Wang
Non-mydriatic retinal color fundus photography (CFP) is widely available because it does not require pupillary dilation, but it is prone to poor quality due to operators, systemic imperfections, or patient-related causes.
1 code implementation • 6 Feb 2023 • Wenhui Zhu, Peijie Qiu, Mohammad Farazi, Keshav Nandakumar, Oana M. Dumitrascu, Yalin Wang
In this paper, we propose a simple but effective end-to-end framework for enhancing poor-quality retinal fundus images.
no code implementations • 28 Oct 2022 • Jianfeng Wu, Yi Su, Wenhui Zhu, Negar Jalili Mallak, Natasha Lepore, Eric M. Reiman, Richard J. Caselli, Paul M. Thompson, Kewei Chen, Yalin Wang
Experimental results suggest that amyloid/tau measurements predicted with our PASCP-MP representations are closer to the real values than the measures derived from other approaches, such as hippocampal surface area, volume, and shape morphometry features based on spherical harmonics (SPHARM).
no code implementations • 17 Oct 2022 • Mohammad Farazi, Wenhui Zhu, Zhangsihao Yang, Yalin Wang
This paper studies 3D dense shape correspondence, a key shape analysis application in computer vision and graphics.
no code implementations • 12 Oct 2022 • Wenhui Zhu, Peijie Qiu, Natasha Lepore, Oana M. Dumitrascu, Yalin Wang
Lesion appearance is a crucial clue for medical providers to distinguish referable diabetic retinopathy (rDR) from non-referable DR.
no code implementations • 20 Oct 2021 • Jianfeng Wu, Wenhui Zhu, Yi Su, Jie Gui, Natasha Lepore, Eric M. Reiman, Richard J. Caselli, Paul M. Thompson, Kewei Chen, Yalin Wang
We evaluate our framework on 925 subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI).