1 code implementation • 5 May 2025 • Xinjie Zhang, Jintao Guo, Shanshan Zhao, Minghao Fu, Lunhao Duan, Guo-Hua Wang, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang
Despite their respective successes, these two domains have evolved independently, leading to distinct architectural paradigms: While autoregressive-based architectures have dominated multimodal understanding, diffusion-based models have become the cornerstone of image generation.
1 code implementation • 24 Apr 2025 • Zihan Cheng, Jintao Guo, Jian Zhang, Lei Qi, Luping Zhou, Yinghuan Shi, Yang Gao
To the best of our knowledge, Mamba-Sea is the first work to explore the generalization of Mamba for medical image segmentation, providing an advanced and promising Mamba-based architecture with strong robustness to domain shifts.
1 code implementation • 16 Dec 2024 • Yayuan Li, Jintao Guo, Lei Qi, Wenbin Li, Yinghuan Shi
To address these issues, we build a mutual guidance mechanism that introduces an Image-Guided-Text (IGT) component to rectify the varying quality of text prompts through image representations, and a Text-Guided-Image (TGI) component to mitigate the anomalous match of the image modality through text representations.
1 code implementation • 21 Oct 2024 • Jintao Guo, Lei Qi, Yinghuan Shi, Yang Gao
Existing DG methods primarily rely on convolutional neural networks (CNNs), which inherently learn texture biases due to their limited receptive fields, making them prone to overfitting source domains.
1 code implementation • 18 Mar 2024 • Jintao Guo, Lei Qi, Yinghuan Shi, Yang Gao
In this paper, we study the impact of prior CNN-based augmentation methods on token-based models, revealing that their performance is suboptimal because they do not incentivize the model to learn holistic shape information.
no code implementations • 11 Jan 2024 • Na Wang, Lei Qi, Jintao Guo, Yinghuan Shi, Yang Gao
2) From the feature perspective, the simple Tail Interaction module implicitly enhances potential correlations among all samples from all source domains, helping the model acquire domain-invariant representations across multiple domains.
1 code implementation • ICCV 2023 • Jintao Guo, Lei Qi, Yinghuan Shi
Deep Neural Networks have exhibited considerable success in various visual tasks.
1 code implementation • CVPR 2023 • Jintao Guo, Na Wang, Lei Qi, Yinghuan Shi
However, the local operation of the convolution kernel makes the model focus too much on local representations (e.g., texture), which inherently makes the model more prone to overfitting to the source domains and hampers its generalization ability.
1 code implementation • 7 Dec 2021 • Jintao Guo, Lei Qi, Yinghuan Shi, Yang Gao
Particularly, the proposed method can generate a variety of data variants to better deal with the overfitting issue.