1 code implementation • 4 Aug 2024 • Fushuo Huo, Wenchao Xu, Zhong Zhang, Haozhao Wang, Zhicheng Chen, Peilin Zhao
While Large Vision-Language Models (LVLMs) have rapidly advanced in recent years, the prevalent issue known as the "hallucination" problem has emerged as a significant bottleneck, hindering their real-world deployment.
no code implementations • CVPR 2024 • Fushuo Huo, Wenchao Xu, Jingcai Guo, Haozhao Wang, Song Guo
We empirically reveal that the modality gap, i.e., modality imbalance and soft label misalignment, renders traditional KD ineffective in CMKD.
1 code implementation • 31 Dec 2023 • Yunfeng Fan, Wenchao Xu, Haozhao Wang, Fushuo Huo, Jinyu Chen, Song Guo
On the other hand, we propose a modality selection scheme that selects subsets of local modalities with high diversity while simultaneously achieving global modal balance.
no code implementations • 2 May 2023 • Xiaocheng Lu, Ziming Liu, Song Guo, Jingcai Guo, Fushuo Huo, Sikai Bai, Tao Han
Compositional Zero-shot Learning (CZSL) aims to recognize novel concepts composed of known knowledge without training samples.
no code implementations • 20 Mar 2023 • Fushuo Huo, Wenchao Xu, Jingcai Guo, Haozhao Wang, Yunfeng Fan, Song Guo
In this paper, we propose a novel Dual-prototype Self-augment and Refinement method (DSR) for the NO-CL problem, which consists of two strategies: 1) Dual class prototypes: vanilla and high-dimensional prototypes are exploited to utilize the pre-trained information and obtain robust quasi-orthogonal representations, rather than example buffers, for both privacy preservation and memory reduction.
no code implementations • CVPR 2023 • Ziming Liu, Song Guo, Xiaocheng Lu, Jingcai Guo, Jiewei Zhang, Yue Zeng, Fushuo Huo
Recent studies usually approach multi-label zero-shot learning (MLZSL) with visual-semantic mapping on spatial-class correlation, which can be computationally costly and, worse still, fails to capture fine-grained class-specific semantics.
no code implementations • 19 Nov 2022 • Fushuo Huo, Wenchao Xu, Song Guo, Jingcai Guo, Haozhao Wang, Ziming Liu, Xiaocheng Lu
Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel compositions of state and object primitives in images with no priors on the compositional space, which induces a tremendously large output space containing all possible state-object compositions.
1 code implementation • 5 Sep 2022 • Bingheng Li, Fushuo Huo
The reason for the range effect is that the predicted deviations in both wide and narrow ranges destroy the uniformity between MOS and pMOS.
no code implementations • 7 Mar 2022 • Ziming Liu, Song Guo, Jingcai Guo, Yuanyuan Xu, Fushuo Huo
We argue that disregarding the connection between major and minor classes, which correspond to the global and local information, respectively, is the cause of the problem.