no code implementations • 2 Jan 2024 • Xuelin Zhu, Jian Liu, Dongqi Tang, Jiawei Ge, Weijia Liu, Bo Liu, Jiuxin Cao
Identifying labels that did not appear during training, known as multi-label zero-shot learning, is a non-trivial task in computer vision.
no code implementations • 7 Dec 2023 • Xuelin Zhu, Jiuxin Cao, Jian Liu, Dongqi Tang, Furong Xu, Weijia Liu, Jiawei Ge, Bo Liu, Qingpei Guo, Tianyi Zhang
Pre-trained vision-language models have notably accelerated progress of open-world concept recognition.
no code implementations • 28 Nov 2023 • Jiawei Ge, Xiangmei Chen, Jiuxin Cao, Xuelin Zhu, Bo Liu
However, current VL trackers have not fully exploited the power of VL learning, as they suffer from limitations such as heavily relying on off-the-shelf backbones for feature extraction, ineffective VL fusion designs, and the absence of VL-related loss functions.
no code implementations • 7 Aug 2023 • Ya Jing, Xuelin Zhu, Xingbin Liu, Qie Sima, Taozheng Yang, Yunhai Feng, Tao Kong
However, the recipes of visual pre-training for robot manipulation tasks are yet to be built.
no code implementations • ICCV 2023 • Xuelin Zhu, Jian Liu, Weijia Liu, Jiawei Ge, Bo Liu, Jiuxin Cao
Multi-label image classification refers to assigning a set of labels for an image.
1 code implementation • ACMMM 2022 • Xuelin Zhu, Jiuxin Cao, Jiawei Ge, Weijia Liu, Bo Liu
Specifically, in each layer of TSFormer, a cross-modal attention module is developed to aggregate visual features from spatial stream into semantic stream and update label semantics via a residual connection.
no code implementations • 3 Jul 2020 • Feifei Huang, Jie Li, Xuelin Zhu
Deep convolution neural network has attracted many attentions in large-scale visual classification task, and achieves significant performance improvement compared to traditional visual analysis methods.