no code implementations • 29 Feb 2024 • Juexiao Feng, Yuhong Yang, Yanchun Xie, Yaqian Li, Yandong Guo, Yuchen Guo, Yuwei He, Liuyu Xiang, Guiguang Ding
In recent years, object detection in deep learning has experienced rapid development.
no code implementations • 2 Feb 2024 • Haoxiang Gao, Zhongruo Wang, Yaqian Li, Kaiwen Long, Ming Yang, Yiqing Shen
The advent of foundation models has revolutionized the fields of natural language processing and computer vision, paving the way for their application in autonomous driving (AD).
1 code implementation • 9 Nov 2023 • Jinjin Xu, Liwu Xu, Yuzhe Yang, Xiang Li, Fanyi Wang, Yanchun Xie, Yi-Jie Huang, Yaqian Li
Recent advancements in multi-modal large language models (MLLMs) have led to substantial improvements in visual understanding, primarily driven by sophisticated modality alignment strategies.
2 code implementations • 23 Oct 2023 • Xinyu Huang, Yi-Jie Huang, Youcai Zhang, Weiwei Tian, Rui Feng, Yuejie Zhang, Yanchun Xie, Yaqian Li, Lei Zhang
Specifically, for predefined commonly used tag categories, RAM++ showcases 10.2 mAP and 15.4 mAP enhancements over CLIP on OpenImages and ImageNet.
no code implementations • 29 Aug 2023 • Xuwei Tan, Yi-Jie Huang, Yaqian Li
Instead of "opening set", i.e., modeling OOD distribution, Prototype Fission "closes set" and makes it hard for OOD samples to fit in sub-class latent space.
no code implementations • 28 Jul 2023 • Liwu Xu, Jinjin Xu, Yuzhe Yang, YiJie Huang, Yanchun Xie, Yaqian Li
Specifically, we first integrate and leverage a multi-source unlabeled dataset to align rich features between a given visual encoder and an off-the-shelf CLIP image encoder via feature alignment loss.
no code implementations • journal 2023 • Wenming Zhang, Qikai Zhu, Yaqian Li, Haibin Li
First, in order to expand the receptive field of the feature map and effectively extract the regional features of the target object with shape distortion in the feature map, we propose the Malformed Attention Module (MAM).
2 code implementations • 6 Jun 2023 • Youcai Zhang, Xinyu Huang, Jinyu Ma, Zhaoyang Li, Zhaochuan Luo, Yanchun Xie, Yuzhuo Qin, Tong Luo, Yaqian Li, Shilong Liu, Yandong Guo, Lei Zhang
We are releasing the RAM at https://recognize-anything.github.io/ to foster the advancements of large models in computer vision.
1 code implementation • CVPR 2023 • Mengyao Lyu, Jundong Zhou, Hui Chen, YiJie Huang, Dongdong Yu, Yaqian Li, Yandong Guo, Yuchen Guo, Liuyu Xiang, Guiguang Ding
Active learning selects informative samples for annotation within a budget and has recently proven efficient for object detection.
1 code implementation • 15 Mar 2023 • Youcai Zhang, Yuzhuo Qin, Hengwei Liu, Yanhao Zhang, Yaqian Li, Xiaodong Gu
Knowledge distillation (KD) has been extensively studied in single-label image classification.
2 code implementations • 10 Mar 2023 • Xinyu Huang, Youcai Zhang, Jinyu Ma, Weiwei Tian, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Lei Zhang
This paper presents Tag2Text, a vision language pre-training (VLP) framework, which introduces image tagging into vision-language models to guide the learning of visual-linguistic features.
2 code implementations • journal 2023 • Zhaoqing Wang, Ziyu Chen, Yaqian Li, Yandong Guo, Jun Yu, Mingming Gong, Tongliang Liu
To address this problem, we propose a mosaic representation learning framework (MosRep), consisting of a new data augmentation strategy that enriches the backgrounds of each small crop and improves the quality of visual representations.
1 code implementation • 12 Jul 2022 • Xinyu Huang, Youcai Zhang, Ying Cheng, Weiwei Tian, RuiWei Zhao, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Xiaobo Zhang
However, image-text pairs that co-occur on the Internet typically lack explicit alignment information, which is suboptimal for VLP.
no code implementations • 24 Jun 2022 • Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo
Mixed Sample Regularization (MSR), such as MixUp or CutMix, is a powerful data augmentation strategy to generalize convolutional neural networks.
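For readers unfamiliar with the family of methods named above, here is a minimal sketch of standard MixUp only, not the method proposed in this paper. It blends two samples and their one-hot labels with a weight drawn from a Beta distribution. The function name `mixup`, the `alpha` default, and the toy inputs are illustrative assumptions.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=1.0, rng=random):
    """Blend two samples and their one-hot labels (standard MixUp).

    x_a, x_b: flat lists of feature values of equal length.
    y_a, y_b: one-hot (or soft) label vectors of equal length.
    alpha:    Beta(alpha, alpha) concentration; an assumed default.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    # The same weight is applied to the inputs and to the labels.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


# Toy usage with a seeded generator so runs are repeatable.
rng = random.Random(0)
x, y, lam = mixup([1.0, 0.0, 2.0], [1.0, 0.0],
                  [0.0, 4.0, 2.0], [0.0, 1.0],
                  alpha=0.4, rng=rng)
```

CutMix differs in that it pastes a rectangular patch from one image into the other and sets the label weight from the patch area, rather than blending every pixel.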
no code implementations • CVPR 2022 • Yuzhe Yang, Liwu Xu, Leida Li, Nan Qie, Yaqian Li, Peng Zhang, Yandong Guo
To solve the dilemma, we conduct the most comprehensive subjective study of personalized image aesthetics so far and introduce a new Personalized image Aesthetics database with Rich Attributes (PARA), which consists of 31,220 images with annotations by 438 subjects.
1 code implementation • CVPR 2022 • Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo
Afterwards, the former half mini-batch distills on-the-fly soft targets generated in the previous iteration.
2 code implementations • 13 Dec 2021 • Youcai Zhang, Yuhao Cheng, Xinyu Huang, Fei Wen, Rui Feng, Yaqian Li, Yandong Guo
Multi-label learning in the presence of missing labels (MLML) is a challenging problem.
no code implementations • 29 Sep 2021 • Haizhou Shi, Youcai Zhang, Zijin Shen, Siliang Tang, Yaqian Li, Yandong Guo, Yueting Zhuang
This paper investigates the feasibility of federated representation learning under the constraints of communication cost and privacy protection.
no code implementations • 30 Jul 2021 • Haizhou Shi, Youcai Zhang, Siliang Tang, Wenjie Zhu, Yaqian Li, Yandong Guo, Yueting Zhuang
It is a consensus that small models perform quite poorly under the paradigm of self-supervised contrastive learning.