no code implementations • COLING 2022 • Bo Liu, Wandi Xu, Yuejia Xiang, XiaoJun Wu, Lejian He, BoWen Zhang, Li Zhu
However, we find that noise learning in text classification is relatively underdeveloped: (1) many methods that have proven effective in the image domain have not been explored in text classification, and (2) it is difficult to compare previous studies fairly because they run experiments under different noise settings.
no code implementations • 26 Jan 2024 • XiaoJun Wu, Dixiang Zhang, Ruyi Gan, Junyu Lu, Ziwei Wu, Renliang Sun, Jiaxing Zhang, Pingjian Zhang, Yan Song
Recent advancements in text-to-image models have significantly enhanced image generation capabilities, yet a notable gap persists in open-source models offering bilingual or Chinese language support.
no code implementations • 30 Dec 2023 • Zeyang Zhang, Hui Li, Tianyang Xu, XiaoJun Wu, Josef Kittler
We focus on the infrared-visible image registration and fusion (IVRF) task.
no code implementations • 8 Dec 2023 • Junyu Lu, Ruyi Gan, Dixiang Zhang, XiaoJun Wu, Ziwei Wu, Renliang Sun, Jiaxing Zhang, Pingjian Zhang, Yan Song
During the instruction fine-tuning stage, we introduce semantic-aware visual feature extraction, a crucial method that enables the model to extract informative features from concrete visual objects.
Ranked #1 on Image Captioning on nocaps entire
no code implementations • 7 Dec 2023 • Ruyi Gan, XiaoJun Wu, Junyu Lu, Yuanhe Tian, Dixiang Zhang, Ziwei Wu, Renliang Sun, Chang Liu, Jiaxing Zhang, Pingjian Zhang, Yan Song
However, there are few specialized models in certain domains, such as interior design, which is attributed to the complex textual descriptions and detailed visual elements inherent in design, alongside the necessity for adaptable resolution.
no code implementations • 1 Dec 2023 • Yuxin Li, Qiang Han, Mengying Yu, Yuxin Jiang, Chaikiat Yeo, Yiheng Li, Zihang Huang, Nini Liu, Hsuanhan Chen, XiaoJun Wu
3D object detection in Bird's-Eye-View (BEV) space has recently emerged as a prevalent approach in the field of autonomous driving.
no code implementations • 6 Nov 2023 • Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, XiaoJun Wu, Dixiang Zhang, Kunhao Pan, Ping Yang, Qi Yang, Jiaxing Zhang, Yan Song
Although many such issues are addressed in ongoing research on LLMs, an important practical limitation is that many studies pursue ever-larger model sizes without comprehensively analyzing and optimizing the use of pre-training data, or the appropriate organization and leveraging of such data when training LLMs under cost-effective settings.
no code implementations • 12 Oct 2023 • Junyu Lu, Dixiang Zhang, XiaoJun Wu, Xinyu Gao, Ruyi Gan, Jiaxing Zhang, Yan Song, Pingjian Zhang
Recent advancements have expanded the capabilities of large language models (LLMs) in zero-shot image-to-text generation and understanding by integrating multi-modal inputs.
no code implementations • 15 Sep 2023 • Jianghu Shen, XiaoJun Wu
In the field of remote sensing, we often utilize oriented bounding boxes (OBB) to bound the objects.
1 code implementation • 18 May 2023 • Ziheng Chen, Yue Song, Gaowen Liu, Ramana Rao Kompella, XiaoJun Wu, Nicu Sebe
Besides, our framework offers a novel intrinsic explanation for the most popular LogEig classifier in existing SPD networks.
no code implementations • 16 Feb 2023 • Wenjie Zhang, Xiaoning Song, ZhenHua Feng, Tianyang Xu, XiaoJun Wu
Specifically, associating natural language words that fill the masked token with semantic relation labels (e.g., "org:founded_by") is difficult.
no code implementations • 5 Nov 2022 • Zhe Liu, Yun Li, Lina Yao, Xiaojun Chang, Wei Fang, XiaoJun Wu, Yi Yang
We design Semantic Attention (SA) and generative Knowledge Disentanglement (KD) to learn the dependence of feasibility and contextuality, respectively.
1 code implementation • 7 Sep 2022 • Jiaxing Zhang, Ruyi Gan, Junjie Wang, Yuxiang Zhang, Lin Zhang, Ping Yang, Xinyu Gao, Ziwei Wu, Xiaoqun Dong, Junqing He, Jianheng Zhuo, Qi Yang, Yongfeng Huang, Xiayu Li, Yanghan Wu, Junyu Lu, Xinyu Zhu, Weifeng Chen, Ting Han, Kunhao Pan, Rui Wang, Hao Wang, XiaoJun Wu, Zhongshen Zeng, Chongpei Chen
We hope that this project will be the foundation of Chinese cognitive intelligence.
no code implementations • 29 Jul 2021 • Yu Fu, Tianyang Xu, XiaoJun Wu, Josef Kittler
In this paper, we propose a Patch Pyramid Transformer (PPT) to effectively address the above issues. Specifically, we first design a Patch Transformer to transform the image into a sequence of patches, where transformer encoding is performed for each patch to extract local representations.
no code implementations • journal 2019 • Lingteng Qiu, XiaoJun Wu, Zhiyang Yu
Our method is composed of a segmentation stage (stage 1), a detection stage (stage 2), and a matting stage (stage 3).