no code implementations • 18 Mar 2025 • Zongyun Zhang, Jiacheng Ruan, Xian Gao, Ting Liu, Yuzhuo Fu
Additionally, we contribute the first multi-modal industrial anomaly detection training dataset, named Defect Detection Question Answering (DDQA), encompassing a wide range of defect types and industrial scenarios.
no code implementations • 11 Mar 2025 • Xian Gao, Jiacheng Ruan, Jingsheng Gao, Ting Liu, Yuzhuo Fu
In this paper, we address this challenge by proposing ReviewAgents, a framework that leverages large language models (LLMs) to generate academic paper reviews.
1 code implementation • 10 Mar 2025 • Jiacheng Ruan, Wenzhen Yuan, Xian Gao, Ye Guo, Daoxin Zhang, Zhe Xu, Yao Hu, Ting Liu, Yuzhuo Fu
Specifically, process RMs evaluate each reasoning step, outcome RMs focus on the assessment of reasoning results, and critique RMs perform error analysis on the entire reasoning process, followed by corrections.
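The three reward-model (RM) roles above can be sketched with toy stand-ins; the scoring rules below are illustrative assumptions, not the paper's actual trained models:

```python
# Hypothetical sketch of the three reward-model roles: process RMs score each
# step, outcome RMs score only the result, critique RMs flag steps to correct.
from dataclasses import dataclass

@dataclass
class ReasoningTrace:
    steps: list   # intermediate reasoning steps
    answer: str   # final result

def process_rm(trace):
    """Process RM: evaluate each reasoning step individually (toy rule)."""
    return [1.0 if "because" in s else 0.5 for s in trace.steps]

def outcome_rm(trace, gold):
    """Outcome RM: assess only the final reasoning result."""
    return 1.0 if trace.answer == gold else 0.0

def critique_rm(trace):
    """Critique RM: error analysis over the whole trace, returning
    indices of steps that need correction (toy rule)."""
    return [i for i, s in enumerate(trace.steps) if "guess" in s]

trace = ReasoningTrace(steps=["x = 2 because 1 + 1", "so the answer is 4 (guess)"],
                       answer="4")
print(process_rm(trace))        # [1.0, 0.5]
print(outcome_rm(trace, "4"))   # 1.0
print(critique_rm(trace))       # [1]
```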
no code implementations • 9 Mar 2025 • Xian Gao, Jiacheng Ruan, Jingsheng Gao, Mingye Xie, Zongyun Zhang, Ting Liu, Yuzhuo Fu
Analyzing student behavior in educational scenarios is crucial for enhancing teaching quality and student engagement.
1 code implementation • 16 Oct 2024 • Jiacheng Ruan, Yebin Yang, Zehao Lin, Yuchen Feng, Feiyu Xiong, Zeyun Tang, Zhiyu Li
Based on this, we introduce the Flow Text with Image Insertion Benchmark (FTII-Bench), which includes 318 high-quality Chinese image-text news articles and 307 high-quality English image-text news articles, covering 10 different news domains.
1 code implementation • 13 Oct 2024 • Jiacheng Ruan, Xian Gao, Suncheng Xiang, Mingye Xie, Ting Liu, Yuzhuo Fu
Parameter-efficient tuning (PET) techniques calibrate the model's predictions on downstream tasks by freezing the pre-trained model and introducing a small number of learnable parameters.
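A minimal numerical sketch of the PET idea, assuming (purely for illustration) a low-rank pair A, B as the small learnable component; the frozen weight W stands in for the pre-trained model:

```python
# PET sketch: the pre-trained weight W is frozen; only the small
# low-rank correction (A, B) would be updated on the downstream task.
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                              # hidden size, small rank r << d
W = rng.standard_normal((d, d))          # pre-trained weight: frozen
A = np.zeros((d, r))                     # learnable, zero-initialized
B = rng.standard_normal((r, d)) * 0.01   # learnable

def forward(x):
    # frozen path plus the small tunable correction
    return x @ W + x @ A @ B

x = rng.standard_normal((1, d))
print(forward(x).shape)                  # (1, 8)
print(A.size + B.size)                   # 32 trainable params vs 64 frozen
```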
1 code implementation • 24 Sep 2024 • Jiacheng Ruan, Wenzhen Yuan, Zehao Lin, Ning Liao, Zhiyu Li, Feiyu Xiong, Ting Liu, Yuzhuo Fu
CamObj-Instruct is collected for fine-tuning the LVLMs with improved instruction-following capabilities, and it includes 11,363 images and 68,849 conversations with diverse instructions.
2 code implementations • 24 Jun 2024 • Tong Zhu, Xiaoye Qu, Daize Dong, Jiacheng Ruan, Jingqi Tong, Conghui He, Yu Cheng
Motivated by this limitation, we investigate building MoE models from existing dense large language models.
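One way such a dense-to-MoE conversion can work (an assumption for illustration, not necessarily the paper's exact construction) is to partition a dense FFN's hidden neurons into expert subsets and add a router:

```python
# Sketch: split a dense FFN's hidden neurons into 4 experts, then route
# each token to one expert. The gate would be trained after conversion.
import numpy as np

rng = np.random.default_rng(3)
d, ffn, n_experts = 8, 16, 4
W1 = rng.standard_normal((d, ffn))       # dense FFN up-projection
W2 = rng.standard_normal((ffn, d))       # dense FFN down-projection

# partition hidden neurons evenly across experts (strided split)
experts = [(W1[:, i::n_experts], W2[i::n_experts, :]) for i in range(n_experts)]
gate = rng.standard_normal((d, n_experts))   # router weights

def moe_forward(x):
    idx = int(np.argmax(x @ gate))           # top-1 routing for simplicity
    w1, w2 = experts[idx]
    return np.maximum(x @ w1, 0.0) @ w2      # ReLU FFN of the chosen expert

x = rng.standard_normal(d)
print(moe_forward(x).shape)                  # (8,)
```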
1 code implementation • 17 Jun 2024 • Tong Zhu, Daize Dong, Xiaoye Qu, Jiacheng Ruan, Wenliang Chen, Yu Cheng
Mixture-of-Experts (MoE) models have shown remarkable capability in instruction tuning, especially when the number of tasks scales.
1 code implementation • 23 Mar 2024 • Jiacheng Ruan, Jingsheng Gao, Mingye Xie, Daize Dong, Suncheng Xiang, Ting Liu, Yuzhuo Fu
The Adapter-Tuning (AT) method freezes a pre-trained model and introduces trainable adapter modules to acquire downstream knowledge, thereby calibrating the model for better adaptation to downstream tasks.
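A hypothetical adapter module in the common bottleneck form (down-project, nonlinearity, up-project, residual); only the two small matrices would be trained, and the zero-initialized up-projection makes the adapter an identity at the start:

```python
# Adapter sketch: a small trainable bottleneck with a residual connection,
# inserted into an otherwise frozen network.
import numpy as np

rng = np.random.default_rng(1)
d, bottleneck = 16, 4
W_down = rng.standard_normal((d, bottleneck)) * 0.01  # trainable
W_up = np.zeros((bottleneck, d))                      # trainable, zero-init

def adapter(h):
    z = np.maximum(h @ W_down, 0.0)   # down-project + ReLU
    return h + z @ W_up               # up-project + residual add

h = rng.standard_normal((2, d))
out = adapter(h)
print(out.shape)                      # (2, 16)
print(np.allclose(out, h))            # True: identity before any training
```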
4 code implementations • 4 Feb 2024 • Jiacheng Ruan, Jincheng Li, Suncheng Xiang
To the best of our knowledge, this is the first medical image segmentation model built on a purely SSM-based architecture.
1 code implementation • 4 Jan 2024 • Zeyu Li, Suncheng Xiang, Tong Yu, Jingsheng Gao, Jiacheng Ruan, Yanping Hu, Ting Liu, Yuzhuo Fu
While audio retrieval tasks are well-established in general audio classification, they have not been explored in the context of underwater audio recognition.
1 code implementation • 28 Dec 2023 • Jiacheng Ruan, Jingsheng Gao, Mingye Xie, Suncheng Xiang
Specifically, our block performs a Fourier transform on the three axes of the input features and assigns the external weight in the frequency domain, which is generated by our External Weights Generator.
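The described operation can be sketched as follows, assuming (for illustration only) that the external weight is an element-wise multiplier in the frequency domain; in the paper it is produced by the learned External Weights Generator:

```python
# Sketch: FFT over the three axes of a 3D feature, apply an external
# weight in the frequency domain, then transform back.
import numpy as np

rng = np.random.default_rng(2)
feat = rng.standard_normal((4, 4, 4))        # 3D input feature (D, H, W)

spec = np.fft.fftn(feat, axes=(0, 1, 2))     # Fourier transform on three axes
weight = rng.standard_normal(spec.shape)     # stand-in for the generator output
out = np.fft.ifftn(spec * weight, axes=(0, 1, 2)).real  # back to spatial domain

print(out.shape)                             # (4, 4, 4)
```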
1 code implementation • 13 Dec 2023 • Jingsheng Gao, Jiacheng Ruan, Suncheng Xiang, Zefang Yu, Ke Ji, Mingye Xie, Ting Liu, Yuzhuo Fu
We conduct experiments on 11 downstream vision datasets and demonstrate that our method significantly improves the performance of existing multi-modal prompt learning models in few-shot scenarios, exhibiting an average accuracy improvement of 2.31% over state-of-the-art methods in the 16-shot setting.
1 code implementation • 12 Dec 2023 • Jiacheng Ruan, Jingsheng Gao, Mingye Xie, Suncheng Xiang, Zefang Yu, Ting Liu, Yuzhuo Fu
2) They neglect the interaction between the intrinsic task-agnostic knowledge of pre-trained models and the task-specific knowledge in downstream tasks.
1 code implementation • 17 Jul 2023 • Jiacheng Ruan, Mingye Xie, Jingsheng Gao, Ting Liu, Yuzhuo Fu
Moreover, to the best of our knowledge, this is the first model with a parameter count limited to just 50KB.
1 code implementation • 19 Apr 2023 • Suncheng Xiang, Jingsheng Gao, Mengyuan Guan, Jiacheng Ruan, Chengfeng Zhou, Ting Liu, Dahong Qian, Yuzhuo Fu
In this paper, we propose a Multi-Modal Equivalent Transformer called MMET for more robust visual-semantic embedding learning on visual, textual, and visual-textual tasks.
Generalizable Person Re-identification
Representation Learning
1 code implementation • 3 Nov 2022 • Jiacheng Ruan, Suncheng Xiang, Mingye Xie, Ting Liu, Yuzhuo Fu
To address this challenge, we propose a light-weight model that achieves competitive performance for skin lesion segmentation with the lowest parameter count and computational cost to date.
1 code implementation • 25 Oct 2022 • Jiacheng Ruan, Mingye Xie, Suncheng Xiang, Ting Liu, Yuzhuo Fu
Specifically, our block performs a Fourier transform on the three axes of the input feature and assigns the external weight in the frequency domain, which is generated by our Weights Generator.
no code implementations • 14 Sep 2020 • Jiacheng Ruan, Jiahao Li
Ensemble methods, a common approach in machine learning, train multiple models on a dataset and combine their outputs through certain strategies to obtain better results.
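The idea can be sketched with one common combination strategy, majority voting (chosen here for illustration; the paper may use other strategies):

```python
# Toy ensemble: three models vote on each sample, and majority voting
# produces the final prediction.
from collections import Counter

def majority_vote(predictions):
    # predictions: list of per-model label lists, one entry per model
    return [Counter(labels).most_common(1)[0][0]
            for labels in zip(*predictions)]

model_a = [0, 1, 1, 0]
model_b = [0, 1, 0, 0]
model_c = [1, 1, 1, 0]
print(majority_vote([model_a, model_b, model_c]))  # [0, 1, 1, 0]
```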