Search Results for author: Xiaoyan Cai

Found 14 papers, 5 papers with code

Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models

no code implementations 24 Mar 2025 Bin Li, Dehong Gao, Yeyuan Wang, Linbo Jin, Shanqing Yu, Xiaoyan Cai, Libin Yang

Despite the significant success of Large Vision-Language Models (LVLMs), these models still suffer from hallucinations when describing images, generating answers that include non-existent objects.

MME TextVQA

Enhancing Fine-Grained Vision-Language Pretraining with Negative Augmented Samples

no code implementations 13 Dec 2024 Yeyuan Wang, Dehong Gao, Lei Yi, Linbo Jin, Jinxia Zhang, Libin Yang, Xiaoyan Cai

Existing Vision-Language Pretraining (VLP) methods have achieved remarkable improvements across a variety of vision-language tasks, confirming their effectiveness in capturing coarse-grained semantic correlations.

MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning

no code implementations 10 Dec 2024 Yufei Ma, Zihan Liang, Huangyu Dai, Ben Chen, Dehong Gao, Zhuoran Ran, Wang Zihan, Linbo Jin, Wen Jiang, Guannan Zhang, Xiaoyan Cai, Libin Yang

Here, we propose MoDULA (Mixture of Domain-Specific and Universal LoRA), a novel Parameter-Efficient Fine-Tuning (PEFT) Mixture-of-Experts (MoE) paradigm for improved fine-tuning and parameter efficiency in multi-task learning.

Multi-Task Learning
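The core idea of combining a shared universal LoRA expert with gated task-specific LoRA experts can be sketched in a few lines. This is a minimal illustration under my own assumptions (function names, the additive combination, and the scalar per-task gate are hypothetical), not the paper's implementation:

```python
# Hypothetical sketch of the MoDULA idea: a frozen base weight W plus one
# universal LoRA expert shared across tasks and a gated, task-specific
# LoRA expert per task. Pure Python; no framework dependencies.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def lora_delta(B, A, x):
    # (B @ A) @ x computed as B @ (A @ x) -- the usual low-rank trick,
    # so the full d_out x d_in update is never materialized.
    return matvec(B, matvec(A, x))

def modula_forward(W, universal, experts, gates, task, x):
    base = matvec(W, x)                  # frozen pretrained weight
    uni = lora_delta(*universal, x)      # universal expert, always active
    dom = lora_delta(*experts[task], x)  # task-specific expert
    g = gates[task]                      # assumed scalar gate for the task expert
    return [b + u + g * d for b, u, d in zip(base, uni, dom)]
```

Because only the small A/B factors and gates are trained, adding a new task costs one extra low-rank pair rather than a full copy of the model.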

MLoRA: Multi-Domain Low-Rank Adaptive Network for CTR Prediction

no code implementations 14 Aug 2024 Zhiming Yang, Haining Gao, Dehong Gao, Luwei Yang, Libin Yang, Xiaoyan Cai, Wei Ning, Guannan Zhang

In this paper, we propose a Multi-domain Low-Rank Adaptive network (MLoRA) for CTR prediction, where we introduce a specialized LoRA module for each domain.

Click-Through Rate Prediction
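The per-domain LoRA module described above can be illustrated with a small sketch. Everything here (class name, the plain W + B @ A update, the dict of per-domain factors) is my own assumption for illustration, not the paper's code:

```python
# Minimal sketch of the MLoRA idea: a shared (frozen) weight W, and for each
# domain d its own low-rank adapter so the effective weight is W + B_d @ A_d.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

class MultiDomainLoRA:
    def __init__(self, W, rank_factors):
        self.W = W                   # shared frozen weight, d_out x d_in
        self.factors = rank_factors  # domain -> (B, A): d_out x r, r x d_in

    def forward(self, domain, x):
        B, A = self.factors[domain]
        W_eff = add(self.W, matmul(B, A))  # domain-adapted weight
        return matmul(W_eff, x)            # x is a d_in x 1 column vector
```

With rank r much smaller than the weight dimensions, each new CTR domain adds only the B and A factors on top of the shared backbone.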

General2Specialized LLMs Translation for E-commerce

no code implementations 6 Mar 2024 Kaidi Chen, Ben Chen, Dehong Gao, Huangyu Dai, Wen Jiang, Wei Ning, Shanqing Yu, Libin Yang, Xiaoyan Cai

Existing Neural Machine Translation (NMT) models mainly handle translation in the general domain, while overlooking domains with special writing formulas, such as e-commerce and legal documents.

Machine Translation NMT +1

An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT

2 code implementations 17 Apr 2023 Chong Ma, Zihao Wu, Jiaqi Wang, Shaochen Xu, Yaonai Wei, Fang Zeng, Zhengliang Liu, Xi Jiang, Lei Guo, Xiaoyan Cai, Shu Zhang, Tuo Zhang, Dajiang Zhu, Dinggang Shen, Tianming Liu, Xiang Li

The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians, and it is typically written by radiologists based on the 'Findings' section.

In-Context Learning

A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation

1 code implementation EMNLP 2018 Jingjing Xu, Xuancheng Ren, Yi Zhang, Qi Zeng, Xiaoyan Cai, Xu Sun

Compared to the state-of-the-art models, our skeleton-based model can generate significantly more coherent text according to human evaluation and automatic evaluation.

Reinforcement Learning Sentence +1

Deep Stacking Networks for Low-Resource Chinese Word Segmentation with Transfer Learning

no code implementations 4 Nov 2017 Jingjing Xu, Xu Sun, Sujian Li, Xiaoyan Cai, Bingzhen Wei

In this paper, we propose a deep stacking framework to improve the performance on word segmentation tasks with insufficient data by integrating datasets from diverse domains.

Chinese Word Segmentation Transfer Learning
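The stacking scheme described above can be reduced to a toy illustration: each layer's model sees the raw input augmented with the previous layer's prediction. This is my own generic rendering of deep stacking (the helper name and feature concatenation are assumptions), not the paper's architecture:

```python
# Toy sketch of a deep stacking pipeline: each stage re-reads the raw
# features plus the previous stage's output, so later stages can correct
# earlier ones even when per-domain training data is scarce.

def stack_predict(models, x):
    feats = list(x)
    pred = None
    for model in models:
        pred = model(feats)
        feats = list(x) + [pred]  # append the prediction as an extra feature
    return pred
```

For example, with a first-stage model trained on out-of-domain data and a second stage fine-tuned on the small in-domain set, the second stage learns to adjust the first stage's output rather than solve the task from scratch.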
