Search Results for author: Xingwu Sun

Found 11 papers, 3 papers with code

Enhancing Document Ranking with Task-adaptive Training and Segmented Token Recovery Mechanism

no code implementations · EMNLP 2021 · Xingwu Sun, Yanling Cui, Hongyin Tang, Fuzheng Zhang, Beihong Jin, Shi Wang

In this paper, we propose a new ranking model DR-BERT, which improves the Document Retrieval (DR) task by a task-adaptive training process and a Segmented Token Recovery Mechanism (STRM).

Document Ranking Retrieval

Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling

no code implementations · 23 May 2024 · Shuaipeng Li, Penghao Zhao, Hailin Zhang, Xingwu Sun, Hao Wu, Dian Jiao, Weiyan Wang, Chengjun Liu, Zheng Fang, Jinbao Xue, Yangyu Tao, Bin Cui, Di Wang

First, we establish a scaling law between batch size and optimal learning rate in the sign-of-gradient case, and prove that the optimal learning rate first rises and then falls as the batch size increases.

PhD: A Prompted Visual Hallucination Evaluation Dataset

1 code implementation · 17 Mar 2024 · Jiazhen Liu, Yuhan Fu, Ruobing Xie, Runquan Xie, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Xirong Li

The rapid growth of Large Language Models (LLMs) has driven the development of Large Vision-Language Models (LVLMs).

Attribute Common Sense Reasoning +2

Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning

1 code implementation · 29 Dec 2023 · Zhongzhi Chen, Xingwu Sun, Xianfeng Jiao, Fengzong Lian, Zhanhui Kang, Di Wang, Cheng-Zhong Xu

We introduce Truth Forest, a method that enhances truthfulness in LLMs by uncovering hidden truth representations using multi-dimensional orthogonal probes.

Inflected Forms Are Redundant in Question Generation Models

no code implementations · 1 Jan 2023 · Xingwu Sun, Hongyin Tang, Chengzhong Xu

Secondly, we propose to adapt QG as a combination of the following actions in the encoder-decoder framework: generating a question word, copying a word from the source sequence, or generating a word transformation type.

Decoder Question Generation +1

TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities

3 code implementations · 13 Dec 2022 · Zhe Zhao, Yudong Li, Cheng Hou, Jing Zhao, Rong Tian, Weijie Liu, Yiren Chen, Ningyuan Sun, Haoyan Liu, Weiquan Mao, Han Guo, Weigang Guo, Taiqiang Wu, Tao Zhu, Wenhang Shi, Chen Chen, Shan Huang, Sihong Chen, Liqun Liu, Feifei Li, Xiaoshuai Chen, Xingwu Sun, Zhanhui Kang, Xiaoyong Du, Linlin Shen, Kimmo Yan

The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework.

Decoder

TITA: A Two-stage Interaction and Topic-Aware Text Matching Model

no code implementations · NAACL 2021 · Xingwu Sun, Yanling Cui, Hongyin Tang, Qiuyu Zhu, Fuzheng Zhang, Beihong Jin

To tackle this problem, we define three levels of relevance in the keyword-document matching task: topic-aware relevance, partial relevance, and irrelevance.

Text Matching Vocal Bursts Valence Prediction

Improving Document Representations by Generating Pseudo Query Embeddings for Dense Retrieval

no code implementations · ACL 2021 · Hongyin Tang, Xingwu Sun, Beihong Jin, Jingang Wang, Fuzheng Zhang, Wei Wu

Recently, retrieval models based on dense representations have gradually been applied in the first stage of document retrieval tasks, showing better performance than traditional sparse vector space models.

Clustering Retrieval

Answer-focused and Position-aware Neural Question Generation

no code implementations · EMNLP 2018 · Xingwu Sun, Jing Liu, Yajuan Lyu, Wei He, Yanjun Ma, Shi Wang

(2) The model copies the context words that are far from and irrelevant to the answer, instead of the words that are close and relevant to the answer.

Machine Reading Comprehension Position +3
