1 code implementation • 20 Jan 2024 • Guangyuan Ma, Xing Wu, Zijia Lin, Songlin Hu
In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints.
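As a toy illustration of the masked auto-encoder pre-training idea this abstract describes, the sketch below corrupts a token sequence and keeps reconstruction labels for a decoder to recover. The token ids, mask ratio, and `MASK_ID` value are illustrative assumptions, not the paper's actual settings.

```python
import random

MASK_ID = 103  # BERT's [MASK] token id (illustrative)

def mask_tokens(token_ids, mask_ratio=0.3, seed=0):
    """Randomly replace a fraction of tokens with [MASK]; return the
    corrupted sequence and reconstruction labels (-100 = ignore)."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tid in token_ids:
        if rng.random() < mask_ratio:
            corrupted.append(MASK_ID)
            labels.append(tid)    # the decoder must recover this token
        else:
            corrupted.append(tid)
            labels.append(-100)   # ignored by the reconstruction loss
    return corrupted, labels

tokens = [2023, 2003, 1037, 7099, 6251]
corrupted, labels = mask_tokens(tokens)
```

The MAE decoder is then trained to reconstruct the original tokens at the masked positions from the encoder's dense representation, which is what pushes term coverage into that representation.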
1 code implementation • 6 Sep 2023 • Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu
ChatGPT has attracted significant interest due to its impressive performance, but concern is growing about its potential risks, particularly the detection of AI-generated content (AIGC), which is often difficult for untrained humans to identify.
no code implementations • 16 Aug 2023 • Guangyuan Ma, Xing Wu, Peng Wang, Zijia Lin, Songlin Hu
Concretely, we leverage the capabilities of LLMs for document expansion, i.e., query generation, and effectively transfer the expanded knowledge to retrievers using pre-training strategies tailored for passage retrieval.
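The document-expansion step could be sketched as follows. `generate_queries` is a hypothetical stand-in for the LLM query generation the abstract mentions, reduced here to a trivial keyword stub; the real system would prompt an LLM to write queries the passage answers.

```python
def generate_queries(passage, n=2):
    """Hypothetical stand-in for LLM-based query generation:
    here, a trivial keyword stub instead of a model call."""
    words = [w for w in passage.lower().split() if len(w) > 3]
    return [f"what is {w}?" for w in words[:n]]

def expand_document(passage, n_queries=2):
    """Pair a passage with generated pseudo-queries to form
    (query, positive passage) pre-training examples for a retriever."""
    queries = generate_queries(passage, n_queries)
    return [{"query": q, "positive_passage": passage} for q in queries]

pairs = expand_document("Dense retrieval encodes passages into vectors")
```

Each resulting pair can serve as a positive training example during retrieval-oriented pre-training.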
1 code implementation • 7 Jun 2023 • Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu
Dialogue response selection aims to select an appropriate response from several candidates based on a given user and system utterance history.
Ranked #1 on Conversational Response Selection on the E-commerce benchmark
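A minimal sketch of the response-selection setup, assuming the utterance history and the candidate responses have already been encoded into vectors by some encoder. Dot-product scoring is a common choice for this matching step, not necessarily the paper's exact method.

```python
def select_response(context_vec, candidate_vecs):
    """Score each candidate response against the dialogue context by
    dot product; return the index of the best candidate plus all scores."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    scores = [dot(context_vec, c) for c in candidate_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

best, scores = select_response([1.0, 0.0],
                               [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]])
```

In practice the encoder is trained so that the correct response scores higher than the distractor candidates.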
1 code implementation • 25 Apr 2023 • Guangyuan Ma, Hongtao Liu, Xing Wu, Wanhui Qian, Zhepeng Lv, Qing Yang, Songlin Hu
First, we introduce a user behavior masking pre-training task to recover masked user behaviors based on their contextual behaviors.
no code implementations • 20 Apr 2023 • Guangyuan Ma, Xing Wu, Peng Wang, Songlin Hu
Siamese or fully separated dual-encoders are often adopted as the basic retrieval architecture in the pre-training and fine-tuning stages to encode queries and passages into their latent embedding spaces.
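The Siamese-vs-separated distinction can be illustrated with a toy hashing encoder, where the `salt` argument stands in for encoder-specific parameters: a Siamese dual-encoder shares one set of parameters across queries and passages, while a fully separated one keeps two independent sets. Everything here is illustrative, not the paper's architecture.

```python
import hashlib

def bow_embed(text, dim=8, salt=""):
    """Toy bag-of-words hashing encoder: each token increments one of
    `dim` buckets; `salt` stands in for encoder-specific parameters."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        h = int(hashlib.md5((salt + tok).encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    return vec

# Siamese dual-encoder: queries and passages share one parameter set.
siamese_q = lambda t: bow_embed(t, salt="shared")
siamese_p = lambda t: bow_embed(t, salt="shared")

# Fully separated dual-encoder: independent parameters per side.
separated_q = lambda t: bow_embed(t, salt="query")
separated_p = lambda t: bow_embed(t, salt="passage")
```

With shared parameters, the same text always maps to the same vector on both sides; with separated encoders, the two sides can specialize independently.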
no code implementations • 5 Apr 2023 • Xing Wu, Guangyuan Ma, Peng Wang, Meng Lin, Zijia Lin, Fuzheng Zhang, Songlin Hu
As an effective representation bottleneck pre-training technique, the contextual masked auto-encoder utilizes contextual embedding to assist in the reconstruction of passages.
2 code implementations • 19 Dec 2022 • Xing Wu, Guangyuan Ma, Wanhui Qian, Zijia Lin, Songlin Hu
Recently, methods have been developed to improve the performance of dense passage retrieval by using context-supervised pre-training.
2 code implementations • 16 Aug 2022 • Xing Wu, Guangyuan Ma, Meng Lin, Zijia Lin, Zhongyuan Wang, Songlin Hu
Dense passage retrieval aims to retrieve the relevant passages of a query from a large corpus based on dense representations (i.e., vectors) of the query and the passages.
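A minimal sketch of dense retrieval under this definition, assuming the query and the corpus passages have already been embedded into vectors; dot-product similarity is one common scoring choice.

```python
def retrieve(query_vec, passage_vecs, k=2):
    """Rank passages by dot-product similarity with the query vector
    and return the indices of the top-k passages."""
    scores = [sum(q * p for q, p in zip(query_vec, pv))
              for pv in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]

top = retrieve([1.0, 0.0, 0.0],
               [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.9, 0.1, 0.0]])
```

At real corpus scale this exhaustive scan is replaced by an approximate nearest-neighbor index, but the scoring principle is the same.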