Search Results for author: Guangyuan Ma

Found 10 papers, 6 papers with code

Drop your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval

1 code implementation • 20 Jan 2024 • Guangyuan Ma, Xing Wu, Zijia Lin, Songlin Hu

In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints.

Decoder • Passage Retrieval • +2
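As a rough illustration of the bag-of-word prediction objective described above (a hypothetical sketch, not the authors' implementation), a single dense representation can be trained to predict a multi-hot vector over the input's token ids, which directly encourages term coverage:

```python
import torch
import torch.nn as nn

class BagOfWordsHead(nn.Module):
    """Hypothetical BoW prediction head: from one dense [CLS]-style
    embedding, predict which token ids appear in the input passage."""
    def __init__(self, hidden_dim, vocab_size):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, cls_embedding, input_ids):
        # Multi-hot target: 1 at every vocabulary id present in the input.
        target = torch.zeros(input_ids.size(0), self.proj.out_features)
        target.scatter_(1, input_ids, 1.0)
        logits = self.proj(cls_embedding)
        # Multi-label loss pushes the dense vector to cover the input terms.
        return nn.functional.binary_cross_entropy_with_logits(logits, target)

# Toy dimensions, chosen only for the example.
head = BagOfWordsHead(hidden_dim=8, vocab_size=50)
loss = head(torch.randn(2, 8), torch.randint(0, 50, (2, 6)))
```

Unlike a full masked-language-model decoder, a head like this is order-free, which is why the paper can frame it as "dropping the decoder".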

HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus

1 code implementation • 6 Sep 2023 • Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu

ChatGPT has gained significant interest due to its impressive performance, but people are increasingly concerned about its potential risks, particularly around the detection of AI-generated content (AIGC), which is often difficult for untrained humans to identify.

Question Answering

Pre-training with Large Language Model-based Document Expansion for Dense Passage Retrieval

no code implementations • 16 Aug 2023 • Guangyuan Ma, Xing Wu, Peng Wang, Zijia Lin, Songlin Hu

Concretely, we leverage the capabilities of LLMs for document expansion, i.e., query generation, and effectively transfer the expanded knowledge to retrievers using pre-training strategies tailored for passage retrieval.

Contrastive Learning • Language Modelling • +3
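The document-expansion step can be illustrated with a toy sketch (hypothetical code; appending generated queries to the passage text is one simple expansion scheme, and the example queries here are invented, not from the paper):

```python
def expand_document(passage, generated_queries):
    """Hypothetical document expansion: append LLM-generated queries
    to the passage so retrieval pre-training sees the expanded text."""
    return passage + " " + " ".join(generated_queries)

# Illustrative passage and queries an LLM might generate for it.
expanded = expand_document(
    "The Eiffel Tower is in Paris.",
    ["where is the eiffel tower", "eiffel tower location"],
)
```

The point of expansion is that the retriever's pre-training corpus then contains query-like phrasings of each passage, narrowing the vocabulary gap between queries and documents.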

Dial-MAE: ConTextual Masked Auto-Encoder for Retrieval-based Dialogue Systems

1 code implementation • 7 Jun 2023 • Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu

Dialogue response selection aims to select an appropriate response from several candidates based on a given user and system utterance history.

Conversational Response Selection • Decoder • +3

PUNR: Pre-training with User Behavior Modeling for News Recommendation

1 code implementation • 25 Apr 2023 • Guangyuan Ma, Hongtao Liu, Xing Wu, Wanhui Qian, Zhepeng Lv, Qing Yang, Songlin Hu

Firstly, we introduce the user behavior masking pre-training task to recover the masked user behaviors based on their contextual behaviors.

News Recommendation • Unsupervised Pre-training
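The behavior-masking task can be sketched roughly as follows (hypothetical code; `MASK_ID` and the masking ratio are illustrative assumptions, not the paper's settings). A fraction of a user's behavior ids are replaced with a mask token, and the originals become the recovery targets:

```python
import random

MASK_ID = 0  # hypothetical reserved id for the [MASK] behavior token

def mask_user_behaviors(behavior_ids, mask_ratio=0.3, seed=None):
    """Randomly mask part of a user's behavior sequence.

    Returns the corrupted sequence plus (position, original id)
    recovery targets for the masked pre-training objective.
    """
    rng = random.Random(seed)
    corrupted = list(behavior_ids)
    targets = []
    for pos, bid in enumerate(behavior_ids):
        if rng.random() < mask_ratio:
            corrupted[pos] = MASK_ID
            targets.append((pos, bid))
    return corrupted, targets

# Toy behavior (e.g., clicked-news) id sequence for one user.
corrupted, targets = mask_user_behaviors(
    [101, 205, 309, 412, 518], mask_ratio=0.5, seed=7)
```

A model trained to recover the masked ids from the surrounding (contextual) behaviors learns user representations without labeled supervision, matching the "Unsupervised Pre-training" tag above.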

CoT-MoTE: Exploring ConTextual Masked Auto-Encoder Pre-training with Mixture-of-Textual-Experts for Passage Retrieval

no code implementations • 20 Apr 2023 • Guangyuan Ma, Xing Wu, Peng Wang, Songlin Hu

Siamese or fully separated dual-encoders are often adopted as the basic retrieval architecture in the pre-training and fine-tuning stages for encoding queries and passages into their latent embedding spaces.

Passage Retrieval • Retrieval

CoT-MAE v2: Contextual Masked Auto-Encoder with Multi-view Modeling for Passage Retrieval

no code implementations • 5 Apr 2023 • Xing Wu, Guangyuan Ma, Peng Wang, Meng Lin, Zijia Lin, Fuzheng Zhang, Songlin Hu

As an effective representation-bottleneck pre-training technique, the contextual masked auto-encoder utilizes contextual embeddings to assist in the reconstruction of passages.

Passage Retrieval • Retrieval • +1

Query-as-context Pre-training for Dense Passage Retrieval

2 code implementations • 19 Dec 2022 • Xing Wu, Guangyuan Ma, Wanhui Qian, Zijia Lin, Songlin Hu

Recently, methods have been developed to improve the performance of dense passage retrieval by using context-supervised pre-training.

Contrastive Learning • Passage Retrieval • +1

ConTextual Masked Auto-Encoder for Dense Passage Retrieval

2 code implementations • 16 Aug 2022 • Xing Wu, Guangyuan Ma, Meng Lin, Zijia Lin, Zhongyuan Wang, Songlin Hu

Dense passage retrieval aims to retrieve the passages relevant to a query from a large corpus based on dense representations (i.e., vectors) of the query and the passages.

Decoder • Passage Retrieval • +2
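The vector-based scoring that dense passage retrieval relies on can be sketched minimally (illustrative code, not tied to any particular model; dot product is one common similarity choice):

```python
import numpy as np

def retrieve_top_k(query_vec, passage_vecs, k=2):
    """Rank passages by dot-product similarity with the query vector,
    the basic scoring step in dense passage retrieval."""
    scores = passage_vecs @ query_vec
    return np.argsort(-scores)[:k]  # indices of the k highest-scoring passages

# Toy 2-d embeddings purely for illustration.
query = np.array([1.0, 0.0])
passages = np.array([[0.9, 0.1],   # similar to the query
                     [0.0, 1.0],   # orthogonal to the query
                     [0.5, 0.5]])  # partially similar
top = retrieve_top_k(query, passages, k=2)  # → indices [0, 2]
```

In practice the passage vectors are precomputed by an encoder such as the contextual masked auto-encoder above and searched with an approximate nearest-neighbor index rather than a dense matrix product.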
