no code implementations • 30 Jun 2023 • Zhen Qin, Rolf Jagerman, Kai Hui, Honglei Zhuang, Junru Wu, Jiaming Shen, Tianqi Liu, Jialu Liu, Donald Metzler, Xuanhui Wang, Michael Bendersky
On TREC-DL2019, PRP trails only the GPT-4 solution on the NDCG@5 and NDCG@10 metrics, while outperforming other existing solutions, such as InstructGPT with its 175B parameters, by over 10% on nearly all ranking metrics.
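The pairwise ranking prompting (PRP) idea can be illustrated with a minimal sketch: prompt an LLM to compare two passages at a time and aggregate the wins into a ranking. The prompt wording and the `llm` callable below are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of pairwise ranking prompting (PRP). The prompt text and
# the `llm(prompt) -> str` callable are illustrative assumptions.
from itertools import combinations

PROMPT = (
    "Given a query '{query}', which of the following two passages is more "
    "relevant to the query?\n\nPassage A: {a}\n\nPassage B: {b}\n\n"
    "Output Passage A or Passage B:"
)

def prp_rank(llm, query, passages):
    """Rank passages by counting pairwise wins judged by the LLM."""
    wins = {i: 0 for i in range(len(passages))}
    for i, j in combinations(range(len(passages)), 2):
        for a, b in ((i, j), (j, i)):  # both orders, to reduce position bias
            out = llm(PROMPT.format(query=query, a=passages[a], b=passages[b]))
            wins[a if "Passage A" in out else b] += 1
    return sorted(range(len(passages)), key=wins.get, reverse=True)
```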
no code implementations • 7 Jun 2023 • Zhen Qin, Rolf Jagerman, Rama Pasumarthi, Honglei Zhuang, He Zhang, Aijun Bai, Kai Hui, Le Yan, Xuanhui Wang
The distillation of ranking models has become an important topic in both academia and industry.
no code implementations • 19 May 2023 • Ronak Pradeep, Kai Hui, Jai Gupta, Adam D. Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, Vinh Q. Tran
Popularized by the Differentiable Search Index, the emerging paradigm of generative retrieval re-frames the classic information retrieval problem into a sequence-to-sequence modeling task, forgoing external indices and encoding an entire document corpus within a single Transformer.
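At inference time, a generative retriever decodes a document identifier directly. A minimal sketch of one common ingredient, constrained decoding over a prefix trie of valid docids, might look like this; `model.next_token_scores` is a hypothetical stand-in for a seq2seq LM's per-token scores.

```python
# Sketch of generative retrieval inference: the model decodes a document
# identifier token-by-token, constrained to prefixes of valid docids.

def build_trie(docids):
    """Prefix trie over tokenized docids, e.g. [["3", "1", "7"], ...]."""
    trie = {}
    for tokens in docids:
        node = trie
        for t in tokens:
            node = node.setdefault(t, {})
    return trie

def constrained_decode(model, query, trie):
    """Greedy decode, only following branches that exist in the trie."""
    node, output = trie, []
    while node:  # stop at a leaf, i.e. a complete docid
        scores = model.next_token_scores(query, output)  # assumed API
        allowed = {t: scores[t] for t in node}           # mask invalid tokens
        best = max(allowed, key=allowed.get)
        output.append(best)
        node = node[best]
    return "".join(output)
```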
no code implementations • 21 Dec 2022 • Ruicheng Xian, Honglei Zhuang, Zhen Qin, Hamed Zamani, Jing Lu, Ji Ma, Kai Hui, Han Zhao, Xuanhui Wang, Michael Bendersky
Domain adaptation aims to transfer the knowledge acquired by models trained on (data-rich) source domains to (low-resource) target domains, for which a popular method is invariant representation learning.
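As a concrete instance of invariant representation learning, a minimal domain-adversarial sketch (DANN-style gradient reversal, not necessarily the paper's method) trains an encoder whose features a domain classifier cannot separate:

```python
# Minimal sketch of invariant representation learning via a gradient
# reversal layer (DANN-style); illustrative, not the paper's exact method.
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -grad  # flip gradients so the encoder fools the domain critic

encoder = nn.Sequential(nn.Linear(768, 256), nn.ReLU())  # e.g. BERT features
task_head = nn.Linear(256, 2)     # e.g. relevant vs. not relevant
domain_head = nn.Linear(256, 2)   # source vs. target domain

def losses(x, y_task, y_domain):
    z = encoder(x)
    task_loss = nn.functional.cross_entropy(task_head(z), y_task)
    dom_loss = nn.functional.cross_entropy(
        domain_head(GradReverse.apply(z)), y_domain)
    return task_loss + dom_loss  # encoder is pushed toward invariant features
```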
1 code implementation • 15 Dec 2022 • Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster
We take human annotations as a gold standard and show that a correlated automatic metric is suitable for development.
no code implementations • 12 Oct 2022 • Honglei Zhuang, Zhen Qin, Rolf Jagerman, Kai Hui, Ji Ma, Jing Lu, Jianmo Ni, Xuanhui Wang, Michael Bendersky
Recently, substantial progress has been made in text ranking based on pretrained language models such as BERT.
no code implementations • 11 Oct 2022 • Kai Hui, Tao Chen, Zhen Qin, Honglei Zhuang, Fernando Diaz, Mike Bendersky, Don Metzler
Retrieval augmentation has shown promising improvements in different tasks.
no code implementations • Findings (ACL) 2022 • Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Prakash Gupta, Cicero Nogueira dos Santos, Yi Tay, Don Metzler
This yields significant inference-time speedups, since the decoder-only architecture only needs to interpret static encoder embeddings at inference time.
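The speedup pattern can be sketched as follows: document-side encoder outputs are computed once offline, so each query only pays for the decoder pass. The `encoder` and `decoder` callables here are hypothetical stand-ins.

```python
# Sketch of the inference pattern: document encoder outputs are computed
# once offline, so per-query cost is only the decoder pass.

doc_cache = {}

def index_documents(encoder, docs):
    for doc_id, text in docs.items():
        doc_cache[doc_id] = encoder(text)  # static embedding, computed once

def score(decoder, query, doc_id):
    # The decoder attends to cached encoder states; no re-encoding needed.
    return decoder(query, memory=doc_cache[doc_id])
```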
1 code implementation • 14 Feb 2022 • Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, Donald Metzler
In this paper, we demonstrate that information retrieval can be accomplished with a single Transformer, in which all information about the corpus is encoded in the parameters of the model.
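A rough sketch of the DSI-style training setup, assuming string docids and a standard seq2seq fine-tuning loop: the same model learns both an indexing task (document text to docid) and a retrieval task (query to docid).

```python
# Sketch of DSI-style training data; illustrative, not the paper's exact
# recipe. Docids are treated as target strings for a seq2seq model.

corpus = {"doc42": "neural ranking models for ad-hoc retrieval ..."}
queries = [("what are neural ranking models", "doc42")]

train_examples = (
    # Indexing task: memorize document -> identifier.
    [(f"index: {text}", doc_id) for doc_id, text in corpus.items()]
    # Retrieval task: map query -> identifier.
    + [(f"retrieve: {q}", doc_id) for q, doc_id in queries]
)
# Each (input, target) pair is then fed to standard seq2seq fine-tuning,
# e.g. starting from a T5 checkpoint.
```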
3 code implementations • ICLR 2022 • Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler
Despite the recent success of multi-task learning and transfer learning for natural language processing (NLP), few works have systematically studied the effect of scaling up the number of tasks during pre-training.
no code implementations • 18 Apr 2021 • Kai Hui, Klaus Berberich
In this work, we collect judgments from multiple judges using a crowdsourcing platform and aggregate them to compare the two kinds of preference judgments in terms of transitivity, time consumption, and quality.
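One of the measured properties, transitivity, can be checked mechanically. A minimal sketch, assuming majority-aggregated preferences stored as a dict over ordered document pairs:

```python
# Sketch: counting transitivity violations in aggregated pairwise
# preference judgments. `prefs[(a, b)] = True` means a was preferred over
# b by the majority of judges; prefs is assumed complete over all pairs.
from itertools import permutations

def transitivity_violations(prefs, docs):
    """(a, b, c) violates transitivity if a > b and b > c but not a > c."""
    violations = 0
    for a, b, c in permutations(docs, 3):
        if prefs.get((a, b)) and prefs.get((b, c)) and not prefs.get((a, c)):
            violations += 1
    return violations
```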
no code implementations • 17 Apr 2021 • Xiaoyang Chen, Kai Hui, Ben He, Xianpei Han, Le Sun, Zheng Ye
BERT-based text ranking models have dramatically advanced the state-of-the-art in ad-hoc retrieval, wherein most models tend to consider individual query-document pairs independently.
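The independent, pointwise setup that most such models use can be sketched with Hugging Face transformers; the checkpoint name below is only a placeholder, and in practice a fine-tuned ranking model would be loaded.

```python
# Sketch of the standard pointwise setup: each query-document pair is
# scored independently by a cross-encoder ("bert-base-uncased" is a
# placeholder, not a fine-tuned ranker).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1)

def score(query, doc):
    inputs = tok(query, doc, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return model(**inputs).logits.item()  # scalar relevance score

query = "what is pseudo-relevance feedback"
docs = ["PRF expands a query using its top-ranked documents.",
        "The weather today is sunny."]
ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
```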
3 code implementations • 16 Sep 2020 • Xuanang Chen, Ben He, Kai Hui, Le Sun, Yingfei Sun
Despite the effectiveness of utilizing the BERT model for document ranking, the high computational cost of such approaches limits their use.
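A minimal sketch of score-based ranking distillation, blending gold labels with a large teacher's soft scores (one common formulation; the paper's exact objective may differ):

```python
# Sketch of score-based knowledge distillation for ranking: a small
# student mimics a large teacher's relevance scores.
import torch
from torch import nn

def distill_loss(student_scores, teacher_scores, labels, alpha=0.5):
    # labels: float relevance labels in [0, 1]; all tensors share a shape.
    hard = nn.functional.binary_cross_entropy_with_logits(
        student_scores, labels)
    soft = nn.functional.mse_loss(student_scores, teacher_scores)
    return alpha * hard + (1 - alpha) * soft  # blend hard and soft targets
```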
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zhi Zheng, Kai Hui, Ben He, Xianpei Han, Le Sun, Andrew Yates
Query expansion aims to mitigate the mismatch between the language used in a query and in a document.
1 code implementation • EMNLP 2018 • Canjia Li, Yingfei Sun, Ben He, Le Wang, Kai Hui, Andrew Yates, Le Sun, Jungang Xu
Pseudo-relevance feedback (PRF) is commonly used to boost the performance of traditional information retrieval (IR) models by using top-ranked documents to identify and weight new query terms, thereby reducing the effect of query-document vocabulary mismatches.
Ranked #9 on Ad-Hoc Information Retrieval on TREC Robust04
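For reference, the classic hand-crafted PRF step that NPRF replaces with a neural model can be sketched as simple term counting over the top-ranked documents:

```python
# Sketch of classic pseudo-relevance feedback: weight candidate expansion
# terms by their frequency in the top-ranked documents. Illustrative only;
# NPRF learns this weighting with a neural model instead.
from collections import Counter

def expansion_terms(top_docs, query_terms, k=10):
    """top_docs: first-pass top-ranked texts; query_terms: set of terms."""
    counts = Counter()
    for doc in top_docs:
        counts.update(t for t in doc.lower().split()
                      if t not in query_terms)
    return [t for t, _ in counts.most_common(k)]  # new expansion terms
```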
no code implementations • ACL 2018 • Cancan Jin, Ben He, Kai Hui, Le Sun
Existing automated essay scoring (AES) models rely on rated essays for the target prompt as training data.
1 code implementation • 1 Jul 2017 • Sean MacAvaney, Andrew Yates, Kai Hui, Ophir Frieder
One challenge with neural ranking is the need for a large number of manually labeled relevance judgments for training.
3 code implementations • 30 Jun 2017 • Kai Hui, Andrew Yates, Klaus Berberich, Gerard de Melo
Neural IR models, such as DRMM and PACRR, have achieved strong results by successfully capturing relevance matching signals.
no code implementations • 27 Jun 2017 • Andrew Yates, Kai Hui
Recent neural IR models have demonstrated deep learning's utility in ad-hoc information retrieval.
3 code implementations • EMNLP 2017 • Kai Hui, Andrew Yates, Klaus Berberich, Gerard de Melo
In order to adopt deep learning for information retrieval, models are needed that can capture all relevant information required to assess the relevance of a document to a given user query.
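The core input that PACRR-style models consume can be sketched as a query-by-document similarity matrix over term embeddings, over which convolutions then extract positional matching signals:

```python
# Sketch of the similarity-matrix input used by PACRR-style models:
# a (query terms x document terms) matrix of cosine similarities.
import numpy as np

def sim_matrix(query_vecs, doc_vecs):
    """query_vecs: (q_len, dim), doc_vecs: (d_len, dim) term embeddings."""
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return q @ d.T  # entry (i, j) = cosine similarity of terms i and j
```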