no code implementations • 17 Jul 2024 • Haoyang Wen, Honglei Zhuang, Hamed Zamani, Alexander Hauptmann, Michael Bendersky
Moreover, the two-tower architecture limits how a retriever can model relevance scores when selecting top candidates for answer-generator reasoning.
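For context, a minimal sketch of what "two-tower" scoring means here: the query and document are encoded independently, and relevance is constrained to a similarity between the two resulting vectors. The encoders below are illustrative stand-ins, not the paper's models.

```python
import numpy as np

def encode_query(query: str, dim: int = 8) -> np.ndarray:
    # Placeholder query tower: a pseudo-embedding derived from the text's hash.
    rng = np.random.default_rng(abs(hash(query)) % (2**32))
    return rng.standard_normal(dim)

def encode_document(doc: str, dim: int = 8) -> np.ndarray:
    # Placeholder document tower, independent of the query by construction.
    rng = np.random.default_rng(abs(hash(doc)) % (2**32))
    return rng.standard_normal(dim)

def two_tower_score(query: str, doc: str) -> float:
    # Relevance is restricted to a dot product of two independently computed
    # vectors; the towers never see each other's input, which is the modeling
    # limitation the snippet refers to.
    return float(encode_query(query) @ encode_document(doc))
```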
1 code implementation • 18 Apr 2024 • Fang Guo, Wenyu Li, Honglei Zhuang, Yun Luo, Yafu Li, Qi Zhu, Le Yan, Yue Zhang
Recent pointwise Large Language Model (LLM) rankers have achieved remarkable ranking results.
no code implementations • 17 Apr 2024 • Le Yan, Zhen Qin, Honglei Zhuang, Rolf Jagerman, Xuanhui Wang, Michael Bendersky, Harrie Oosterhuis
Our method takes as input both LLM-generated relevance labels and pairwise preferences.
no code implementations • 15 Nov 2023 • Minghan Li, Honglei Zhuang, Kai Hui, Zhen Qin, Jimmy Lin, Rolf Jagerman, Xuanhui Wang, Michael Bendersky
In this paper, we re-examine this conclusion and raise the following question: Can query expansion improve generalization of strong cross-encoder rankers?
no code implementations • 22 Oct 2023 • Andrew Drozdov, Honglei Zhuang, Zhuyun Dai, Zhen Qin, Razieh Rahimi, Xuanhui Wang, Dana Alon, Mohit Iyyer, Andrew McCallum, Donald Metzler, Kai Hui
Recent studies show that large language models (LLMs) can be instructed to effectively perform zero-shot passage re-ranking, in which the results of a first-stage retrieval method, such as BM25, are rated and reordered to improve relevance.
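A minimal sketch of this pointwise re-ranking pipeline, assuming a hypothetical `call_llm` helper in place of a real LLM client: each first-stage candidate is rated independently, and the list is reordered by the rating.

```python
# Sketch of zero-shot LLM re-ranking over first-stage candidates (e.g. BM25).
# `call_llm` is a hypothetical stand-in for any text-completion endpoint.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def rate_passage(query: str, passage: str) -> float:
    prompt = (
        f"Passage: {passage}\n"
        f"Query: {query}\n"
        "On a scale from 0 to 10, how relevant is the passage to the query?\n"
        "Answer with a single number:"
    )
    try:
        return float(call_llm(prompt).strip())
    except ValueError:
        return 0.0  # unparsable answers fall to the bottom of the ranking

def rerank(query: str, candidates: list[str]) -> list[str]:
    # Rate each first-stage candidate independently, then reorder.
    return sorted(candidates, key=lambda p: rate_passage(query, p), reverse=True)
```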
no code implementations • 21 Oct 2023 • Honglei Zhuang, Zhen Qin, Kai Hui, Junru Wu, Le Yan, Xuanhui Wang, Michael Bendersky
We propose to incorporate fine-grained relevance labels into the prompt for LLM rankers, enabling them to better differentiate among documents with different levels of relevance to the query and thus derive a more accurate ranking.
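A sketch of what such a fine-grained prompt might look like; the label set, wording, and score mapping below are illustrative assumptions rather than the paper's exact prompt.

```python
# Sketch of prompting with fine-grained relevance labels instead of a binary
# yes/no judgment. The label vocabulary here is an illustrative assumption.
LABELS = {"Not Relevant": 0, "Somewhat Relevant": 1, "Highly Relevant": 2}

def fine_grained_prompt(query: str, doc: str) -> str:
    options = ", ".join(LABELS)
    return (
        f"Document: {doc}\nQuery: {query}\n"
        "Judge the relevance of the document to the query.\n"
        f"Answer with exactly one of: {options}."
    )

def graded_score(llm_answer: str) -> int:
    # Map the generated label back to an ordinal score used for sorting;
    # unrecognized answers default to the lowest grade.
    return LABELS.get(llm_answer.strip(), 0)
```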
1 code implementation • 14 Oct 2023 • Shengyao Zhuang, Honglei Zhuang, Bevan Koopman, Guido Zuccon
We propose a novel zero-shot document ranking approach based on Large Language Models (LLMs): the Setwise prompting approach.
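A rough sketch of the setwise idea: show the LLM several documents at once and ask it to pick the most relevant; repeatedly selecting the best of the remaining set yields a ranking. `call_llm` and the prompt wording are assumptions, not the paper's implementation.

```python
# Sketch of a setwise comparison used as a selection oracle. Repeated
# selection over the remaining pool gives a selection-sort-style ranking.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def pick_best(query: str, docs: list[str]) -> int:
    listing = "\n".join(f"[{i}] {d}" for i, d in enumerate(docs))
    prompt = (
        f"Query: {query}\n{listing}\n"
        "Which document is most relevant to the query? "
        "Answer with its number only:"
    )
    answer = call_llm(prompt).strip().strip("[]")
    return int(answer) if answer.isdigit() and int(answer) < len(docs) else 0

def setwise_rank(query: str, docs: list[str]) -> list[str]:
    remaining, ranked = list(docs), []
    while remaining:
        ranked.append(remaining.pop(pick_best(query, remaining)))
    return ranked
```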
no code implementations • 30 Jun 2023 • Zhen Qin, Rolf Jagerman, Kai Hui, Honglei Zhuang, Junru Wu, Le Yan, Jiaming Shen, Tianqi Liu, Jialu Liu, Donald Metzler, Xuanhui Wang, Michael Bendersky
Ranking documents using Large Language Models (LLMs) by directly feeding the query and candidate documents into the prompt is an interesting and practical problem.
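One way to use such prompts, sketched below with a hypothetical `call_llm` stub, is as a pairwise comparator driving a sort. A more faithful implementation would typically query both document orderings to mitigate position bias; this sketch omits that.

```python
import functools

# Sketch of pairwise ranking prompting: the LLM compares two candidates at a
# time, and the stated preference drives a comparison sort.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def prefers_first(query: str, doc_a: str, doc_b: str) -> bool:
    prompt = (
        f"Query: {query}\n"
        f"Passage A: {doc_a}\nPassage B: {doc_b}\n"
        "Which passage is more relevant to the query? Answer A or B:"
    )
    return call_llm(prompt).strip().upper().startswith("A")

def pairwise_rank(query: str, docs: list[str]) -> list[str]:
    def cmp(a: str, b: str) -> int:
        return -1 if prefers_first(query, a, b) else 1
    return sorted(docs, key=functools.cmp_to_key(cmp))
```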
no code implementations • 19 May 2023 • Ronak Pradeep, Kai Hui, Jai Gupta, Adam D. Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, Vinh Q. Tran
Popularized by the Differentiable Search Index, the emerging paradigm of generative retrieval re-frames the classic information retrieval problem into a sequence-to-sequence modeling task, forgoing external indices and encoding an entire document corpus within a single Transformer.
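A schematic of that reframing, with `seq2seq_generate` as a placeholder for a trained encoder-decoder; the prompt prefix and the training recipe noted in the comments are assumptions, not the paper's exact setup.

```python
# Sketch of generative retrieval: instead of scoring documents against an
# external index, a single seq2seq model maps a query string directly to a
# document identifier string.
def seq2seq_generate(prompt: str, num_return: int = 1) -> list[str]:
    raise NotImplementedError("a trained seq2seq model goes here")

def generative_retrieve(query: str, k: int = 10) -> list[str]:
    # Beam search over docid strings replaces index lookup; the "index" is
    # the model's parameters, learned from (query -> docid) and
    # (document text -> docid) training examples.
    return seq2seq_generate(f"retrieve: {query}", num_return=k)
```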
no code implementations • 5 May 2023 • Rolf Jagerman, Honglei Zhuang, Zhen Qin, Xuanhui Wang, Michael Bendersky
Query expansion is a widely used technique to improve the recall of search systems.
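A minimal sketch in the paper's LLM-prompting setting, with `call_llm` as a hypothetical stub: text generated from the query is concatenated onto the original query before retrieval. The prompt wording and concatenation form are illustrative choices.

```python
# Sketch of LLM-based query expansion for recall: augment the query with
# generated text so the retriever matches more relevant documents.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def expand_query(query: str) -> str:
    answer = call_llm(f"Write a short passage that answers the query: {query}")
    # Concatenating the original query with the generated text (optionally
    # repeating the query to keep its terms dominant) is one common form.
    return f"{query} {answer}"
```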
no code implementations • 28 Dec 2022 • Yunan Zhang, Le Yan, Zhen Qin, Honglei Zhuang, Jiaming Shen, Xuanhui Wang, Michael Bendersky, Marc Najork
We present both theoretical analysis and empirical results showing the negative effects such a correlation has on the relevance tower.
no code implementations • 12 Oct 2022 • Honglei Zhuang, Zhen Qin, Rolf Jagerman, Kai Hui, Ji Ma, Jing Lu, Jianmo Ni, Xuanhui Wang, Michael Bendersky
Recently, substantial progress has been made in text ranking based on pretrained language models such as BERT.
no code implementations • 11 Oct 2022 • Kai Hui, Tao Chen, Zhen Qin, Honglei Zhuang, Fernando Diaz, Mike Bendersky, Don Metzler
Retrieval augmentation has shown promising improvements across a variety of tasks.
no code implementations • Findings (ACL) 2022 • Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Prakash Gupta, Cicero Nogueira dos Santos, Yi Tay, Don Metzler
This results in significant inference-time speedups, since the decoder-only architecture needs only to learn to interpret static encoder embeddings during inference.
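The inference pattern behind that speedup can be sketched as precompute-then-decode; the function names below are placeholders, not the paper's API.

```python
# Sketch: encode each document once offline, cache the static embeddings,
# and at query time run only the lightweight decoder over the cached states.
cache: dict[str, list[float]] = {}

def encode_document(doc: str) -> list[float]:
    raise NotImplementedError("offline document encoder goes here")

def decode_relevance(query: str, doc_states: list[float]) -> float:
    raise NotImplementedError("fast decoder over static encoder states")

def score(query: str, doc: str) -> float:
    if doc not in cache:                 # offline / first-time encoding
        cache[doc] = encode_document(doc)
    return decode_relevance(query, cache[doc])
```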
no code implementations • 17 Dec 2021 • Nan Wang, Zhen Qin, Le Yan, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork
Multiclass classification (MCC) is a fundamental machine learning problem of classifying each instance into one of a predefined set of classes.
3 code implementations • ICLR 2022 • Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler
Despite the recent success of multi-task learning and transfer learning for natural language processing (NLP), few works have systematically studied the effect of scaling up the number of tasks during pre-training.
no code implementations • 30 Sep 2021 • Zhen Qin, Le Yan, Yi Tay, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork
We explore a novel perspective of knowledge distillation (KD) for learning to rank (LTR), and introduce Self-Distilled neural Rankers (SDR), where student rankers are parameterized identically to their teachers.
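A minimal sketch of one plausible distillation signal in this setting, assuming a listwise KL divergence between softmax-normalized teacher and student scores; the actual SDR training recipe may differ.

```python
import numpy as np

# Self-distillation for ranking: the student shares the teacher's
# parameterization and is trained to match the teacher's score distribution
# over each candidate list (here via KL(teacher || student)).
def softmax(x: np.ndarray) -> np.ndarray:
    z = x - x.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distill_loss(teacher_scores: np.ndarray, student_scores: np.ndarray) -> float:
    p, q = softmax(teacher_scores), softmax(student_scores)
    return float((p * (np.log(p) - np.log(q))).sum())
```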
no code implementations • 29 Sep 2021 • Nan Wang, Zhen Qin, Le Yan, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork
We further demonstrate that the most popular MCC architecture in deep learning can be equivalently formulated as an LTR pipeline, with a specific set of choices for the ranking model architecture and loss function.
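A small worked check of one instance of that equivalence: softmax cross-entropy over class logits coincides with a listwise softmax ranking loss in which the classes play the role of items to rank and the true class is the only relevant one.

```python
import numpy as np

def softmax_cross_entropy(logits: np.ndarray, label: int) -> float:
    z = logits - logits.max()                 # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def listwise_softmax_loss(scores: np.ndarray, relevance: np.ndarray) -> float:
    z = scores - scores.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -(relevance * log_probs).sum()

logits = np.array([2.0, -1.0, 0.5])
label = 0
one_hot = np.eye(3)[label]                    # true class as the only relevant "item"
assert np.isclose(softmax_cross_entropy(logits, label),
                  listwise_softmax_loss(logits, one_hot))
```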
1 code implementation • 21 Jan 2021 • Walker Ravina, Ethan Sterling, Olexiy Oryeshko, Nathan Bell, Honglei Zhuang, Xuanhui Wang, Yonghui Wu, Alexander Grushetsky
The goal of model distillation is to faithfully transfer teacher model knowledge to a model which is faster, more generalizable, more interpretable, or possesses other desirable characteristics.
no code implementations • ICLR 2021 • Zhen Qin, Le Yan, Honglei Zhuang, Yi Tay, Rama Kumar Pasumarthi, Xuanhui Wang, Michael Bendersky, Marc Najork
We first validate this concern by showing that most recent neural LTR models are, by a large margin, inferior to the best publicly available Gradient Boosted Decision Trees (GBDT) in terms of their reported ranking accuracy on benchmark datasets.
no code implementations • 3 Dec 2020 • Wen Wang, Honglei Zhuang, Mi Zhou, Hanyu Liu, Beibei Li
Based on these insights, we then propose a hierarchical course BERT model to predict teachers' performance in online education.
no code implementations • 1 Oct 2020 • Michael Bendersky, Honglei Zhuang, Ji Ma, Shuguang Han, Keith Hall, Ryan Mcdonald
In this paper, we report the results of our participation in the TREC-COVID challenge.
no code implementations • 13 May 2020 • Xiaojin Zhang, Honglei Zhuang, Shengyu Zhang, Yuan Zhou
We study a variant of the thresholding bandit problem (TBP) in the context of outlier detection, where the objective is to identify the outliers whose rewards are above a threshold.
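A toy sketch of such a thresholding-bandit loop; the sampling rule and confidence radius below are illustrative choices, not the paper's algorithm.

```python
import math
import random

# Thresholding-bandit sketch: pull arms, track each arm's empirical mean, and
# keep sampling the arm whose confidence interval is most ambiguous about the
# threshold; finally report arms whose means exceed it.
def threshold_bandit(pull, n_arms: int, threshold: float, budget: int):
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for arm in range(n_arms):                 # initialize: one pull per arm
        sums[arm] += pull(arm); counts[arm] += 1
    for _ in range(n_arms, budget):
        means = [s / c for s, c in zip(sums, counts)]
        radius = [math.sqrt(2 * math.log(budget) / c) for c in counts]
        # Most ambiguous arm: small distance to the threshold, wide interval.
        arm = min(range(n_arms),
                  key=lambda i: abs(means[i] - threshold) - radius[i])
        sums[arm] += pull(arm); counts[arm] += 1
    means = [s / c for s, c in zip(sums, counts)]
    return [i for i in range(n_arms) if means[i] > threshold]

# Usage: Bernoulli arms with hidden means; arms above 0.5 are the "outliers".
hidden = [0.2, 0.3, 0.8, 0.4, 0.9]
result = threshold_bandit(lambda i: float(random.random() < hidden[i]),
                          n_arms=5, threshold=0.5, budget=2000)
```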
no code implementations • 6 May 2020 • Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Alexander Grushetsky, Yonghui Wu, Petr Mitrichev, Ethan Sterling, Nathan Bell, Walker Ravina, Hai Qian
Interpretability of learning-to-rank models is a crucial yet relatively under-examined research area.
no code implementations • 21 Nov 2019 • Yu Meng, Maryam Karimzadehgan, Honglei Zhuang, Donald Metzler
In personal email search, user queries often impose different requirements on different aspects of the retrieved emails.
1 code implementation • NeurIPS 2019 • Yu Meng, Jiaxin Huang, Guangyuan Wang, Chao Zhang, Honglei Zhuang, Lance Kaplan, Jiawei Han
While text embeddings are typically learned in Euclidean space, directional similarity is often more effective in tasks such as word similarity and document clustering, creating a gap between the training and usage stages of text embeddings.
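A small numeric illustration of that gap, and why normalizing onto the unit sphere closes it.

```python
import numpy as np

# Euclidean distance and directional (cosine) similarity can disagree badly:
# u and v point the same way but have very different norms. For unit vectors
# the two views agree, since ||u - v||^2 = 2 - 2*cos(u, v).
u = np.array([1.0, 1.0])
v = np.array([10.0, 10.0])

cosine = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
euclidean = np.linalg.norm(u - v)
print(cosine, euclidean)        # 1.0 vs ~12.7: same direction, yet "far apart"

u_hat, v_hat = u / np.linalg.norm(u), v / np.linalg.norm(v)
print(np.linalg.norm(u_hat - v_hat))   # 0.0 once both live on the unit sphere
```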
no code implementations • NeurIPS 2017 • Honglei Zhuang, Chi Wang, Yifan Wang
Outlier detection is a powerful method for narrowing attention to a few objects once data about them have been collected.
no code implementations • EMNLP 2017 • Honglei Zhuang, Chi Wang, Fangbo Tao, Lance Kaplan, Jiawei Han
A document outlier is a document that substantially deviates in semantics from the majority of documents in a corpus.
no code implementations • 5 Jun 2017 • Yu Shi, Po-Wei Chan, Honglei Zhuang, Huan Gui, Jiawei Han
We also identify cross-meta-path synergy from real-world data and propose to model it; this characteristic is important for defining path-based HIN relevance but has not been modeled by existing methods.