1 code implementation • 19 Sep 2023 • Paul Thomas, Seth Spielman, Nick Craswell, Bhaskar Mitra
It takes careful feedback from real users, which is by definition the highest-quality first-party gold data available, and develops a large language model prompt that agrees with that data.
no code implementations • 25 Apr 2023 • Amifa Raj, Bhaskar Mitra, Nick Craswell, Michael D. Ekstrand
There are many ways a query, the search results, and a demographic attribute such as gender may relate, leading us to hypothesize different causes for these reformulation patterns, such as under-representation on the original result page or effects predicted by the linguistic theory of markedness.
no code implementations • 30 Jan 2023 • Zhenduo Wang, Yuancheng Tu, Corby Rosset, Nick Craswell, Ming Wu, Qingyao Ai
In this work, we explore generating clarifying questions in a zero-shot setting to overcome the cold-start problem, and we propose a constrained clarifying question generation system that uses both question templates and query facets to guide effective and precise question generation.
no code implementations • 26 Jun 2022 • Sebastian Hofstätter, Nick Craswell, Bhaskar Mitra, Hamed Zamani, Allan Hanbury
Recently, several dense retrieval (DR) models have demonstrated performance competitive with the term-based retrieval methods that are ubiquitous in search systems.
2 code implementations • 21 Apr 2022 • Xinyi Yan, Chengxi Luo, Charles L. A. Clarke, Nick Craswell, Ellen M. Voorhees, Pablo Castells
Based on these simulations, one algorithm stands out for its potential.
no code implementations • 21 Jan 2022 • Gabriella Kazai, Bhaskar Mitra, Anlei Dong, Nick Craswell, Linjun Yang
This raises questions about when such summaries are sufficient for relevance estimation by the ranking model or the human assessor, and whether humans and machines benefit from the document's full text in similar ways.
no code implementations • 13 Jan 2022 • Jianfeng Gao, Chenyan Xiong, Paul Bennett, Nick Craswell
A conversational information retrieval (CIR) system is an information retrieval (IR) system with a conversational interface that allows users to seek information via multi-turn natural language conversations, in spoken or written form.
1 code implementation • 20 May 2021 • Sebastian Hofstätter, Bhaskar Mitra, Hamed Zamani, Nick Craswell, Allan Hanbury
An emerging recipe for achieving state-of-the-art effectiveness in neural document re-ranking involves utilizing large pre-trained language models (e.g., BERT) to evaluate all individual passages in the document and then aggregating the outputs by pooling or additional Transformer layers.
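The score-and-aggregate recipe can be sketched as follows; the passage scorer here is a toy stand-in for a BERT-based model, and max pooling is one of the aggregation choices mentioned above (an illustrative sketch, not the paper's exact architecture):

```python
from typing import Callable, List

def rank_document(passages: List[str], score_passage: Callable[[str], float]) -> float:
    """Score every passage independently, then pool the scores.

    `score_passage` stands in for a pre-trained model such as BERT;
    max pooling is one of several possible aggregation strategies.
    """
    return max(score_passage(p) for p in passages)

# Toy scorer: passage length as a stand-in relevance signal.
doc = ["short passage", "a much longer and more detailed passage"]
score = rank_document(doc, score_passage=lambda p: float(len(p)))
```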
no code implementations • 9 May 2021 • Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Jimmy Lin
Evaluation efforts such as TREC, CLEF, NTCIR and FIRE, alongside public leaderboards such as MS MARCO, are intended to encourage research and track our progress, addressing big questions in our field.
no code implementations • 19 Apr 2021 • Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Ellen M. Voorhees, Ian Soboroff
The TREC Deep Learning (DL) Track studies ad hoc search in the large data regime, meaning that a large set of human-labeled training data is available.
no code implementations • 19 Apr 2021 • Bhaskar Mitra, Sebastian Hofstätter, Hamed Zamani, Nick Craswell
The Transformer-Kernel (TK) model has demonstrated strong reranking performance on the TREC Deep Learning benchmark, and can be considered an efficient (but slightly less effective) alternative to other Transformer-based architectures that employ (i) large-scale pretraining (high training cost), (ii) joint encoding of query and document (high inference cost), and (iii) a larger number of Transformer layers (both high training and high inference costs).
no code implementations • 25 Feb 2021 • Jimmy Lin, Daniel Campos, Nick Craswell, Bhaskar Mitra, Emine Yilmaz
Leaderboards are a ubiquitous part of modern research in applied machine learning.
1 code implementation • 15 Feb 2021 • Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos
This is the second year of the TREC Deep Learning Track, with the goal of studying ad hoc ranking in the large training data regime.
no code implementations • 14 Nov 2020 • Bhaskar Mitra, Sebastian Hofstätter, Hamed Zamani, Nick Craswell
We benchmark Conformer-Kernel models under the strict blind evaluation setting of the TREC 2020 Deep Learning track.
1 code implementation • 20 Jul 2020 • Bhaskar Mitra, Sebastian Hofstätter, Hamed Zamani, Nick Craswell
In this work, we extend the TK architecture to the full retrieval setting by incorporating the query term independence assumption.
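Under the query term independence assumption, a document's score decomposes into a sum of per-query-term contributions, which is what makes precomputing scores in an inverted index feasible for full retrieval. A minimal sketch, where `term_scores` is a hypothetical precomputed index (names are illustrative, not the paper's code):

```python
from typing import Dict

def qti_score(query: str, term_scores: Dict[str, Dict[str, float]], doc_id: str) -> float:
    """Score a document as the sum of independent per-term contributions.

    `term_scores` maps term -> {doc_id: score}; missing terms or
    documents contribute zero, mirroring an inverted-index lookup.
    """
    return sum(term_scores.get(t, {}).get(doc_id, 0.0) for t in query.split())

# Hypothetical precomputed index.
index = {"deep": {"d1": 0.8, "d2": 0.1}, "learning": {"d1": 0.5}}
s = qti_score("deep learning", index, "d1")  # 0.8 + 0.5
```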
no code implementations • 17 Jul 2020 • Bodo Billerbeck, Justin Zobel, Nicholas Lester, Nick Craswell
Search techniques make use of elementary information such as term frequencies and document lengths in computation of similarity weighting.
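BM25 is a canonical example of a similarity weighting built from exactly this elementary information; a minimal sketch of one term's contribution (standard textbook form, not necessarily the variant used in the paper):

```python
import math

def bm25_term_weight(tf: float, df: int, n_docs: int, doc_len: float,
                     avg_doc_len: float, k1: float = 1.2, b: float = 0.75) -> float:
    """Classic BM25 contribution of a single term to a document's score.

    Term frequency saturates via k1 and is normalized by document
    length via b; term rarity enters through the inverse document
    frequency computed from df and the collection size n_docs.
    """
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + norm)

w = bm25_term_weight(tf=3, df=10, n_docs=1000, doc_len=120, avg_doc_len=100)
```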
1 code implementation • 17 Jun 2020 • Hamed Zamani, Gord Lueck, Everest Chen, Rodolfo Quispe, Flint Luu, Nick Craswell
In this paper, we introduce MIMICS, a collection of search clarification datasets for real web search queries sampled from the Bing query logs.
no code implementations • 9 Jun 2020 • Nick Craswell, Daniel Campos, Bhaskar Mitra, Emine Yilmaz, Bodo Billerbeck
Users of Web search engines reveal their information needs through queries and clicks, making click logs a useful asset for information retrieval.
no code implementations • 30 May 2020 • Hamed Zamani, Bhaskar Mitra, Everest Chen, Gord Lueck, Fernando Diaz, Paul N. Bennett, Nick Craswell, Susan T. Dumais
We also propose a model for learning representations of clarifying questions based on user interaction data as implicit feedback.
1 code implementation • 11 May 2020 • Sebastian Hofstätter, Hamed Zamani, Bhaskar Mitra, Nick Craswell, Allan Hanbury
In this work, we propose a local self-attention mechanism that considers a moving window over the document terms, where each term attends only to other terms in the same window.
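The windowed attention pattern can be illustrated as a boolean mask in which position i may attend only to positions j within a centered window (an illustrative sketch, not the paper's implementation):

```python
import numpy as np

def local_attention_mask(n_terms: int, window: int) -> np.ndarray:
    """Boolean mask for local self-attention over a document.

    Entry (i, j) is True iff term i may attend to term j, i.e. j lies
    within a centered window of size `window` around i.
    """
    idx = np.arange(n_terms)
    return np.abs(idx[:, None] - idx[None, :]) <= window // 2

mask = local_attention_mask(6, window=3)
```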
no code implementations • 28 Apr 2020 • Emine Yilmaz, Nick Craswell, Bhaskar Mitra, Daniel Campos
As deep learning based models are increasingly being used for information retrieval (IR), a major challenge is to ensure the availability of test collections for measuring their quality.
2 code implementations • 17 Mar 2020 • Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Ellen M. Voorhees
The Deep Learning Track is a new track for TREC 2019, with the goal of studying ad hoc ranking in a large data regime.
1 code implementation • 18 Dec 2019 • Hamed Zamani, Nick Craswell
Such research will require data and tools to allow the implementation and study of conversational systems.
1 code implementation • 10 Dec 2019 • Bhaskar Mitra, Nick Craswell
This report discusses three submissions based on the Duet architecture to the Deep Learning track at TREC 2019.
no code implementations • 24 Jul 2019 • Hongfei Zhang, Xia Song, Chenyan Xiong, Corby Rosset, Paul N. Bennett, Nick Craswell, Saurabh Tiwary
This paper presents the GEneric iNtent Encoder (GEN Encoder), which learns a distributed representation space for user intent in search.
no code implementations • 8 Jul 2019 • Bhaskar Mitra, Corby Rosset, David Hawking, Nick Craswell, Fernando Diaz, Emine Yilmaz
Deep neural IR models, in contrast, compare the whole query to the document and are, therefore, typically employed only for late stage re-ranking.
no code implementations • 15 Apr 2019 • Corby Rosset, Bhaskar Mitra, Chenyan Xiong, Nick Craswell, Xia Song, Saurabh Tiwary
The training of these models involves a search for appropriate parameter values based on large quantities of labeled examples.
1 code implementation • 18 Mar 2019 • Bhaskar Mitra, Nick Craswell
We propose several small modifications to Duet (a deep neural ranking model) and evaluate the updated model on the MS MARCO passage ranking task.
Ranked #4 on Passage Re-Ranking on MS MARCO
no code implementations • 3 May 2017 • Bhaskar Mitra, Nick Craswell
Neural ranking models for information retrieval (IR) use shallow or deep neural networks to rank search results in response to a query.
12 code implementations • 28 Nov 2016 • Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, Mir Rosenberg, Xia Song, Alina Stoica, Saurabh Tiwary, Tong Wang
The size of the dataset and the fact that the questions are derived from real user search queries distinguishes MS MARCO from other well-known publicly available datasets for machine reading comprehension and question-answering.
1 code implementation • Proceedings of the 26th International Conference on World Wide Web, WWW '17 2017 • Bhaskar Mitra, Fernando Diaz, Nick Craswell
Models such as latent semantic analysis and those based on neural embeddings learn distributed representations of text, and match the query against the document in the latent semantic space.
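Matching in a latent semantic space can be sketched as cosine similarity between aggregated term embeddings; the mean-of-embeddings representation here is one simple choice, not either cited model's exact formulation:

```python
import numpy as np

def latent_match(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Match query and document in a latent space.

    Each side is represented as the mean of its term embeddings
    (rows of the input arrays), and the two representations are
    compared by cosine similarity.
    """
    q = query_vecs.mean(axis=0)
    d = doc_vecs.mean(axis=0)
    return float(q @ d / (np.linalg.norm(q) * np.linalg.norm(d)))

rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 8))  # toy term embeddings
sim = latent_match(emb[:2], emb[2:])
```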
no code implementations • ACL 2016 • Fernando Diaz, Bhaskar Mitra, Nick Craswell
Continuous space word embeddings have received a great deal of attention in the natural language processing and machine learning communities for their ability to model term similarity and other relationships.
no code implementations • 2 Feb 2016 • Bhaskar Mitra, Eric Nalisnick, Nick Craswell, Rich Caruana
A fundamental goal of search engines is to identify, given a query, documents that have relevant text.