Ad-hoc information retrieval refers to the task of returning information resources related to a user query formulated in natural language.
An effective way to match a query against a document is to extract meaningful matching patterns from words, phrases, and sentences and use them to produce the matching score.
This paper provides a unified account of two schools of thinking in information retrieval modelling: the generative retrieval focusing on predicting relevant documents given a query, and the discriminative retrieval focusing on predicting relevancy given a query-document pair.
Sculley et al. remind us that "the goal of science is not wins, but knowledge".
Ranked #3 on Ad-Hoc Information Retrieval on TREC Robust04 (MAP metric)
Given a query and a set of documents, K-NRM uses a translation matrix that models word-level similarities via word embeddings, a new kernel-pooling technique that uses kernels to extract multi-level soft match features, and a learning-to-rank layer that combines those features into the final ranking score.
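The three K-NRM stages described above (translation matrix, kernel pooling, learning-to-rank layer) can be illustrated with a minimal pure-Python sketch. It assumes cosine similarity between word embeddings, Gaussian (RBF) kernels with a single shared width, and a tanh-activated linear layer. The function names, the kernel means `mus`, the width `sigma`, and the toy weights are illustrative assumptions, not values taken from the paper, and no training is shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def translation_matrix(query_vecs, doc_vecs):
    """Word-level similarities: one row per query word, one column per document word."""
    return [[cosine(q, d) for d in doc_vecs] for q in query_vecs]


def kernel_pooling(matrix, mus, sigma):
    """Soft-match features: one value per kernel.

    Each kernel softly counts, for every query word, how many document words
    have a similarity close to the kernel mean `mu`; the log of these soft
    counts is then summed over query words.
    """
    features = []
    for mu in mus:
        total = 0.0
        for row in matrix:
            soft_tf = sum(math.exp(-((m - mu) ** 2) / (2 * sigma ** 2)) for m in row)
            total += math.log(max(soft_tf, 1e-10))  # clip to avoid log(0)
        features.append(total)
    return features


def knrm_score(query_vecs, doc_vecs, mus, sigma, weights, bias):
    """Combine the kernel features into one ranking score with a linear layer and tanh."""
    matrix = translation_matrix(query_vecs, doc_vecs)
    features = kernel_pooling(matrix, mus, sigma)
    return math.tanh(sum(w * f for w, f in zip(weights, features)) + bias)


# Toy usage with hypothetical 2-d embeddings: one query word, two one-word documents.
query = [[1.0, 0.0]]
doc_exact = [[1.0, 0.0]]     # same direction as the query word
doc_unrelated = [[0.0, 1.0]] # orthogonal to the query word
mus, sigma = [1.0, 0.0], 0.1
weights, bias = [1.0, 0.0], 0.0  # only the exact-match kernel (mu = 1.0) is weighted

score_exact = knrm_score(query, doc_exact, mus, sigma, weights, bias)
score_unrelated = knrm_score(query, doc_unrelated, mus, sigma, weights, bias)
```

With these toy weights the exact-match document scores higher than the unrelated one. In a trained model the embeddings and the layer weights would be learned end to end from relevance data, which this sketch does not attempt.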
Neural networks provide new possibilities to automatically learn complex language patterns and query-document relations.
Ranked #5 on Ad-Hoc Information Retrieval on TREC Robust04
We call this joint approach CEDR (Contextualized Embeddings for Document Ranking).
Ranked #3 on Ad-Hoc Information Retrieval on TREC Robust04
Following recent successes in applying BERT to question answering, we explore simple applications to ad hoc document retrieval.
Ranked #2 on Ad-Hoc Information Retrieval on TREC Robust04 (MAP metric)
We explore several new models for document relevance ranking, building upon the Deep Relevance Matching Model (DRMM) of Guo et al. (2016).
Ranked #7 on Ad-Hoc Information Retrieval on TREC Robust04
We investigate this observation further by varying target words to probe the model's use of latent knowledge.
Ranked #1 on Ad-Hoc Information Retrieval on TREC Robust04
In this work, we propose a standalone neural ranking model (SNRM) by introducing a sparsity property to learn a latent sparse representation for each query and document.
Ranked #12 on Ad-Hoc Information Retrieval on TREC Robust04
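To make the idea of a latent sparse representation in the SNRM entry above more concrete, here is a small sketch of why sparsity is useful at retrieval time. It assumes that sparse vectors are stored in an inverted index keyed by latent dimension and that relevance is a dot product. Both are assumptions made for illustration and are not stated in the snippet above. The vectors are hand-written stand-ins for what a trained model would output, and all function names are hypothetical.

```python
from collections import defaultdict


def l1_penalty(vec):
    """L1 norm of a vector; adding it to a training loss is one common way to encourage sparsity."""
    return sum(abs(v) for v in vec)


def to_sparse(vec):
    """Keep only the non-zero dimensions as {dimension: weight}."""
    return {i: v for i, v in enumerate(vec) if v != 0.0}


def build_inverted_index(doc_vectors):
    """Map each active latent dimension to the documents that use it."""
    index = defaultdict(list)
    for doc_id, vec in doc_vectors.items():
        for dim, weight in to_sparse(vec).items():
            index[dim].append((doc_id, weight))
    return index


def retrieve(query_vec, index):
    """Score only documents sharing at least one active dimension with the query."""
    scores = defaultdict(float)
    for dim, q_weight in to_sparse(query_vec).items():
        for doc_id, d_weight in index.get(dim, []):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical 4-dimensional latent vectors (real models use far more dimensions).
docs = {
    "d1": [0.0, 0.5, 0.0, 0.2],
    "d2": [0.3, 0.0, 0.0, 0.0],
}
index = build_inverted_index(docs)
results = retrieve([0.0, 1.0, 0.0, 0.0], index)
```

Because the query activates only one dimension, only documents listed under that dimension are touched. That is what would let such a model rank a whole collection directly instead of re-ranking a candidate list.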