Semantic Retrieval
12 papers with code • 1 benchmark • 2 datasets
Most implemented papers
Revealing the Importance of Semantic Retrieval for Machine Reading at Scale
In this work, we give general guidelines on system design for MRS by proposing a simple yet effective pipeline system, with special consideration of hierarchical semantic retrieval at both the paragraph and sentence level and its potential effects on the downstream task.
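The abstract only sketches the pipeline, but the core idea of retrieving paragraphs first and then sentences within them can be shown in a minimal sketch. The `embed` function below is a hypothetical stand-in (random vectors so the code runs); in practice it would be a neural paragraph/sentence encoder, and the sentence splitting and k values are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def embed(texts):
    # Hypothetical encoder stand-in: returns random vectors so the sketch
    # runs end to end; swap in a real sentence/paragraph encoder for
    # meaningful retrieval.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 64))

def top_k(query_vec, cand_vecs, k):
    # Rank candidates by cosine similarity to the query; return top-k indices.
    q = query_vec / np.linalg.norm(query_vec)
    c = cand_vecs / np.linalg.norm(cand_vecs, axis=1, keepdims=True)
    return np.argsort(-(c @ q))[:k]

def hierarchical_retrieve(query, paragraphs, k_para=5, k_sent=3):
    # Stage 1: retrieve paragraphs; Stage 2: retrieve sentences only from
    # within the retrieved paragraphs.
    q_vec = embed([query])[0]
    para_idx = top_k(q_vec, embed(paragraphs), k_para)
    sentences = [s for i in para_idx for s in paragraphs[i].split(". ") if s]
    sent_idx = top_k(q_vec, embed(sentences), min(k_sent, len(sentences)))
    return [sentences[i] for i in sent_idx]
```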
Semantic query-by-example speech search using visual grounding
A number of recent studies have started to investigate how speech systems can be trained on untranscribed speech by leveraging accompanying images at training time.
Contract Discovery: Dataset and a Few-Shot Semantic Retrieval Challenge with Competitive Baselines
We propose a new shared task of semantic retrieval from legal texts, in which so-called contract discovery is performed: legal clauses are extracted from documents given only a few examples of similar clauses from other legal acts.
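As a rough illustration of the few-shot setup (not the paper's baselines), candidate spans can be ranked by their best similarity to the provided example clauses. The `embed` placeholder below is a hypothetical stand-in for a real encoder, and the span splitting is left to the caller.

```python
import numpy as np

def embed(texts):
    # Placeholder encoder (random vectors so the example runs); a real
    # system would use a sentence encoder suited to legal text.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 128))

def discover_clauses(example_clauses, candidate_spans, top_n=5):
    # Few-shot retrieval: score each candidate span by its maximum cosine
    # similarity to any of the example clauses, then return the best spans.
    ex = embed(example_clauses)
    ca = embed(candidate_spans)
    ex /= np.linalg.norm(ex, axis=1, keepdims=True)
    ca /= np.linalg.norm(ca, axis=1, keepdims=True)
    scores = (ca @ ex.T).max(axis=1)
    order = np.argsort(-scores)[:top_n]
    return [(candidate_spans[i], float(scores[i])) for i in order]
```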
Deep Unsupervised Image Hashing by Maximizing Bit Entropy
This layer is shown to minimize a penalized term of the Wasserstein distance between the learned continuous image features and the optimal half-half bit distribution.
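The snippet below is a toy illustration of that penalty, not the authors' exact layer: for equal-sized empirical distributions the 1-D Wasserstein distance reduces to the mean absolute difference of sorted samples, so each hash dimension is pushed toward a half "-1" / half "+1" bit distribution. The batch size and code length are arbitrary.

```python
import torch

def half_half_wasserstein_penalty(features):
    # Per hash dimension, 1-D Wasserstein distance between the batch of
    # continuous codes and an ideal half -1 / half +1 target; for equal
    # sample counts this equals mean |sorted(x) - sorted(target)|.
    n, _ = features.shape
    target = torch.cat([-torch.ones(n // 2), torch.ones(n - n // 2)])  # already sorted
    sorted_feats, _ = torch.sort(features, dim=0)
    return (sorted_feats - target.unsqueeze(1)).abs().mean()

# Usage: add the penalty to an unsupervised hashing objective.
codes = torch.randn(32, 16, requires_grad=True)   # continuous codes from an encoder
penalty = half_half_wasserstein_penalty(codes)
penalty.backward()
```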
Semantic Models for the First-stage Retrieval: A Comprehensive Review
We believe it is the right time to survey current status, learn from existing methods, and gain some insights for future development.
Evaluation of Audio-Visual Alignments in Visually Grounded Speech Models
We compare the alignment performance using our proposed evaluation metrics to the semantic retrieval task commonly used to evaluate VGS models.
Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation
How can we learn highly compact yet effective sentence representations?
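The details of the homomorphic projective distillation objective are in the paper; the sketch below only shows the generic compression-by-distillation idea it builds on, with the dimensions, cosine loss, and random stand-in encoders all being illustrative assumptions rather than the authors' recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

TEACHER_DIM, ENC_DIM, COMPACT_DIM = 1024, 384, 128   # illustrative sizes

projection = nn.Linear(TEACHER_DIM, COMPACT_DIM, bias=False)  # compress teacher embedding
student_head = nn.Linear(ENC_DIM, COMPACT_DIM)                # head on a small student encoder

optimizer = torch.optim.AdamW(
    list(projection.parameters()) + list(student_head.parameters()), lr=1e-4
)

for _ in range(10):                              # toy loop with random tensors standing in
    teacher_emb = torch.randn(8, TEACHER_DIM)    # would come from a large frozen encoder
    student_feat = torch.randn(8, ENC_DIM)       # would come from the small student encoder
    target = projection(teacher_emb)
    pred = student_head(student_feat)
    loss = 1 - F.cosine_similarity(pred, target, dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```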
Variational Transformer: A Framework Beyond the Trade-off between Accuracy and Diversity for Image Captioning
In this work, we show that accuracy standards derived from human annotations (leave-one-out) are not appropriate for machine-generated captions.
PiC: A Phrase-in-Context Dataset for Phrase Understanding and Semantic Search
While contextualized word embeddings have become a de-facto standard, learning contextualized phrase embeddings is less explored and is hindered by the lack of a human-annotated benchmark that tests machine understanding of phrase semantics given a context sentence or paragraph (instead of phrases alone).
Training Effective Neural Sentence Encoders from Automatically Mined Paraphrases
Our sentence encoder can be trained in less than a day on a single graphics card, achieving high performance on a diverse set of sentence-level tasks.
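A common way to train such an encoder on mined paraphrase pairs is a multiple-negatives (in-batch) contrastive loss; the sketch below shows that general recipe with placeholder embeddings, and the temperature and dimensions are assumptions rather than the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(anchor_emb, paraphrase_emb, temperature=0.05):
    # Each mined paraphrase is the positive for its anchor; every other
    # sentence in the batch acts as a negative.
    a = F.normalize(anchor_emb, dim=-1)
    p = F.normalize(paraphrase_emb, dim=-1)
    logits = a @ p.T / temperature            # pairwise cosine similarities
    labels = torch.arange(logits.size(0))     # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

# Usage with placeholder embeddings; a real run would encode the mined pairs.
anchors = torch.randn(16, 768, requires_grad=True)
paraphrases = torch.randn(16, 768)
loss = in_batch_contrastive_loss(anchors, paraphrases)
loss.backward()
```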