Semantic Retrieval
18 papers with code • 1 benchmark • 2 datasets
Most implemented papers
Variational Transformer: A Framework Beyond the Trade-off between Accuracy and Diversity for Image Captioning
In this work, we show that the inferior standard of accuracy drawn from human annotations (leave-one-out) is not appropriate for machine-generated captions.
PiC: A Phrase-in-Context Dataset for Phrase Understanding and Semantic Search
While contextualized word embeddings have become a de facto standard, learning contextualized phrase embeddings is less explored and is hindered by the lack of a human-annotated benchmark that tests machine understanding of phrase semantics given a context sentence or paragraph (instead of phrases alone).
Training Effective Neural Sentence Encoders from Automatically Mined Paraphrases
Our sentence encoder can be trained in less than a day on a single graphics card, achieving high performance on a diverse set of sentence-level tasks.
Sentence Representation Learning with Generative Objective rather than Contrastive Objective
Though they offer amazing contextualized token-level representations, current pre-trained language models pay less attention to accurately acquiring sentence-level representations during their self-supervised pre-training.
Intra-class Adaptive Augmentation with Neighbor Correction for Deep Metric Learning
They have overlooked the wide characteristic changes of different classes and cannot model abundant intra-class variations for generations.
Surface-Based Retrieval Reduces Perplexity of Retrieval-Augmented Language Models
Inspired by this, we replace the semantic retrieval in Retro with a surface-level method based on BM25, obtaining a significant reduction in perplexity.
If the Sources Could Talk: Evaluating Large Language Models for Research Assistance in History
We demonstrate that LLMs' semantic retrieval and reasoning abilities on problem-specific tasks can be applied to large textual archives that have not been part of their training data.
M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models
In this paper, we propose M4LE, a Multi-ability, Multi-range, Multi-task, Multi-domain benchmark for Long-context Evaluation.