Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring

The use of deep pre-trained bidirectional transformers has led to remarkable progress in a number of applications (Devlin et al., 2018). For tasks that make pairwise comparisons between sequences, matching a given input with a corresponding label, two approaches are common: Cross-encoders performing full self-attention over the pair and Bi-encoders encoding the pair separately. The former often performs better, but is too slow for practical use. In this work, we develop a new transformer architecture, the Poly-encoder, that learns global rather than token level self-attention features. We perform a detailed comparison of all three approaches, including what pre-training and fine-tuning strategies work best. We show our models achieve state-of-the-art results on three existing tasks; that Poly-encoders are faster than Cross-encoders and more accurate than Bi-encoders; and that the best results are obtained by pre-training on large datasets similar to the downstream tasks.
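The scoring scheme the abstract describes lends itself to a short sketch. Below is a minimal PyTorch illustration of Poly-encoder scoring for a single context and candidate; it is an interpretation under assumptions, not the authors' released code. The names (`PolyEncoderScorer`, `dot_attention`) and the hyperparameters `d_model` and `m` are hypothetical, and the transformer encoders that would produce the context token vectors and the candidate vector are omitted.

```python
import torch
import torch.nn.functional as F


def dot_attention(q, k, v):
    # Dot-product attention: softmax(q . k^T) weighted sum of v.
    # q: (..., d); k, v: (n, d)
    weights = F.softmax(q @ k.T, dim=-1)
    return weights @ v


class PolyEncoderScorer(torch.nn.Module):
    """Sketch of Poly-encoder scoring (hypothetical names and shapes)."""

    def __init__(self, d_model=768, m=64):
        super().__init__()
        # m learned "codes" that attend over the context token outputs,
        # yielding m global features rather than T token-level ones.
        self.codes = torch.nn.Parameter(torch.randn(m, d_model) * 0.02)

    def forward(self, ctx_tokens, cand_vec):
        # ctx_tokens: (T, d) transformer outputs for the input context
        # cand_vec:   (d,)  single embedding for a candidate label
        global_feats = dot_attention(self.codes, ctx_tokens, ctx_tokens)  # (m, d)
        # Candidate-aware pooling: the candidate attends over the m features.
        ctx_emb = dot_attention(cand_vec, global_feats, global_feats)     # (d,)
        # Final score is a dot product, so candidate vectors can be cached.
        return ctx_emb @ cand_vec
```

Because the final score reduces to a dot product against a single candidate vector, candidate embeddings can be precomputed and cached as in a Bi-encoder, while the m learned codes recover much of the context-candidate interaction that makes Cross-encoders accurate.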

Task                               Dataset       Model            Metric             Value   Global Rank
Conversational Response Selection  DSTC7 Ubuntu  Bi-encoder       1-of-100 Accuracy  66.3%   #3
Conversational Response Selection  DSTC7 Ubuntu  Bi-encoder (v2)  1-of-100 Accuracy  70.9%   #2
