Efficient Dynamic Hard Negative Sampling for Dialogue Selection
Recent studies have demonstrated significant improvements in selection tasks, and a considerable portion of this success is attributed to incorporating informative negative samples during training. While traditional methods for constructing hard negatives provide meaningful supervision, they rely on static samples that do not evolve during training, leading to sub-optimal performance. Dynamic hard negative sampling addresses this limitation by continuously adapting to the model's changing state throughout training. However, the high computational demands of this method restrict its applicability to certain model architectures. To overcome these challenges, we introduce efficient dynamic hard negative sampling (EDHNS). EDHNS improves efficiency by pre-filtering easily discriminable negatives, reducing the number of candidates the model must score during training. In addition, it excludes question-candidate pairs for which the model already exhibits high confidence from the loss computation, further reducing training time. These strategies preserve learning quality while minimizing computation and streamlining the training process. Extensive experiments on the DSTC9, DSTC10, Ubuntu, and E-commerce benchmarks show that EDHNS significantly outperforms baseline models, confirming its effectiveness in dialogue selection tasks.
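The two efficiency ideas in the abstract, dropping easy negatives before the expensive training pass and skipping already-confident pairs in the loss, can be sketched roughly as follows. This is a minimal illustration assuming a PyTorch-style scorer; names such as `score_fn`, `top_k`, and `confidence_threshold` are illustrative and not taken from the paper.

```python
# Hedged sketch of (1) pre-filtering easily discriminable negatives and
# (2) masking high-confidence pairs out of the loss. Not the authors' code.
import torch
import torch.nn.functional as F


def select_hard_negatives(score_fn, context, candidates, top_k=8):
    """Keep only the negatives the current model still finds hard.

    Easily discriminable (lowest-scoring) candidates are dropped with a cheap,
    gradient-free scoring pass, so fewer candidates need the full
    forward/backward computation during training.
    """
    with torch.no_grad():                       # cheap scoring pass only
        scores = score_fn(context, candidates)  # shape: (num_candidates,)
    hard_idx = torch.topk(scores, k=min(top_k, scores.numel())).indices
    return hard_idx


def masked_selection_loss(logits, labels, confidence_threshold=0.95):
    """Cross-entropy over (question, candidate) batches, skipping examples
    the model already classifies with high confidence."""
    probs = torch.softmax(logits, dim=-1)
    confident = probs[torch.arange(len(labels)), labels] > confidence_threshold
    keep = ~confident
    if keep.sum() == 0:                         # every example is easy; no loss
        return logits.new_zeros(())
    return F.cross_entropy(logits[keep], labels[keep])
```

In this sketch the hard-negative selection is re-run as training progresses, so the negative pool tracks the model's current state rather than staying static.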
Results from the Paper
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Conversational Response Selection | E-commerce | BERT-FP+EDHNS | R10@1 | 0.957 | # 1 |
| Conversational Response Selection | E-commerce | BERT-FP+EDHNS | R10@2 | 0.986 | # 1 |
| Conversational Response Selection | E-commerce | BERT-FP+EDHNS | R10@5 | 0.997 | # 1 |
| Conversational Response Selection | Ubuntu Dialogue (v1, Ranking) | BERT-FP+EDHNS | R10@1 | 0.917 | # 2 |
| Conversational Response Selection | Ubuntu Dialogue (v1, Ranking) | BERT-FP+EDHNS | R10@2 | 0.965 | # 1 |
| Conversational Response Selection | Ubuntu Dialogue (v1, Ranking) | BERT-FP+EDHNS | R10@5 | 0.994 | # 1 |
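The R10@k numbers above follow the standard retrieval-style evaluation for response selection: for each context the model ranks 10 candidates (one ground-truth response plus sampled negatives) and R10@k is the fraction of contexts whose ground truth lands in the top k. A minimal sketch of the metric, assuming score matrices in NumPy; the function name and shapes are illustrative only.

```python
# Hedged sketch of R10@k (recall at k out of 10 candidates per context).
import numpy as np


def recall_at_k(scores, positive_index, k):
    """scores: (num_contexts, 10) model scores; positive_index: (num_contexts,)
    index of the ground-truth response in each candidate list."""
    ranking = np.argsort(-scores, axis=1)              # best candidate first
    top_k = ranking[:, :k]
    hits = (top_k == positive_index[:, None]).any(axis=1)
    return hits.mean()


# Toy usage: random scores for 100 contexts where index 0 is the ground truth.
scores = np.random.rand(100, 10)
print(recall_at_k(scores, np.zeros(100, dtype=int), k=1))
```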