A Decomposable Attention Model for Natural Language Inference

We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural Language Inference (SNLI) dataset, we obtain state-of-the-art results with almost an order of magnitude fewer parameters than previous work and without relying on any word-order information. Adding intra-sentence attention that takes a minimum amount of order into account yields further improvements.
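The decomposition the abstract refers to is an attend/compare/aggregate pipeline: soft-align the two sentences with attention, compare each word against its aligned subphrase independently (which is what makes the method parallelizable and order-free), then sum and classify. Below is a minimal NumPy sketch of that pipeline under stated assumptions: `F`, `G`, and `H` stand in for the paper's feed-forward networks and are passed here as plain callables; shapes and names are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decomposable_attention(a, b, F, G, H):
    """Sketch of attend -> compare -> aggregate.

    a: (la, d) premise word embeddings
    b: (lb, d) hypothesis word embeddings
    F, G, H: feed-forward networks (plain callables in this sketch)
    """
    # Attend: unnormalized alignment scores e_ij = F(a_i)^T F(b_j).
    e = F(a) @ F(b).T                            # (la, lb)
    beta = softmax(e, axis=1) @ b                # subphrase of b aligned to each a_i
    alpha = softmax(e.T, axis=1) @ a             # subphrase of a aligned to each b_j
    # Compare: each (word, aligned subphrase) pair is processed
    # separately, so this step is trivially parallelizable.
    v1 = G(np.concatenate([a, beta], axis=1))    # (la, h)
    v2 = G(np.concatenate([b, alpha], axis=1))   # (lb, h)
    # Aggregate: order-insensitive sum over positions, then classify.
    return H(np.concatenate([v1.sum(axis=0), v2.sum(axis=0)]))
```

Because the compare step looks at each aligned pair in isolation and the aggregate step is a sum, no word-order information enters the model; the intra-sentence attention variant reintroduces a minimal amount of order before this pipeline runs.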

EMNLP 2016

Datasets

SNLI (Stanford Natural Language Inference)

Results from the Paper


| Task                       | Dataset | Model                                                       | Metric           | Value | Global Rank |
|----------------------------|---------|-------------------------------------------------------------|------------------|-------|-------------|
| Natural Language Inference | SNLI    | 200D decomposable attention model with intra-sentence attention | % Test Accuracy  | 86.8  | # 48        |
| Natural Language Inference | SNLI    | 200D decomposable attention model with intra-sentence attention | % Train Accuracy | 90.5  | # 44        |
| Natural Language Inference | SNLI    | 200D decomposable attention model with intra-sentence attention | Parameters       | 580k  | # 4         |
| Natural Language Inference | SNLI    | 200D decomposable attention model                           | % Test Accuracy  | 86.3  | # 56        |
| Natural Language Inference | SNLI    | 200D decomposable attention model                           | % Train Accuracy | 89.5  | # 50        |
| Natural Language Inference | SNLI    | 200D decomposable attention model                           | Parameters       | 380k  | # 4         |

Methods


No methods listed for this paper.