Improved Sentence Modeling using Suffix Bidirectional LSTM

18 May 2018 · Siddhartha Brahma

Recurrent neural networks have become ubiquitous in computing representations of sequential data, especially textual data in natural language processing. In particular, Bidirectional LSTMs (BiLSTMs) are at the heart of several neural models achieving state-of-the-art performance across a wide variety of NLP tasks. However, BiLSTMs are known to suffer from sequential bias: the contextual representation of a token is heavily influenced by the tokens close to it in a sentence. We propose a general and effective improvement to the BiLSTM model which encodes each suffix and prefix of a sequence of tokens in both forward and reverse directions. We call our model Suffix Bidirectional LSTM, or SuBiLSTM. This introduces an alternate bias that favors long-range dependencies. We apply SuBiLSTMs to several tasks that require sentence modeling. We demonstrate that using a SuBiLSTM instead of a BiLSTM in existing models improves performance in learning general sentence representations, text classification, textual entailment and paraphrase detection. Using SuBiLSTM, we achieve new state-of-the-art results for fine-grained sentiment classification and question classification.
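
The prefix/suffix encoding described in the abstract can be illustrated with a small sketch. This is not the paper's implementation. The `encode` function below is a hypothetical scalar tanh recurrence standing in for an LSTM, the names `encode` and `suffix_bidirectional` are invented for illustration, and the four encodings are simply returned as a tuple, since the abstract does not say how they are combined. The sketch only shows which four sequences are encoded for each token position.

```python
import math


def encode(seq, w=0.5, u=1.0):
    """Toy stand-in for an LSTM (assumption): final state of a scalar tanh recurrence."""
    h = 0.0
    for x in seq:
        h = math.tanh(w * h + u * x)
    return h


def suffix_bidirectional(tokens, encoder=encode):
    """For each position i, encode the prefix ending at i and the suffix
    starting at i, each read in both forward and reverse order."""
    reps = []
    for i in range(len(tokens)):
        prefix = tokens[: i + 1]  # x_1 .. x_i
        suffix = tokens[i:]       # x_i .. x_n
        reps.append((
            encoder(prefix),        # prefix, forward: ends at x_i
            encoder(prefix[::-1]),  # prefix, reverse: ends at x_1
            encoder(suffix),        # suffix, forward: ends at x_n
            encoder(suffix[::-1]),  # suffix, reverse: ends at x_i
        ))
    return reps


# Toy scalar "embeddings" for a four-token sentence (made-up values).
tokens = [0.1, -0.4, 0.7, 0.2]
reps = suffix_bidirectional(tokens)
```

The first and last entries of each tuple correspond to what a standard BiLSTM computes at position i, where the last token read is x_i itself. The other two end at the far ends of the sentence, which is one way to read the abstract's claim of a bias toward long-range dependencies. This naive version re-encodes every prefix and suffix from scratch and uses one shared encoder for all four passes, so it is meant only as an illustration.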


Results from the Paper

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Sentiment Analysis | CR | SuBiLSTM-Tied | Accuracy | 86.5 | #5 |
| Sentiment Analysis | MR | SuBiLSTM-Tied | Accuracy | 81.6 | #7 |
| Sentiment Analysis | SST-2 Binary classification | Suffix BiLSTM | Accuracy | 91.2 | #51 |
| Sentiment Analysis | SST-5 Fine-grained classification | BCN+Suffix BiLSTM-Tied+CoVe | Accuracy | 56.2 | #4 |