Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks. The only underlying LSTM structure that has been explored so far is a linear chain. However, natural language exhibits syntactic properties that would naturally combine words into phrases. We introduce the Tree-LSTM, a generalization of LSTMs to tree-structured network topologies. Tree-LSTMs outperform all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task 1) and sentiment classification (Stanford Sentiment Treebank).
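
The abstract does not spell out the update rule, so the following is only a minimal sketch of how an LSTM unit might be generalized from a chain to a tree, assuming a child-sum style composition in which a node sums its children's hidden states and keeps one forget gate per child. The class name `ChildSumCell`, the helper `encode`, the toy tree, the dimensions, and the random weights are all hypothetical and chosen for illustration; nothing here is trained or taken from the authors' code.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ChildSumCell:
    """Hypothetical tree-structured LSTM cell (child-sum style composition)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden_dim = hidden_dim
        gates = "ifou"  # input, forget, output, candidate update
        # Random, untrained weights: for illustration only.
        self.W = {g: rng.normal(0.0, 0.1, (hidden_dim, input_dim)) for g in gates}
        self.U = {g: rng.normal(0.0, 0.1, (hidden_dim, hidden_dim)) for g in gates}
        self.b = {g: np.zeros(hidden_dim) for g in gates}

    def _gate(self, name, x, h):
        # Affine pre-activation shared by all four gates.
        return self.W[name] @ x + self.U[name] @ h + self.b[name]

    def __call__(self, x, children):
        """x: input vector; children: list of (h_k, c_k) child states."""
        # Sum the hidden states of all children (zero vector for a leaf).
        h_sum = np.zeros(self.hidden_dim)
        for h_k, _ in children:
            h_sum = h_sum + h_k
        i = sigmoid(self._gate("i", x, h_sum))
        o = sigmoid(self._gate("o", x, h_sum))
        u = np.tanh(self._gate("u", x, h_sum))
        c = i * u
        # One forget gate per child, computed from that child's own hidden state.
        for h_k, c_k in children:
            f_k = sigmoid(self._gate("f", x, h_k))
            c = c + f_k * c_k
        h = o * np.tanh(c)
        return h, c


def encode(tree, cell, embeddings):
    """Recursively encode a tree given as (word, [subtrees]); returns (h, c)."""
    word, subtrees = tree
    child_states = [encode(t, cell, embeddings) for t in subtrees]
    return cell(embeddings[word], child_states)


# Toy example: made-up embeddings and a made-up dependency-like tree.
rng = np.random.default_rng(1)
embeddings = {w: rng.normal(size=3) for w in ["the", "cat", "sat", "quietly"]}
cell = ChildSumCell(input_dim=3, hidden_dim=4)
tree = ("sat", [("cat", [("the", [])]), ("quietly", [])])
root_h, root_c = encode(tree, cell, embeddings)
```

In this sketch a linear chain is the special case where every node has exactly one child, which recovers the usual sequential LSTM recurrence. Because the children are summed, the result does not depend on the order in which they are listed.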

Published in: IJCNLP 2015

Results from the Paper

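| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Semantic Similarity | SICK | LSTM (Tai et al., 2015) | MSE | 0.2831 | #4 |
| Semantic Similarity | SICK | LSTM (Tai et al., 2015) | Pearson Correlation | 0.8528 | #4 |
| Semantic Similarity | SICK | LSTM (Tai et al., 2015) | Spearman Correlation | 0.7911 | #4 |
| Semantic Similarity | SICK | Bidirectional LSTM (Tai et al., 2015) | MSE | 0.2736 | #3 |
| Semantic Similarity | SICK | Bidirectional LSTM (Tai et al., 2015) | Pearson Correlation | 0.8567 | #3 |
| Semantic Similarity | SICK | Bidirectional LSTM (Tai et al., 2015) | Spearman Correlation | 0.7966 | #2 |
| Semantic Similarity | SICK | Dependency Tree-LSTM (Tai et al., 2015) | MSE | 0.2532 | #1 |
| Semantic Similarity | SICK | Dependency Tree-LSTM (Tai et al., 2015) | Pearson Correlation | 0.8676 | #1 |
| Semantic Similarity | SICK | Dependency Tree-LSTM (Tai et al., 2015) | Spearman Correlation | 0.8083 | #1 |
| Natural Language Inference | SNLI | CT-LSTM (Tai et al., 2015) | % Test Accuracy | 88.0 | #39 |
| Natural Language Inference | SNLI | LSTM (Tai et al., 2015) | % Test Accuracy | 84.9 | #75 |
| Natural Language Inference | SNLI | 2-layer LSTM (Tai et al., 2015) | % Test Accuracy | 86.3 | #56 |
| Sentiment Analysis | SST-2 Binary classification | 2-layer LSTM (Tai et al., 2015) | Accuracy | 86.3 | #77 |
| Sentiment Analysis | SST-2 Binary classification | CT-LSTM (Tai et al., 2015) | Accuracy | 88.0 | #68 |
| Sentiment Analysis | SST-5 Fine-grained classification | Constituency Tree-LSTM | Accuracy | 51.0 | #16 |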
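
The SICK results are reported as MSE, Pearson correlation, and Spearman correlation between predicted and gold relatedness scores. As a rough illustration of what those three numbers measure, here is a small standard-library sketch. The function names and the toy `gold` and `pred` scores are made up for this example and are not data from the paper; the paper's exact evaluation script is not shown in this section.

```python
import math


def mse(pred, gold):
    """Mean squared error between predictions and gold scores."""
    return sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(gold)


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up relatedness scores, for illustration only.
gold = [1.0, 2.5, 3.0, 4.2, 5.0]
pred = [1.2, 2.0, 3.5, 4.0, 4.8]

print("MSE:", mse(pred, gold))
print("Pearson:", pearson(pred, gold))
print("Spearman:", spearman(pred, gold))
```

Lower MSE is better, while higher Pearson and Spearman values are better. Spearman only looks at the ordering of the scores, so a model can rank sentence pairs perfectly and still have a nonzero MSE.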