Many deep learning architectures have been proposed to model compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has been no rigorous evaluation of the added value of sophisticated compositional functions.
In this paper, we conduct a point-by-point comparative study of Simple
Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling
operations, against word-embedding-based RNN/CNN models. Surprisingly,
SWEMs exhibit comparable or even superior performance in the majority of cases
considered. Based upon this understanding, we propose two additional pooling
strategies over learned word embeddings: (i) a max-pooling operation for
improved interpretability; and (ii) a hierarchical pooling operation, which
preserves spatial (n-gram) information within text sequences. We present
experiments on 17 datasets encompassing three tasks: (i) (long) document
classification; (ii) text sequence matching; and (iii) short text tasks,
including classification and tagging. The source code and datasets can be
obtained from https://github.com/dinghanshen/SWEM.
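
The pooling operations described above can be summarized in a minimal NumPy sketch: average pooling, max pooling over embedding dimensions, and hierarchical pooling (local window averages followed by a max). The function names, window size, and embedding dimensions below are illustrative assumptions, not the authors' reference implementation; see the linked repository for the actual code.

```python
# Minimal sketch of SWEM-style pooling over word embeddings (assumed shapes/names).
import numpy as np

def swem_aver(emb):
    """Average-pool a (seq_len, dim) matrix of word embeddings."""
    return emb.mean(axis=0)

def swem_max(emb):
    """Max-pool each embedding dimension over the sequence (improves interpretability)."""
    return emb.max(axis=0)

def swem_hier(emb, window=3):
    """Hierarchical pooling: average within local n-gram windows (stride 1),
    then max-pool over the window representations to retain spatial information."""
    seq_len, _ = emb.shape
    window = min(window, seq_len)
    windows = np.stack([emb[i:i + window].mean(axis=0)
                        for i in range(seq_len - window + 1)])
    return windows.max(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.standard_normal((7, 300))  # 7 tokens, 300-d embeddings (hypothetical)
    print(swem_aver(emb).shape, swem_max(emb).shape, swem_hier(emb).shape)
```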