Weighted Neural Bag-of-n-grams Model: New Baselines for Text Classification

COLING 2016 · Bofang Li, Zhe Zhao, Tao Liu, Puwei Wang, Xiaoyong Du

NBSVM is one of the most popular methods for text classification and has been widely used as a baseline for various text representation approaches. It uses Naive Bayes (NB) features to weight a sparse bag-of-n-grams representation: n-grams capture word order in short contexts, and NB weighting assigns larger weights to important words. However, NBSVM suffers from sparsity and has been reported to be outperformed by recently proposed distributed (dense) text representations learned by neural networks. In this paper, we transfer n-grams and NB weighting to neural models. We train n-gram embeddings and use NB weighting to guide the neural models to focus on important words. Our methods can thus be viewed as distributed (dense) counterparts of the sparse bag-of-n-grams in NBSVM. We find that n-grams and NB weighting are also effective in distributed representations. As a result, our models achieve new strong baselines on 9 text classification datasets; for example, on the IMDB dataset we reach 93.5% accuracy, exceeding previous state-of-the-art results obtained by deep neural models. All source code is publicly available at https://github.com/zhezhaoa/neural_BOW_toolkit.
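The abstract's core idea, combining NB log-count-ratio weights with n-gram embeddings, can be illustrated with a minimal sketch. This is not the authors' toolkit code; the function names (`nb_weights`, `weighted_doc_embedding`), the smoothing constant, and the use of the weight magnitude to scale an embedding average are illustrative assumptions.

```python
# Hedged sketch: NB log-count ratios per n-gram, then a weighted average of
# n-gram embeddings so the document vector is dominated by discriminative n-grams.
import numpy as np

def nb_weights(counts_pos, counts_neg, alpha=1.0):
    """NB log-count ratio per n-gram: log of smoothed, normalized class counts."""
    p = counts_pos + alpha
    q = counts_neg + alpha
    return np.log((p / p.sum()) / (q / q.sum()))

def weighted_doc_embedding(ngram_ids, embeddings, r):
    """Average a document's n-gram embeddings, each scaled by |NB weight|."""
    w = np.abs(r[ngram_ids])                      # emphasis on discriminative n-grams
    vecs = embeddings[ngram_ids] * w[:, None]     # scale each n-gram vector
    return vecs.sum(axis=0) / max(w.sum(), 1e-8)  # weighted mean

# Toy usage: vocabulary of 5 n-grams, 8-dimensional embeddings (hypothetical numbers).
rng = np.random.default_rng(0)
counts_pos = np.array([10, 2, 0, 5, 1], dtype=float)   # n-gram counts in positive docs
counts_neg = np.array([1, 8, 4, 5, 0], dtype=float)    # n-gram counts in negative docs
r = nb_weights(counts_pos, counts_neg)
E = rng.normal(size=(5, 8))                            # n-gram embedding table
doc = np.array([0, 3, 1])                              # a document as n-gram ids
print(weighted_doc_embedding(doc, E, r).shape)         # -> (8,)
```

In the paper itself the embeddings are trained by a neural model rather than drawn at random; the sketch only shows how an NB-style weight can steer the pooling step toward important n-grams.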

