TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Sentiment Analysis	MR	VLAWE	Accuracy	93.3	# 1
Text Classification	MR	VLAWE	Accuracy	93.3	# 1
Multi-Label Text Classification	Reuters-21578	VLAWE	Micro-F1	89.3	# 6
Document Classification	Reuters-21578	VLAWE	F1	89.3	# 2
Subjectivity Analysis	SUBJ	VLAWE	Accuracy	95.0	# 6
Text Classification	TREC-6	VLAWE	Error	5.8	# 11

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/vector-of-locally-aggregated-word-embeddings/sentiment-analysis-on-mr)](https://paperswithcode.com/sota/sentiment-analysis-on-mr?p=vector-of-locally-aggregated-word-embeddings)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/vector-of-locally-aggregated-word-embeddings/text-classification-on-mr)](https://paperswithcode.com/sota/text-classification-on-mr?p=vector-of-locally-aggregated-word-embeddings)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/vector-of-locally-aggregated-word-embeddings/document-classification-on-reuters-21578)](https://paperswithcode.com/sota/document-classification-on-reuters-21578?p=vector-of-locally-aggregated-word-embeddings)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/vector-of-locally-aggregated-word-embeddings/multi-label-text-classification-on-reuters-1)](https://paperswithcode.com/sota/multi-label-text-classification-on-reuters-1?p=vector-of-locally-aggregated-word-embeddings)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/vector-of-locally-aggregated-word-embeddings/subjectivity-analysis-on-subj)](https://paperswithcode.com/sota/subjectivity-analysis-on-subj?p=vector-of-locally-aggregated-word-embeddings)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/vector-of-locally-aggregated-word-embeddings/text-classification-on-trec-6)](https://paperswithcode.com/sota/text-classification-on-trec-6?p=vector-of-locally-aggregated-word-embeddings)`

Vector of Locally-Aggregated Word Embeddings (VLAWE): A Novel Document-level Representation

NAACL 2019 · Radu Tudor Ionescu, Andrei M. Butnaru

In this paper, we propose a novel representation for text documents based on aggregating word embedding vectors into document embeddings. Our approach is inspired by the Vector of Locally-Aggregated Descriptors used for image representation, and it works as follows. First, the word embeddings gathered from a collection of documents are clustered by k-means in order to learn a codebook of semantically related word embeddings. Each word embedding is then associated with its nearest cluster centroid (codeword). The Vector of Locally-Aggregated Word Embeddings (VLAWE) representation of a document is then computed by accumulating the differences between each codeword vector and each word vector (from the document) associated with the respective codeword. We plug the VLAWE representation, which is learned in an unsupervised manner, into a classifier and show that it is useful for a diverse set of text classification tasks. We compare our approach with a broad range of recent state-of-the-art methods, demonstrating the effectiveness of our approach. Furthermore, we obtain a considerable improvement on the Movie Review data set, reporting an accuracy of 93.3%, which represents an absolute gain of 10% over the state-of-the-art approach. Our code is available at https://github.com/raduionescu/vlawe-boswe/.
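The three steps in the abstract (k-means codebook, nearest-codeword assignment, accumulation of differences) can be sketched roughly as follows. This is a minimal pure-Python illustration and not the authors' released code: the function names `kmeans`, `nearest` and `vlawe` are hypothetical, the residual sign (word vector minus codeword) is an assumption borrowed from the usual VLAD convention, and any normalization or classifier stage is left out.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, codebook):
    """Index of the codeword (centroid) closest to vec."""
    return min(range(len(codebook)), key=lambda j: sq_dist(vec, codebook[j]))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (the codebook)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct training vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def vlawe(doc_vectors, codebook):
    """Accumulate per-codeword residuals and flatten to a k*d vector.

    Assumption: residual = word vector - codeword (VLAD-style sign).
    """
    k, d = len(codebook), len(codebook[0])
    acc = [[0.0] * d for _ in range(k)]
    for v in doc_vectors:
        j = nearest(v, codebook)
        for i in range(d):
            acc[j][i] += v[i] - codebook[j][i]
    return [x for row in acc for x in row]
```

In use, `kmeans` would be run once over the word embeddings pooled from the whole collection, and `vlawe` would then be called per document with that document's word vectors. The result has length k times d regardless of document length, which is what lets it feed a standard classifier.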

Code

raduionescu/vlawe-boswe official

Tasks

Multi-Label Text Classification

Sentiment Analysis

Subjectivity Analysis

Text Classification

Word Embeddings

Datasets

Reuters-21578

MR

SUBJ

Results from the Paper

Ranked #1 on Sentiment Analysis on MR
