Word Embeddings
1096 papers with code • 0 benchmarks • 52 datasets
Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.
Techniques for learning word embeddings include Word2Vec, GloVe, and neural network-based approaches that train on an NLP task such as language modeling or document classification.
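As a minimal illustration, the sketch below trains skip-gram Word2Vec embeddings with gensim on a toy corpus; the corpus and hyperparameters are placeholder assumptions, since real embeddings are trained on corpora with millions of tokens.

```python
# A minimal sketch of training word embeddings with gensim's Word2Vec
# (hypothetical toy corpus; real training needs much more data).
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["dogs", "and", "cats", "are", "animals"],
]

model = Word2Vec(
    sentences,
    vector_size=100,   # dimensionality of the embedding vectors
    window=5,          # context window size
    min_count=1,       # keep all words in this toy corpus
    sg=1,              # 1 = skip-gram, 0 = CBOW
)

vec = model.wv["king"]                 # a 100-dimensional real-valued vector
print(model.wv.most_similar("king"))   # nearest neighbours in embedding space
```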
(Image credit: Dynamic Word Embedding for Evolving Semantic Discovery)
Benchmarks
These leaderboards are used to track progress in Word Embeddings.
Most implemented papers
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models
We investigate the task of building open-domain conversational dialogue systems based on large dialogue corpora using generative models.
emoji2vec: Learning Emoji Representations from their Description
Many current natural language processing applications for social media rely on representation learning and utilize pre-trained word embeddings.
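To make the idea concrete, here is a rough numpy sketch of the emoji2vec recipe: an emoji vector is trained, in the space of pre-trained word embeddings, to score highly against the summed word vectors of its description. The random vectors, the description, and the update rule below are illustrative assumptions, not the paper's exact setup.

```python
# A rough sketch of learning an emoji embedding from its textual description,
# in the same space as pre-trained word vectors (all values illustrative).
import numpy as np

rng = np.random.default_rng(0)
dim = 300
word_vecs = {w: rng.normal(size=dim)
             for w in ["face", "with", "tears", "of", "joy", "sad"]}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

emoji_vec = rng.normal(size=dim) * 0.01           # trainable emoji embedding
positive = sum(word_vecs[w] for w in ["face", "with", "tears", "of", "joy"])
negative = word_vecs["sad"]                       # a sampled non-description

lr = 0.1
for _ in range(100):
    # gradient ascent on log sigma(e . d_pos) + log sigma(-e . d_neg)
    emoji_vec += lr * (1 - sigmoid(emoji_vec @ positive)) * positive
    emoji_vec -= lr * sigmoid(emoji_vec @ negative) * negative
```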
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Since the introduction of the transformer model by Vaswani et al. (2017), a fundamental question has yet to be answered: how does a model achieve extrapolation at inference time for sequences that are longer than it saw during training?
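The paper's answer, ALiBi, drops positional embeddings entirely and instead adds a fixed, head-specific linear penalty to attention scores that grows with query-key distance. A minimal sketch of that bias computation, under assumed shapes:

```python
# A minimal sketch of the ALiBi bias: attention scores get a head-specific
# linear penalty proportional to how far each key lies behind each query,
# which is what enables extrapolation to longer sequences.
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # Head-specific slopes form a geometric sequence: 2^-(8/n), 2^-(16/n), ...
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / num_heads)
                           for h in range(num_heads)])
    pos = torch.arange(seq_len)
    relative = pos[None, :] - pos[:, None]           # (q, k): negative for past keys
    return slopes[:, None, None] * relative[None]    # (heads, q, k) penalty

# usage inside causal attention (mask applied as usual):
# scores = q @ k.transpose(-2, -1) / d_head ** 0.5 + alibi_bias(h, n)
# scores = scores.masked_fill(causal_mask, float("-inf"))
```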
ConceptNet 5.5: An Open Multilingual Graph of General Knowledge
ConceptNet is designed to represent the general knowledge involved in understanding language, improving natural language applications by allowing them to better understand the meanings behind the words people use.
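ConceptNet 5.5 exposes its graph through a public REST API at api.conceptnet.io. A small example of fetching edges for a concept; the JSON field names below follow the API's documented format, but treat them as assumptions if the service changes.

```python
# Query the public ConceptNet 5.5 API for a few edges about a concept.
import requests

resp = requests.get("http://api.conceptnet.io/c/en/language").json()
for edge in resp.get("edges", [])[:5]:
    print(edge["start"]["label"], "--", edge["rel"]["label"], "->", edge["end"]["label"])
```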
BB_twtr at SemEval-2017 Task 4: Twitter Sentiment Analysis with CNNs and LSTMs
In this paper, we describe our attempt at producing a state-of-the-art Twitter sentiment classifier using Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) networks.
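The sketch below shows the general shape of such a model in Keras: pre-trained word embeddings feeding a convolutional layer and an LSTM, ending in a sentiment softmax. Layer sizes and the three-class output are placeholder assumptions, not the system's reported configuration.

```python
# An illustrative CNN + LSTM sentiment classifier over word embeddings.
from tensorflow.keras import layers, models

vocab_size, embed_dim, max_len = 20_000, 300, 50

model = models.Sequential([
    layers.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, embed_dim),   # initialized from pre-trained vectors in practice
    layers.Conv1D(128, 3, activation="relu"),  # n-gram feature detectors
    layers.MaxPooling1D(2),
    layers.LSTM(128),                          # sequence modelling over conv features
    layers.Dense(3, activation="softmax"),     # negative / neutral / positive
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```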
Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks
Selecting optimal parameters for a neural network architecture can often make the difference between mediocre and state-of-the-art performance.
word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method
The word2vec software of Tomas Mikolov and colleagues (https://code.google.com/p/word2vec/) has gained a lot of traction lately, and provides state-of-the-art word embeddings.
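The objective the paper derives, skip-gram with negative sampling (SGNS), maximizes log σ(v_w · v_c) for each observed word-context pair and log σ(-v_w · v_n) for k sampled negative contexts. A compact numpy sketch of that loss, with assumed array shapes:

```python
# Skip-gram negative-sampling (SGNS) loss for one (word, context) pair.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_loss(v_w, v_c, negatives):
    # v_w: word vector, v_c: observed context vector,
    # negatives: (k, dim) array of sampled negative context vectors
    pos = np.log(sigmoid(v_w @ v_c))
    neg = np.sum(np.log(sigmoid(-negatives @ v_w)))
    return -(pos + neg)   # minimized during training
```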
Document Embedding with Paragraph Vectors
Paragraph Vectors was recently proposed as an unsupervised method for learning distributed representations of pieces of text.
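A minimal sketch using gensim's Doc2Vec implementation of Paragraph Vectors, trained on a hypothetical toy corpus:

```python
# Paragraph Vectors via gensim's Doc2Vec (toy corpus, illustrative settings).
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = [
    TaggedDocument(words=["word", "embeddings", "map", "words", "to", "vectors"], tags=[0]),
    TaggedDocument(words=["paragraph", "vectors", "embed", "whole", "documents"], tags=[1]),
]

model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=40)
doc_vec = model.infer_vector(["embedding", "a", "new", "document"])  # 50-d vector
```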
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec
Distributed dense word vectors have been shown to be effective at capturing token-level semantic and syntactic regularities in language, while topic models can form interpretable representations over documents.
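lda2vec combines the two by composing each context as a word vector plus a document vector, where the document vector is itself a mixture over interpretable topic vectors. A rough numpy sketch of that composition; the shapes and random values are illustrative stand-ins for trained parameters.

```python
# The lda2vec composition: context = word vector + document vector,
# with the document vector a softmax mixture over topic embeddings.
import numpy as np

rng = np.random.default_rng(0)
dim, n_topics = 300, 20

topic_vecs = rng.normal(size=(n_topics, dim))   # trainable topic embeddings
doc_weights = rng.normal(size=n_topics)         # trainable per-document logits

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

doc_proportions = softmax(doc_weights)          # interpretable topic mixture
doc_vec = doc_proportions @ topic_vecs          # document vector in word space
word_vec = rng.normal(size=dim)                 # a pivot word's embedding
context_vec = word_vec + doc_vec                # used to predict nearby words (SGNS)
```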
Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features
The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question of whether similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well.
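The paper's Sent2Vec model composes a sentence embedding as the average of learned embeddings for the sentence's unigrams and n-grams. A simplified numpy sketch, where the random lookup table stands in for trained vectors:

```python
# Sentence embedding as the mean of unigram and bigram embeddings.
import numpy as np

rng = np.random.default_rng(0)
dim = 100
tokens = ["word", "embeddings", "compose", "well"]
bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]

emb = {ng: rng.normal(size=dim) for ng in tokens + bigrams}  # learned lookup table
sentence_vec = np.mean([emb[ng] for ng in tokens + bigrams], axis=0)
```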