Word Embeddings

Continuous Bag-of-Words Word2Vec

Introduced by Mikolov et al. in Efficient Estimation of Word Representations in Vector Space

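Continuous Bag-of-Words (CBOW) Word2Vec is an architecture for learning word embeddings that predicts the current word $w_t$ from the $n$ words that precede it and the $n$ words that follow it. The objective function for CBOW is the average log-probability of each word given its context, taken over the $T$ words of the training corpus: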

$$ J_\theta = \frac{1}{T}\sum^{T}_{t=1}\log{p}\left(w_{t}\mid{w}_{t-n},\ldots,w_{t-1}, w_{t+1},\ldots,w_{t+n}\right) $$
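The objective can be evaluated directly on a toy corpus. The following is a minimal sketch, not the paper's training code: it assumes the context is summarized by averaging the context words' input vectors, scores the vocabulary with a plain softmax (a simplification chosen for readability), and truncates the window at the edges of the token list. The names `W_in`, `W_out`, `cbow_log_prob`, and `cbow_objective` are illustrative.

```python
import math
import random

def cbow_log_prob(center, context, W_in, W_out):
    """log p(center | context), using averaged context vectors and a full softmax."""
    dim = len(W_in[0])
    # Average the input embeddings of the context words (word order is ignored).
    h = [sum(W_in[c][k] for c in context) / len(context) for k in range(dim)]
    # Score every vocabulary word against the averaged context vector.
    scores = [sum(h[k] * W_out[v][k] for k in range(dim)) for v in range(len(W_out))]
    # Numerically stable log-softmax.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[center] - log_z

def cbow_objective(tokens, n, W_in, W_out):
    """Average log-probability J_theta over all positions t in the token list."""
    total = 0.0
    for t, center in enumerate(tokens):
        # Up to n past and n future words; the window is truncated at the edges
        # (an assumption of this sketch).
        context = tokens[max(0, t - n):t] + tokens[t + 1:t + n + 1]
        total += cbow_log_prob(center, context, W_in, W_out)
    return total / len(tokens)

# Toy usage: word ids 0..4, randomly initialized embeddings (hypothetical sizes).
random.seed(0)
vocab_size, dim = 5, 3
W_in = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(vocab_size)]
W_out = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(vocab_size)]
tokens = [0, 1, 2, 3, 4, 1, 2]
print(cbow_objective(tokens, 2, W_in, W_out))
```

Training would adjust `W_in` and `W_out` to increase this value; with all-zero weights every word is equally likely and the objective equals $-\log V$ for a vocabulary of size $V$.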

In the CBOW model, the distributed representations of the context words are used to predict the word in the middle of the window. This contrasts with Skip-gram Word2Vec, where the distributed representation of the input word is used to predict the context.
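The difference between the two models shows up in how training examples are built from the same window. The sketch below is illustrative only: the helper names `cbow_pairs` and `skipgram_pairs` are hypothetical, and truncating the window at sentence edges is an assumption.

```python
def cbow_pairs(tokens, n):
    """CBOW examples: (list of context words, target word in the middle)."""
    pairs = []
    for t, target in enumerate(tokens):
        context = tokens[max(0, t - n):t] + tokens[t + 1:t + n + 1]
        if context:  # skip positions with no neighbours (single-token input)
            pairs.append((context, target))
    return pairs

def skipgram_pairs(tokens, n):
    """Skip-gram examples: (input word, one context word) for each neighbour."""
    pairs = []
    for t, center in enumerate(tokens):
        for neighbour in tokens[max(0, t - n):t] + tokens[t + 1:t + n + 1]:
            pairs.append((center, neighbour))
    return pairs

sentence = ["the", "cat", "sat", "on", "the", "mat"]
print(cbow_pairs(sentence, 1)[2])       # (['cat', 'on'], 'sat')
print(skipgram_pairs(sentence, 1)[2:4])  # [('cat', 'sat'), ('sat', 'cat')]
```

CBOW produces one example per position, with several context words as input and one word as output; Skip-gram produces one example per (word, neighbour) combination, with a single word as input.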

Source: Efficient Estimation of Word Representations in Vector Space




| Task | Papers | Share |
|---|---|---|
| Dependency Parsing | 1 | 10.00% |
| Lemmatization | 1 | 10.00% |
| NER | 1 | 10.00% |
| Information Retrieval | 1 | 10.00% |
| Retrieval | 1 | 10.00% |
| Specificity | 1 | 10.00% |
| Sentence | 1 | 10.00% |
| Natural Language Inference | 1 | 10.00% |
| Test | 1 | 10.00% |

