Continuous Bag-of-Words Word2Vec

Introduced by Mikolov et al. in Efficient Estimation of Word Representations in Vector Space

Continuous Bag-of-Words (CBOW) Word2Vec is an architecture for learning word embeddings that predicts each word from the $n$ words preceding it and the $n$ words following it. Training maximizes the average log-probability of each word given its context, so the objective function for CBOW is:

$$ J_\theta = \frac{1}{T}\sum^{T}_{t=1}\log{p}\left(w_{t}\mid{w}_{t-n},\ldots,w_{t-1}, w_{t+1},\ldots,w_{t+n}\right) $$
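The objective above can be evaluated directly on a toy corpus. The following is a minimal NumPy sketch, not the reference implementation: it assumes the context words are represented by the average of their input vectors and that $p$ is a full softmax over the vocabulary (the original paper uses a hierarchical softmax for efficiency). The names `cbow_objective`, `W_in`, and `W_out` are hypothetical, and windows are simply truncated at the ends of the sequence.

```python
import numpy as np

def cbow_objective(token_ids, W_in, W_out, n):
    """Average log-probability of each word given its (up to) 2n neighbours.

    Assumptions for illustration: averaged context vectors, full softmax.
    """
    T = len(token_ids)
    total = 0.0
    for t in range(T):
        # Context positions t-n..t+n, excluding t, truncated at the sequence edges.
        ctx = [token_ids[j] for j in range(max(0, t - n), min(T, t + n + 1)) if j != t]
        h = W_in[ctx].mean(axis=0)      # averaged context representation
        scores = W_out @ h              # one score per vocabulary word
        scores = scores - scores.max()  # shift for numerical stability
        log_probs = scores - np.log(np.exp(scores).sum())
        total += log_probs[token_ids[t]]
    return total / T

rng = np.random.default_rng(0)
vocab_size, dim = 6, 4
W_in = rng.normal(scale=0.1, size=(vocab_size, dim))   # input (context) embeddings
W_out = rng.normal(scale=0.1, size=(vocab_size, dim))  # output (target) embeddings
tokens = [0, 1, 2, 3, 4, 5, 1, 2]

J = cbow_objective(tokens, W_in, W_out, n=2)
print(J)
```

Because $J_\theta$ is an average of log-probabilities, it is never positive; with all-zero weights every word is equally likely and the value reduces to $-\log$ of the vocabulary size.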

In the CBOW model, the distributed representations of the context words are used to predict the word in the middle of the window. This contrasts with Skip-gram Word2Vec, where the distributed representation of the input word is used to predict the context.
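The difference in prediction direction is easiest to see in the training examples each model would draw from the same sentence. The sketch below is illustrative only: `cbow_pairs` and `skipgram_pairs` are hypothetical helper names, and windows are truncated at sentence boundaries, which is an assumption rather than something the source specifies.

```python
def cbow_pairs(tokens, n):
    """(context words, centre word): many inputs, one target."""
    pairs = []
    for t, centre in enumerate(tokens):
        context = tokens[max(0, t - n):t] + tokens[t + 1:t + 1 + n]
        pairs.append((context, centre))
    return pairs

def skipgram_pairs(tokens, n):
    """(centre word, context word): one input, one target per neighbour."""
    pairs = []
    for t, centre in enumerate(tokens):
        context = tokens[max(0, t - n):t] + tokens[t + 1:t + 1 + n]
        pairs.extend((centre, c) for c in context)
    return pairs

sentence = ["the", "cat", "sat", "on", "the", "mat"]
cbow = cbow_pairs(sentence, n=1)
skip = skipgram_pairs(sentence, n=1)

print(cbow[2])   # (['cat', 'on'], 'sat')
print(skip[:3])  # [('the', 'cat'), ('cat', 'the'), ('cat', 'sat')]
```

CBOW yields one example per position, with the whole window as input, while Skip-gram yields one example per (centre, neighbour) pair.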

Source: Efficient Estimation of Word Representations in Vector Space

Tasks

Task	Papers	Share
Dependency Parsing	1	11.11%
Lemmatization	1	11.11%
NER	1	11.11%
Information Retrieval	1	11.11%
Retrieval	1	11.11%
Specificity	1	11.11%
Sentence	1	11.11%
Natural Language Inference	1	11.11%
Word Similarity	1	11.11%

Categories

Word Embeddings

Static Word Embeddings