Skip-gram Language Modeling Using Sparse Non-negative Matrix Probability Estimation

3 Dec 2014 · Noam Shazeer, Joris Pelemans, Ciprian Chelba

We present a novel family of language model (LM) estimation techniques named Sparse Non-negative Matrix (SNM) estimation. A first set of experiments empirically evaluating it on the One Billion Word Benchmark shows that SNM $n$-gram LMs perform almost as well as the well-established Kneser-Ney (KN) models. When using skip-gram features the models are able to match the state-of-the-art recurrent neural network (RNN) LMs; combining the two modeling techniques yields the best known result on the benchmark. The computational advantages of SNM over both maximum entropy and RNN LM estimation are probably its main strength, promising an approach that has the same flexibility in combining arbitrary features effectively and yet should scale to very large amounts of data as gracefully as $n$-gram LMs do.
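As a rough illustration of the skip-gram features mentioned in the abstract, the sketch below enumerates context features for one target word. It assumes a feature is a run of consecutive context words plus the number of words skipped between that run and the target, with zero skips giving an ordinary n-gram context. The function name `context_features` and the parameters `max_len` and `max_skip` are hypothetical, and the feature definition used in the paper may differ.

```python
from typing import List, Tuple

# A feature is (context words, number of words skipped before the target).
# This representation is an assumption made for illustration only.
Feature = Tuple[Tuple[str, ...], int]


def context_features(
    tokens: List[str], i: int, max_len: int = 2, max_skip: int = 1
) -> List[Feature]:
    """Enumerate n-gram (skip = 0) and skip-gram (skip > 0) contexts for tokens[i]."""
    feats: List[Feature] = []
    for skip in range(max_skip + 1):
        end = i - skip  # exclusive end of the context span
        for length in range(1, max_len + 1):
            start = end - length
            if start < 0:
                break  # not enough history for this span
            feats.append((tuple(tokens[start:end]), skip))
    return feats


sentence = "the cat sat on the mat".split()
features = context_features(sentence, i=5)  # target word: "mat"
for words, skip in features:
    print(skip, words)
```

With `skip = 0` the features are the usual preceding unigram and bigram contexts; with `skip = 1` the same spans are taken one word further back, so the word adjacent to the target is ignored.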

Code

No code implementations yet.

Tasks

Language Modelling

Datasets

One Billion Word Benchmark

Results from the Paper

Ranked #23 on Language Modelling on One Billion Word

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Language Modelling	One Billion Word	Sparse Non-Negative	PPL	52.9	#23
Language Modelling	One Billion Word	Sparse Non-Negative	Number of params	33B	#1
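
For readers unfamiliar with the PPL metric in the table, perplexity is the exponential of the average negative log-probability a model assigns to each word of the test text, so lower is better. The sketch below computes it from a list of per-word probabilities. The function name `perplexity` and the probabilities used are illustrative assumptions, not values from the paper.

```python
import math
from typing import Sequence


def perplexity(word_probs: Sequence[float]) -> float:
    """Perplexity = exp(-(1/N) * sum(log p_i)) over the N predicted words."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    avg_neg_log_prob = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(avg_neg_log_prob)


# Made-up example: a model that gives every word probability 1/4
# is as uncertain as a uniform choice among 4 words.
uniform_ppl = perplexity([0.25, 0.25, 0.25, 0.25])
print(round(uniform_ppl, 6))
```

Under this definition, the reported PPL of 52.9 means the model is, on average, about as uncertain as a uniform choice among roughly 53 words at each position.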

Methods

No methods listed for this paper.
