CELEX database comprises three different searchable lexical databases, Dutch, English and German. The lexical data contained in each database is divided into five categories: orthography, phonology, morphology, syntax (word class) and word frequency.
57 PAPERS • NO BENCHMARKS YET
Wikipedia Title is a dataset for learning character-level compositionality from the character visual characteristics. It consists of a collection of Wikipedia titles in Chinese, Japanese or Korean labelled with the category to which the article belongs.
3 PAPERS • NO BENCHMARKS YET