Automatic segmentation, tokenization, and morphological and syntactic annotation of raw texts in 45 languages, generated by UDPipe (http://ufal.mff.cuni.cz/udpipe), together with word embeddings of dimension…
1 PAPER • NO BENCHMARKS YET
…Annotations include: multiple POS tags, morphological features, and lemmatization; sentence segmentation and rough speech act; document structure in TEI XML (paragraphs, headings, figures, etc.).
8 PAPERS • 1 BENCHMARK