Training data for Hebrew morphological word segmentation
2 PAPERS • 1 BENCHMARK
1 PAPER • 1 BENCHMARK
…text segmentation and text segment classification) tasks and comprises 169 documents and gold standard annotations for page segments. Partition (P2) contains 75 documents with a significantly richer
1 PAPER • NO BENCHMARKS YET
Automatic segmentation, tokenization and morphological and syntactic annotations of raw texts in 45 languages, generated by UDPipe (http://ufal.mff.cuni.cz/udpipe), together with word embeddings of dimension
We present YTSeg, a topically and structurally diverse benchmark for the text segmentation task based on YouTube transcriptions.
1 PAPER • 2 BENCHMARKS