Search Results for author: Jan Hajič

Found 9 papers, 2 papers with code

Prague Dependency Treebank -- Consolidated 1.0

no code implementations 5 Jun 2020 Jan Hajič, Eduard Bejček, Jaroslava Hlaváčová, Marie Mikulová, Milan Straka, Jan Štěpánek, Barbora Štěpánková

We present a richly annotated and genre-diversified language resource, the Prague Dependency Treebank-Consolidated 1.0 (PDT-C 1.0), the purpose of which is, as has always been the case for the family of Prague Dependency Treebanks, to serve both as training data for various types of NLP tasks and as a resource for linguistically oriented research.


Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection

no code implementations LREC 2020 Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, Daniel Zeman

Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework.

Czech Text Processing with Contextual Embeddings: POS Tagging, Lemmatization, Parsing and NER

no code implementations 8 Sep 2019 Milan Straka, Jana Straková, Jan Hajič

We evaluate two methods for precomputing such embeddings, BERT and Flair, on four Czech text processing tasks: part-of-speech (POS) tagging, lemmatization, dependency parsing and named entity recognition (NER).

Dependency Parsing Lemmatization +4

Evaluating Contextualized Embeddings on 54 Languages in POS Tagging, Lemmatization and Dependency Parsing

no code implementations 20 Aug 2019 Milan Straka, Jana Straková, Jan Hajič

We present an extensive evaluation of three recently proposed methods for contextualized embeddings on 89 corpora in 54 languages of the Universal Dependencies 2.3 in three tasks: POS tagging, lemmatization, and dependency parsing.

Dependency Parsing Lemmatization +2

Neural Architectures for Nested NER through Linearization

1 code implementation ACL 2019 Jana Straková, Milan Straka, Jan Hajič

We propose two neural network architectures for nested named entity recognition (NER), a setting in which named entities may overlap and also be labeled with more than one label.

Hard Attention named-entity-recognition +3

LemmaTag: Jointly Tagging and Lemmatizing for Morphologically-Rich Languages with BRNNs

2 code implementations 10 Aug 2018 Daniel Kondratyuk, Tomáš Gavenčiak, Milan Straka, Jan Hajič

We present LemmaTag, a featureless neural network architecture that jointly generates part-of-speech tags and lemmas for sentences by using bidirectional RNNs with character-level and word-level embeddings.

Lemmatization Part-Of-Speech Tagging +1
