no code implementations • LREC 2020 • Jan Hajič, Eduard Bejček, Jaroslava Hlaváčová, Marie Mikulová, Milan Straka, Jan Štěpánek, Barbora Štěpánková
We present a richly annotated and genre-diversified language resource, the Prague Dependency Treebank-Consolidated 1.0 (PDT-C 1.0), the purpose of which is - as has always been the case for the family of the Prague Dependency Treebanks - to serve both as training data for various types of NLP tasks and for linguistically oriented research.
no code implementations • CONLL 2018 • Daniel Zeman, Jan Hajič, Martin Popel, Martin Potthast, Milan Straka, Filip Ginter, Joakim Nivre, Slav Petrov
Every year, the Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets.
1 code implementation • EMNLP 2018 • Daniel Kondratyuk, Tomáš Gavenčiak, Milan Straka, Jan Hajič
We present LemmaTag, a featureless neural network architecture that jointly generates part-of-speech tags and lemmas for sentences by using bidirectional RNNs with character-level and word-level embeddings.
no code implementations • COLING 2018 • Zdeňka Urešová, Eva Fučíková, Eva Hajičová, Jan Hajič
This paper describes CzEngClass, a bilingual lexical resource being built to investigate verbal synonymy in bilingual context and to relate semantic roles common to one synonym class to verb arguments (verb valency).
no code implementations • CONLL 2017 • Daniel Zeman, Martin Popel, Milan Straka, Jan Hajič, Joakim Nivre, Filip Ginter, Juhani Luotolahti, Sampo Pyysalo, Slav Petrov, Martin Potthast, Francis Tyers, Elena Badmaeva, Memduh Gokirmak, Anna Nedoluzhko, Silvie Cinková, Jan Hajič jr., Jaroslava Hlaváčová, Václava Kettnerová, Zdeňka Urešová, Jenna Kanerva, Stina Ojala, Anna Missilä, Christopher D. Manning, Sebastian Schuster, Siva Reddy, Dima Taji, Nizar Habash, Herman Leung, Marie-Catherine de Marneffe, Manuela Sanguinetti, Maria Simi, Hiroshi Kanayama, Valeria de Paiva, Kira Droganova, Héctor Martínez Alonso, Çağrı Çöltekin, Umut Sulubacak, Hans Uszkoreit, Vivien Macketanz, Aljoscha Burchardt, Kim Harris, Katrin Marheinecke, Georg Rehm, Tolga Kayadelen, Mohammed Attia, Ali Elkahky, Zhuoran Yu, Emily Pitler, Saran Lertpradit, Michael Mandl, Jesse Kirchner, Hector Fernandez Alcalde, Jana Strnadová, Esha Banerjee, Ruli Manurung, Antonio Stella, Atsuko Shimada, Sookyoung Kwak, Gustavo Mendonça, Tatiana Lando, Rattima Nitisaroj, Josie Li
The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets.
no code implementations • COLING 2016 • Eva Fučíková, Jan Hajič, Zdeňka Urešová
In this paper and the associated system demo, we present an advanced search system that allows users to perform a joint search over a (bilingual) valency lexicon and a correspondingly annotated linked parallel corpus.
no code implementations • WS 2016 • Eva Fučíková, Jan Hajič, Zdeňka Urešová
The motivation for the task is to extend a verbal valency (i.e., predicate-argument) lexicon by adding nouns that share the valency properties with the base verb, assuming their properties can be derived (even if not trivially) from the underlying verb by deterministic grammatical rules.
no code implementations • LREC 2016 • Milan Straka, Jan Hajič, Jana Straková
Automatic natural language processing of large texts often presents recurring challenges in multiple languages: even for the most advanced tasks, the texts are first processed by basic processing steps - from tokenization to parsing.
no code implementations • LREC 2016 • Arantxa Otegi, Nora Aranberri, Antonio Branco, Jan Hajič, Martin Popel, Kiril Simov, Eneko Agirre, Petya Osenova, Rita Pereira, João Silva, Steven Neale
This work presents parallel corpora automatically annotated with several NLP tools, including lemma and part-of-speech tagging, named-entity recognition and classification, named-entity disambiguation, word-sense disambiguation, and coreference.
no code implementations • LREC 2016 • Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič, Christopher D. Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty, Daniel Zeman
Cross-linguistically consistent annotation is necessary for sound comparative evaluation and cross-lingual learning experiments.
no code implementations • LREC 2016 • Stephan Oepen, Marco Kuhlmann, Yusuke Miyao, Daniel Zeman, Silvie Cinková, Dan Flickinger, Jan Hajič, Angelina Ivanova, Zdeňka Urešová
We announce a new language resource for research on semantic parsing, a large, carefully curated collection of semantic dependency graphs representing multiple linguistic traditions.
no code implementations • LREC 2016 • Georg Rehm, Jan Hajič, Josef van Genabith, Andrejs Vasiljevs
META-NET is a European network of excellence, founded in 2010, that consists of 60 research centres in 34 European countries.
no code implementations • LREC 2014 • Koenraad De Smedt, Erhard Hinrichs, Detmar Meurers, Inguna Skadiņa, Bolette Pedersen, Costanza Navarretta, Núria Bel, Krister Lindén, Markéta Lopatková, Jan Hajič, Gisle Andersen, Przemyslaw Lenkiewicz
CLARA (Common Language Resources and Their Applications) is a Marie Curie Initial Training Network which ran from 2009 until 2014 with the aim of providing researcher training in crucial areas related to language resources and infrastructure.
no code implementations • LREC 2014 • Nianwen Xue, Ondřej Bojar, Jan Hajič, Martha Palmer, Zdeňka Urešová, Xiuhong Zhang
Abstract Meaning Representations (AMRs) are rooted, directional and labeled graphs that abstract away from morpho-syntactic idiosyncrasies such as word category (verbs and nouns), word order, and function words (determiners, some prepositions).
no code implementations • LREC 2014 • Zdeňka Urešová, Jan Hajič, Pavel Pecina, Ondřej Dušek
This paper presents development and test sets for machine translation of search queries in cross-lingual information retrieval in the medical domain.
no code implementations • LREC 2014 • Georg Rehm, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audronė Bielevičienė, Lars Borin, António Branco, Gerhard Budin, Nicoletta Calzolari, Walter Daelemans, Radovan Garabík, Marko Grobelnik, Carmen García-Mateo, Josef van Genabith, Jan Hajič, Inma Hernáez, John Judge, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Joseph Mariani, John McNaught, Maite Melero, Monica Monachini, Asunción Moreno, Jan Odijk, Maciej Ogrodniczuk, Piotr Pęzik, Stelios Piperidis, Adam Przepiórkowski, Eiríkur Rögnvaldsson, Michael Rosner, Bolette Pedersen, Inguna Skadiņa, Koenraad De Smedt, Marko Tadić, Paul Thompson, Dan Tufiş, Tamás Váradi, Andrejs Vasiļjevs, Kadri Vider, Jolanta Zabarskaite
This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics.
no code implementations • TACL 2013 • Bernd Bohnet, Joakim Nivre, Igor Boguslavsky, Richárd Farkas, Filip Ginter, Jan Hajič
Joint morphological and syntactic analysis has been proposed as a way of improving parsing accuracy for richly inflected languages.
no code implementations • LREC 2012 • Jan Hajič, Eva Hajičová, Jarmila Panevová, Petr Sgall, Ondřej Bojar, Silvie Cinková, Eva Fučíková, Marie Mikulová, Petr Pajas, Jan Popelka, Jiří Semecký, Jana Šindlerová, Jan Štěpánek, Josef Toman, Zdeňka Urešová, Zdeněk Žabokrtský
We introduce a substantial update of the Prague Czech-English Dependency Treebank, a parallel corpus manually annotated at the deep syntactic layer of linguistic representation.
no code implementations • LREC 2012 • Daniel Zeman, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský, Jan Hajič
We propose HamleDT ― HArmonized Multi-LanguagE Dependency Treebank.