no code implementations • CMLC (LREC) 2022 • Vasile Pais, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Roxana Micu, Carol Luca Gasan
Following the successful creation of a national representative corpus of the contemporary Romanian language, we turned our attention to social media text, as found on micro-blogging platforms.
no code implementations • LDL (ACL) 2022 • Verginica Barbu Mititelu, Elena Irimia, Vasile Pais, Andrei-Marius Avram, Maria Mitrofan
In this paper, we report on (i) the conversion of Romanian language resources to the Linked Open Data specifications and requirements, (ii) their publication, and (iii) their interlinking with other language resources (for Romanian or for other languages).
no code implementations • GWC 2019 • Elena Irimia, Maria Mitrofan, Verginica Mititelu
The evaluation is made for two situations: one in which the words are not semantically disambiguated before expanding the lexicon, and another one in which they are disambiguated with senses from the Romanian wordnet.
no code implementations • LREC 2022 • Tamás Váradi, Bence Nyéki, Svetla Koeva, Marko Tadić, Vanja Štefanec, Maciej Ogrodniczuk, Bartłomiej Nitoń, Piotr Pęzik, Verginica Barbu Mititelu, Elena Irimia, Maria Mitrofan, Dan Tufiș, Radovan Garabík, Simon Krek, Andraž Repar
This article presents the current outcomes of the CURLICAT CEF Telecom project, which aims to collect and deeply annotate a set of large corpora from selected domains.
no code implementations • SMM4H (COLING) 2022 • Vasile Pais, Verginica Barbu Mititelu, Elena Irimia, Maria Mitrofan, Carol Luca Gasan, Roxana Micu
This paper introduces a manually annotated dataset for named entity recognition (NER) in micro-blogging text for the Romanian language.
no code implementations • CLIB 2022 • Radu Ion, Andrei-Marius Avram, Vasile Păiş, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Valentin Badea
The paper will present the QA system and its integration with the Romanian language technologies portal RELATE, the COVID-19 data set and different evaluations of the QA performance.
no code implementations • 22 Nov 2021 • Vasile Păiş, Radu Ion, Andrei-Marius Avram, Elena Irimia, Verginica Barbu Mititelu, Maria Mitrofan
The paper contains a detailed description of the acquisition process, corpus statistics, and an evaluation of the corpus's influence on a low-latency ASR system and on a dialogue component.
no code implementations • LREC 2020 • Tamás Váradi, Svetla Koeva, Martin Yamalov, Marko Tadić, Bálint Sass, Bartłomiej Nitoń, Maciej Ogrodniczuk, Piotr Pęzik, Verginica Barbu Mititelu, Radu Ion, Elena Irimia, Maria Mitrofan, Vasile Păiș, Dan Tufiș, Radovan Garabík, Simon Krek, Andraz Repar, Matjaž Rihtar, Janez Brank
This article presents the current outcomes of the MARCELL CEF Telecom project aiming to collect and deeply annotate a large comparable corpus of legal documents.
no code implementations • LREC 2016 • Dan Tufiș, Verginica Barbu Mititelu, Elena Irimia, Ștefan Daniel Dumitrescu, Tiberiu Boroș
The article describes the current status of a large national project, CoRoLa, aiming at building a reference corpus for the contemporary Romanian language.
no code implementations • LREC 2014 • Verginica Barbu Mititelu, Elena Irimia, Dan Tufiș
Our project is a joint effort of two institutes of the Romanian Academy.
no code implementations • LREC 2012 • Radu Ion, Elena Irimia, Dan Ştefănescu, Dan Tufiș
This article describes the collection, processing, and validation of a large balanced corpus for Romanian.