Search Results for author: Verginica Barbu Mititelu

Found 29 papers, 0 papers with code

A Romanian Treebank Annotated with Verbal Multiword Expressions

no code implementations • CLIB 2022 • Verginica Barbu Mititelu, Mihaela Cristescu, Maria Mitrofan, Bianca-Mădălina Zgreabăn, Elena-Andreea Bărbulescu

In this paper we present a new version of the Romanian journalistic treebank annotated with verbal multiword expressions of four types: idioms, light verb constructions, reflexive verbs and inherently adpositional verbs, the last type being recently added to the corpus.

Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions

no code implementations • COLING (MWE) 2020 • Carlos Ramisch, Agata Savary, Bruno Guillaume, Jakub Waszczuk, Marie Candito, Ashwini Vaidya, Verginica Barbu Mititelu, Archna Bhatia, Uxoa Iñurrieta, Voula Giouli, Tunga Güngör, Menghan Jiang, Timm Lichte, Chaya Liebeskind, Johanna Monti, Renata Ramisch, Sara Stymne, Abigail Walsh, Hongzhi Xu

We present edition 1.2 of the PARSEME shared task on identification of verbal multiword expressions (VMWEs).

Introducing the CURLICAT Corpora: Seven-language Domain Specific Annotated Corpora from Curated Sources

no code implementations • LREC 2022 • Tamás Váradi, Bence Nyéki, Svetla Koeva, Marko Tadić, Vanja Štefanec, Maciej Ogrodniczuk, Bartłomiej Nitoń, Piotr Pęzik, Verginica Barbu Mititelu, Elena Irimia, Maria Mitrofan, Dan Tufiș, Radovan Garabík, Simon Krek, Andraž Repar

This article presents the current outcomes of the CURLICAT CEF Telecom project, which aims to collect and deeply annotate a set of large corpora from selected domains.

NMT

Romanian micro-blogging named entity recognition including health-related entities

no code implementations • SMM4H (COLING) 2022 • Vasile Pais, Verginica Barbu Mititelu, Elena Irimia, Maria Mitrofan, Carol Luca Gasan, Roxana Micu

This paper introduces a manually annotated dataset for named entity recognition (NER) in micro-blogging text for the Romanian language.

named-entity-recognition Named Entity Recognition +2

Use Case: Romanian Language Resources in the LOD Paradigm

no code implementations • LDL (ACL) 2022 • Verginica Barbu Mititelu, Elena Irimia, Vasile Pais, Andrei-Marius Avram, Maria Mitrofan

In this paper, we report on (i) the conversion of Romanian language resources to the Linked Open Data specifications and requirements, on (ii) their publication and (iii) interlinking with other language resources (for Romanian or for other languages).

Word Embeddings

Challenges in Creating a Representative Corpus of Romanian Micro-Blogging Text

no code implementations • CMLC (LREC) 2022 • Vasile Pais, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Roxana Micu, Carol Luca Gasan

Following the successful creation of a national representative corpus of contemporary Romanian language, we turned our attention to the social media text, as present in micro-blogging platforms.

It Takes Two to Tango – Towards a Multilingual MWE Resource

no code implementations • CLIB 2020 • Svetlozara Leseva, Verginica Barbu Mititelu, Ivelina Stoyanova

Mature wordnets offer the opportunity of digging out interesting linguistic information otherwise not explicitly marked in the network.

Vocal Bursts Valence Prediction

Aligning the Romanian Reference Treebank and the Valence Lexicon of Romanian Verbs

no code implementations • LREC 2022 • Ana-Maria Barbu, Verginica Barbu Mititelu, Cătălin Mititelu

We present here the efforts of aligning two language resources for Romanian: the Romanian Reference Treebank and the Valence Lexicon of Romanian Verbs: for each occurrence of those verbs in the treebank that were included as entries in the lexicon, a set of valence frames is automatically assigned, then manually validated by two linguists and, when necessary, corrected.

A Customizable WordNet Editor

no code implementations • CLIB 2020 • Andrei-Marius Avram, Verginica Barbu Mititelu

This paper presents an open-source wordnet editor that has been developed to ensure further expansion of the Romanian wordnet.

Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation

no code implementations • 17 Jun 2023 • Andrei-Marius Avram, Verginica Barbu Mititelu, Vasile Păiş, Dumitru-Clementin Cercel, Ştefan Trăuşan-Matu

Correctly identifying multiword expressions (MWEs) is an important task for most natural language processing systems since their misidentification can result in ambiguity and misunderstanding of the underlying text.

Domain Adaptation

Romanian Multiword Expression Detection Using Multilingual Adversarial Training and Lateral Inhibition

no code implementations • 22 Apr 2023 • Andrei-Marius Avram, Verginica Barbu Mititelu, Dumitru-Clementin Cercel

Multiword expressions are a key ingredient for developing large-scale and linguistically sound natural language processing technology.

An Open-Domain QA System for e-Governance

no code implementations • CLIB 2022 • Radu Ion, Andrei-Marius Avram, Vasile Păiş, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Valentin Badea

The paper will present the QA system and its integration with the Romanian language technologies portal RELATE, the COVID-19 data set and different evaluations of the QA performance.

Open-Domain Question Answering

Human-Machine Interaction Speech Corpus from the ROBIN project

no code implementations • 22 Nov 2021 • Vasile Păiş, Radu Ion, Andrei-Marius Avram, Elena Irimia, Verginica Barbu Mititelu, Maria Mitrofan

The paper contains a detailed description of the acquisition process, corpus statistics as well as an evaluation of the corpus influence on a low-latency ASR system as well as a dialogue component.

The MARCELL Legislative Corpus

no code implementations • LREC 2020 • Tamás Váradi, Svetla Koeva, Martin Yamalov, Marko Tadić, Bálint Sass, Bartłomiej Nitoń, Maciej Ogrodniczuk, Piotr Pęzik, Verginica Barbu Mititelu, Radu Ion, Elena Irimia, Maria Mitrofan, Vasile Păiș, Dan Tufiș, Radovan Garabík, Simon Krek, Andraz Repar, Matjaž Rihtar, Janez Brank

This article presents the current outcomes of the MARCELL CEF Telecom project aiming to collect and deeply annotate a large comparable corpus of legal documents.

Sentence

Hear about Verbal Multiword Expressions in the Bulgarian and the Romanian Wordnets Straight from the Horse's Mouth

no code implementations • WS 2019 • Verginica Barbu Mititelu, Ivelina Stoyanova, Svetlozara Leseva, Maria Mitrofan, Tsvetana Dimitrova, Maria Todorova

The contribution of this work is in outlining essential features of the description and classification of VMWEs and the cross-language comparison at the lexical level, which is essential for the understanding of the need for uniform annotation guidelines and a viable procedure for validation of the annotation.

Classification General Classification

MoNERo: a Biomedical Gold Standard Corpus for the Romanian Language

no code implementations • WS 2019 • Maria Mitrofan, Verginica Barbu Mititelu, Grigorina Mitrofan

In an era when large amounts of data are generated daily in various fields, the biomedical field among others, linguistic resources can be exploited for various tasks of Natural Language Processing.

The Romanian Corpus Annotated with Verbal Multiword Expressions

no code implementations • WS 2019 • Verginica Barbu Mititelu, Mihaela Cristescu, Mihaela Onofrei

This paper reports on the Romanian journalistic corpus annotated with verbal multiword expressions following the PARSEME guidelines.

Sentence

A hybrid pipeline of rules and machine learning to filter web-crawled parallel corpora

no code implementations • WS 2018 • Eduard Barbu, Verginica Barbu Mititelu

A hybrid pipeline comprising rules and machine learning is used to filter a noisy web English-German parallel corpus for the Parallel Corpus Filtering task.

BIG-bench Machine Learning Machine Translation +3

Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions

no code implementations • COLING 2018 • Carlos Ramisch, Silvio Ricardo Cordeiro, Agata Savary, Veronika Vincze, Verginica Barbu Mititelu, Archna Bhatia, Maja Buljan, Marie Candito, Polona Gantar, Voula Giouli, Tunga Güngör, Abdelati Hawwari, Uxoa Iñurrieta, Jolanta Kovalevskaitė, Simon Krek, Timm Lichte, Chaya Liebeskind, Johanna Monti, Carla Parra Escartín, Behrang Qasemizadeh, Renata Ramisch, Nathan Schneider, Ivelina Stoyanova, Ashwini Vaidya, Abigail Walsh

Corpora were created for 20 languages, which are also briefly discussed.

Ensemble Romanian Dependency Parsing with Neural Networks

no code implementations • LREC 2018 • Radu Ion, Elena Irimia, Verginica Barbu Mititelu

Dependency Parsing Word Embeddings

The Reference Corpus of the Contemporary Romanian Language (CoRoLa)

no code implementations • LREC 2018 • Verginica Barbu Mititelu, Dan Tufiș, Elena Irimia

A data-driven approach to verbal multiword expression detection. PARSEME Shared Task system description paper

no code implementations • WS 2017 • Tiberiu Boros, Sonia Pipa, Verginica Barbu Mititelu, Dan Tufis

"Multiword expressions" are groups of words acting as a morphologic, syntactic and semantic unit in linguistic analysis.

feature selection Lemmatization +1

The IPR-cleared Corpus of Contemporary Written and Spoken Romanian Language

no code implementations • LREC 2016 • Dan Tufiș, Verginica Barbu Mititelu, Elena Irimia, Ștefan Daniel Dumitrescu, Tiberiu Boroș

The article describes the current status of a large national project, CoRoLa, aiming at building a reference corpus for the contemporary Romanian language.

Lemmatization Part-Of-Speech Tagging

Universal and Language-specific Dependency Relations for Analysing Romanian

no code implementations • WS 2015 • Verginica Barbu Mititelu, Cătălina Mărănduc, Elena Irimia

WordFinder

no code implementations • WS 2014 • Catalin Mititelu, Verginica Barbu Mititelu

RACAI GEC -- A hybrid approach to Grammatical Error Correction

no code implementations • WS 2014 • Tiberiu Boroș, Stefan Daniel Dumitrescu, Adrian Zafiu, Verginica Barbu Mititelu, Ionut Paul Văduva

Grammatical Error Detection

CoRoLa --- The Reference Corpus of Contemporary Romanian Language

no code implementations • LREC 2014 • Verginica Barbu Mititelu, Elena Irimia, Dan Tufiș

Our project is a joint effort of two institutes of the Romanian Academy.

Lemmatization Sentence

News about the Romanian Wordnet

no code implementations • WS 2014 • Verginica Barbu Mititelu, Ștefan Daniel Dumitrescu, Dan Tufiș

Machine Translation Question Answering +1

Adding Morpho-semantic Relations to the Romanian Wordnet

no code implementations • LREC 2012 • Verginica Barbu Mititelu

Keeping pace with other wordnets development, we present the challenges raised by the Romanian derivational system and our methodology for identifying derived words and their stems in the Romanian Wordnet.

Information Retrieval Question Answering
