1 code implementation • FNP (COLING) 2020 • Marius Ionescu, Andrei-Marius Avram, George-Andrei Dima, Dumitru-Clementin Cercel, Mihai Dascalu
Financial causality detection is centered on identifying connections between different assets from financial news in order to improve trading strategies.
no code implementations • EACL (VarDial) 2021 • George-Eduard Zaharia, Andrei-Marius Avram, Dumitru-Clementin Cercel, Traian Rebedea
Dialect identification is a task with applicability in a vast array of domains, ranging from automatic speech recognition to opinion mining.
no code implementations • SMM4H (COLING) 2020 • George-Andrei Dima, Andrei-Marius Avram, Dumitru-Clementin Cercel
This paper describes our solutions submitted to the Social Media Mining for Health Applications (#SMM4H) Shared Task 2020.
no code implementations • VarDial (COLING) 2020 • George-Eduard Zaharia, Andrei-Marius Avram, Dumitru-Clementin Cercel, Traian Rebedea
Dialect identification represents a key aspect for improving a series of tasks, for example, opinion mining, considering that the location of the speaker can greatly influence the attitude towards a subject.
no code implementations • NAACL (SMM4H) 2021 • George-Andrei Dima, Dumitru-Clementin Cercel, Mihai Dascalu
This paper presents our contribution to the Social Media Mining for Health Applications Shared Task 2021.
no code implementations • 11 Jan 2024 • Adrian Gheorghiu, Iulian-Marius Tăiatu, Dumitru-Clementin Cercel, Iuliana Marin, Florin Pop
As the RoCoLe dataset is imbalanced and does not have many samples, fine-tuning of pre-trained models and multiple augmentation techniques need to be used.
no code implementations • 30 Dec 2023 • Sebastian-Vasile Echim, Iulian-Marius Tăiatu, Dumitru-Clementin Cercel, Florin Pop
Through our experiments, we determine that, on a benchmark dataset, robustness can come at the cost of classification accuracy, with performance reductions of 3%-20% for regular tests and gains of 50%-70% for adversarial attack tests.
no code implementations • 7 Oct 2023 • Emilian-Claudiu Mănescu, Răzvan-Alexandru Smădu, Andrei-Marius Avram, Dumitru-Clementin Cercel, Florin Pop
Lip reading or visual speech recognition has gained significant attention in recent years, particularly because of hardware development and innovations in computer vision.
1 code implementation • 16 Aug 2023 • Vlad-Constantin Lungu-Stan, Dumitru-Clementin Cercel, Florin Pop
By adding classification heads at each level of the transformer and employing a cascading distillation process, we improve the balanced multi-class accuracy of the base model by 2.1%, while creating a range of models of various sizes but comparable performance.
no code implementations • 4 Aug 2023 • Răzvan-Alexandru Smădu, Sebastian-Vasile Echim, Dumitru-Clementin Cercel, Iuliana Marin, Florin Pop
In the current work, we explore the effects of various unsupervised domain adaptation techniques between two text classification tasks: fake and hyperpartisan news detection.
no code implementations • 2 Aug 2023 • Andrei-Alexandru Preda, Dumitru-Clementin Cercel, Traian Rebedea, Costin-Gabriel Chiru
This paper describes the solutions submitted by the UPB team to the AuTexTification shared task, featured as part of IberLEF-2023.
no code implementations • 30 Jun 2023 • Andrei-Marius Avram, Răzvan-Alexandru Smădu, Vasile Păiş, Dumitru-Clementin Cercel, Radu Ion, Dan Tufiş
With the rise of bidirectional encoder representations from Transformer models in natural language processing, the speech community has adopted some of their development methodologies.
no code implementations • 17 Jun 2023 • Andrei-Marius Avram, Verginica Barbu Mititelu, Vasile Păiş, Dumitru-Clementin Cercel, Ştefan Trăuşan-Matu
Correctly identifying multiword expressions (MWEs) is an important task for most natural language processing systems since their misidentification can result in ambiguity and misunderstanding of the underlying text.
no code implementations • 13 Jun 2023 • Sebastian-Vasile Echim, Răzvan-Alexandru Smădu, Andrei-Marius Avram, Dumitru-Clementin Cercel, Florin Pop
Satire detection and sentiment analysis are intensively explored natural language processing (NLP) tasks that study the identification of the satirical tone in texts and the extraction of sentiments in relation to their targets.
no code implementations • 11 Jun 2023 • Iulian-Marius Tăiatu, Andrei-Marius Avram, Dumitru-Clementin Cercel, Florin Pop
Developing natural language processing (NLP) systems for social media analysis remains an important topic in artificial intelligence research.
no code implementations • 22 Apr 2023 • Andrei-Marius Avram, Verginica Barbu Mititelu, Dumitru-Clementin Cercel
Multiword expressions are a key ingredient for developing large-scale and linguistically sound natural language processing technology.
no code implementations • 30 Dec 2022 • Răzvan-Alexandru Smădu, George-Eduard Zaharia, Andrei-Marius Avram, Dumitru-Clementin Cercel, Mihai Dascalu, Florin Pop
Keyphrase identification and classification is a Natural Language Processing and Information Retrieval task that involves extracting relevant groups of words from a given text related to the main topic.
1 code implementation • SemEval (NAACL) 2022 • Andrei Paraschiv, Mihai Dascalu, Dumitru-Clementin Cercel
In recent times, the detection of hate-speech, offensive, or abusive language in online media has become an important topic in NLP research due to the exponential growth of social media and the propagation of such messages, as well as their impact.
no code implementations • ACL 2022 • George-Eduard Zaharia, Răzvan-Alexandru Smădu, Dumitru-Clementin Cercel, Mihai Dascalu
Our model obtains a boost of up to 2.42% in terms of Pearson Correlation Coefficients in contrast to vanilla training techniques, when considering the CompLex from the Lexical Complexity Prediction 2021 dataset.
1 code implementation • LREC 2022 • Andrei-Marius Avram, Darius Catrina, Dumitru-Clementin Cercel, Mihai Dascălu, Traian Rebedea, Vasile Păiş, Dan Tufiş
In this work, we introduce three light and fast versions of distilled BERT models for the Romanian language: Distil-BERT-base-ro, Distil-RoBERT-base, and DistilMulti-BERT-base-ro.
no code implementations • SEMEVAL 2021 • Andrei Paraschiv, Dumitru-Clementin Cercel, Mihai Dascalu
The real-world impact of polarization and toxicity in the online sphere marked the end of 2020 and the beginning of this year in a negative way.
no code implementations • SEMEVAL 2021 • George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu
Our models are applicable to both subtasks and achieve good performance results, with an MAE below 0.07 and a Pearson correlation of 0.73 for single word identification, as well as an MAE below 0.08 and a Pearson correlation of 0.79 for multiple word targets.
no code implementations • SEMEVAL 2021 • Răzvan-Alexandru Smădu, Dumitru-Clementin Cercel, Mihai Dascalu
Detecting humor is a challenging task since words might share multiple valences and, depending on the context, the same words can be even used in offensive expressions.
no code implementations • SEMEVAL 2021 • Andrei-Marius Avram, George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu
Extracting semantic information on measurements and counts is an important topic in terms of analyzing scientific discourses.
no code implementations • SEMEVAL 2020 • Mircea-Adrian Tanase, Dumitru-Clementin Cercel, Costin-Gabriel Chiru
Offensive language detection is one of the most challenging problems in the natural language processing field, driven by the rising presence of this phenomenon in online social media.
no code implementations • 2 Oct 2020 • George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu
Our aim is to provide evidence that the proposed models can learn the characteristics of complex words in a multilingual environment by relying on the CWI shared task 2018 dataset available for four different languages (i.e., English, German, Spanish, and also French).
no code implementations • SEMEVAL 2020 • Andrei Paraschiv, Dumitru-Clementin Cercel, Mihai Dascalu
Manipulative and misleading news has become a commodity for some online news outlets, and such news has gained a significant impact on the global mindset of people.
3 code implementations • SEMEVAL 2020 • Andrei-Marius Avram, Dumitru-Clementin Cercel, Costin-Gabriel Chiru
This work presents our contribution in the context of the 6th task of SemEval-2020: Extracting Definitions from Free Text in Textbooks (DeftEval).
no code implementations • SEMEVAL 2020 • George-Eduard Zaharia, George-Alexandru Vlad, Dumitru-Clementin Cercel, Traian Rebedea, Costin-Gabriel Chiru
In this paper, we describe the systems developed by our team for SemEval-2020 Task 9 that aims to cover two well-known code-mixed languages: Hindi-English and Spanish-English.
no code implementations • SEMEVAL 2020 • George-Alexandru Vlad, George-Eduard Zaharia, Dumitru-Clementin Cercel, Costin-Gabriel Chiru, Stefan Trausan-Matu
Online users can create different ways of expressing their thoughts, opinions, or conception of amusement.
no code implementations • WS 2019 • George-Alexandru Vlad, Mircea-Adrian Tanase, Cristian Onose, Dumitru-Clementin Cercel
In recent years, the need for communication has increased in online social media.
no code implementations • WS 2019 • Cristian Onose, Dumitru-Clementin Cercel, Stefan Trausan-Matu
This paper describes our models for the Moldavian vs. Romanian Cross-Topic Identification (MRC) evaluation campaign, part of the VarDial 2019 workshop.
no code implementations • RANLP 2017 • Dumitru-Clementin Cercel, Cristian Onose, Stefan Trausan-Matu, Florin Pop
Understanding questions and answers in QA systems is a major challenge in the domain of natural language processing.