Search Results for author: Dumitru-Clementin Cercel

Found 21 papers, 4 papers with code

Approaching SMM4H 2020 with Ensembles of BERT Flavours

no code implementations SMM4H (COLING) 2020 George-Andrei Dima, Andrei-Marius Avram, Dumitru-Clementin Cercel

This paper describes our solutions submitted to the Social Media Mining for Health Applications (#SMM4H) Shared Task 2020.

Exploring the Power of Romanian BERT for Dialect Identification

no code implementations VarDial (COLING) 2020 George-Eduard Zaharia, Andrei-Marius Avram, Dumitru-Clementin Cercel, Traian Rebedea

Dialect identification represents a key aspect for improving a series of tasks, for example, opinion mining, considering that the location of the speaker can greatly influence the attitude towards a subject.

Dialect Identification Opinion Mining

UPB at SemEval-2022 Task 5: Enhancing UNITER with Image Sentiment and Graph Convolutional Networks for Multimedia Automatic Misogyny Identification

1 code implementation SemEval (NAACL) 2022 Andrei Paraschiv, Mihai Dascalu, Dumitru-Clementin Cercel

In recent times, the detection of hate-speech, offensive, or abusive language in online media has become an important topic in NLP research due to the exponential growth of social media and the propagation of such messages, as well as their impact.

Abusive Language Hate Speech Detection

Domain Adaptation in Multilingual and Multi-Domain Monolingual Settings for Complex Word Identification

no code implementations ACL 2022 George-Eduard Zaharia, Răzvan-Alexandru Smădu, Dumitru-Clementin Cercel, Mihai Dascalu

Our model obtains a boost of up to 2.42% in terms of Pearson Correlation Coefficients in contrast to vanilla training techniques, when considering the CompLex from the Lexical Complexity Prediction 2021 dataset.

Complex Word Identification Domain Adaptation +2

Distilling the Knowledge of Romanian BERTs Using Multiple Teachers

1 code implementation LREC 2022 Andrei-Marius Avram, Darius Catrina, Dumitru-Clementin Cercel, Mihai Dascălu, Traian Rebedea, Vasile Păiş, Dan Tufiş

In this work, we introduce three light and fast versions of distilled BERT models for the Romanian language: Distil-BERT-base-ro, Distil-RoBERT-base, and DistilMulti-BERT-base-ro.

Dialect Identification Knowledge Distillation +8

UPB at SemEval-2021 Task 5: Virtual Adversarial Training for Toxic Spans Detection

no code implementations SEMEVAL 2021 Andrei Paraschiv, Dumitru-Clementin Cercel, Mihai Dascalu

The real-world impact of polarization and toxicity in the online sphere marked the end of 2020 and the beginning of this year in a negative way.

Toxic Spans Detection

UPB at SemEval-2021 Task 1: Combining Deep Learning and Hand-Crafted Features for Lexical Complexity Prediction

no code implementations SEMEVAL 2021 George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu

Our models are applicable on both subtasks and achieve good performance results, with a MAE below 0.07 and a Pearson correlation of 0.73 for single word identification, as well as a MAE below 0.08 and a Pearson correlation of 0.79 for multiple word targets.

Lexical Complexity Prediction Word Embeddings

UPB at SemEval-2021 Task 7: Adversarial Multi-Task Learning for Detecting and Rating Humor and Offense

no code implementations SEMEVAL 2021 Răzvan-Alexandru Smădu, Dumitru-Clementin Cercel, Mihai Dascalu

Detecting humor is a challenging task since words might share multiple valences and, depending on the context, the same words can be even used in offensive expressions.

Multi-Task Learning text-classification +1

UPB at SemEval-2020 Task 12: Multilingual Offensive Language Detection on Social Media by Fine-tuning a Variety of BERT-based Models

no code implementations SEMEVAL 2020 Mircea-Adrian Tanase, Dumitru-Clementin Cercel, Costin-Gabriel Chiru

Offensive language detection is one of the most challenging problems in the natural language processing field, driven by the rising presence of this phenomenon in online social media.

Cross-Lingual Transfer Learning for Complex Word Identification

no code implementations 2 Oct 2020 George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu

Our aim is to provide evidence that the proposed models can learn the characteristics of complex words in a multilingual environment by relying on the CWI shared task 2018 dataset available for four different languages (i.e., English, German, Spanish, and French).

Complex Word Identification Cross-Lingual Transfer +3

UPB at SemEval-2020 Task 11: Propaganda Detection with Domain-Specific Trained BERT

no code implementations SEMEVAL 2020 Andrei Paraschiv, Dumitru-Clementin Cercel, Mihai Dascalu

Manipulative and misleading news has become a commodity for some online news outlets, and such news has gained a significant impact on the global mindset of people.

Propaganda span identification
