Search Results for author: Wissam Antoun

Found 12 papers, 5 papers with code

Beyond Dataset Creation: Critical View of Annotation Variation and Bias Probing of a Dataset for Online Radical Content Detection

no code implementations 16 Dec 2024 Arij Riabi, Virginie Mouilleron, Menel Mahamdi, Wissam Antoun, Djamé Seddah

The proliferation of radical content on online platforms poses significant risks, including inciting violence and spreading extremist ideologies.

Fairness

CamemBERT 2.0: A Smarter French Language Model Aged to Perfection

no code implementations 13 Nov 2024 Wissam Antoun, Francis Kulumba, Rian Touchent, Éric de la Clergerie, Benoît Sagot, Djamé Seddah

In this paper, we introduce two new versions of the CamemBERT base model, CamemBERTav2 and CamemBERTv2, designed to address these challenges.

Language Modeling Language Modelling +1
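
The updated checkpoints are distributed through the Hugging Face Hub. Below is a minimal sketch of querying CamemBERTav2 as a masked language model; the model identifier almanach/camembertav2-base is an assumption and should be verified on the Hub.

```python
# Minimal masked-LM query against CamemBERTav2.
# The Hub ID below is an assumption, not confirmed by this listing.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "almanach/camembertav2-base"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

text = f"Paris est la capitale de la {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Take the highest-scoring vocabulary item at the masked position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_pos].argmax(dim=-1)))
```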

Harvesting Textual and Structured Data from the HAL Publication Repository

no code implementations 30 Jul 2024 Francis Kulumba, Wissam Antoun, Guillaume Vimont, Laurent Romary

HAL (Hyper Articles en Ligne) is the French national publication repository, used by most higher education and research organizations for their open science policy.

Authorship Attribution Fill Mask +5

From Text to Source: Results in Detecting Large Language Model-Generated Content

no code implementations 23 Sep 2023 Wissam Antoun, Benoît Sagot, Djamé Seddah

The research also explores Model Attribution, encompassing source model identification, model family, and model size classification, in addition to quantization and watermarking detection.

Attribute Language Modeling +4

Towards a Robust Detection of Language Model Generated Text: Is ChatGPT that Easy to Detect?

no code implementations 9 Jun 2023 Wissam Antoun, Virginie Mouilleron, Benoît Sagot, Djamé Seddah

This paper proposes a methodology for developing and evaluating ChatGPT detectors for French text, with a focus on investigating their robustness on out-of-domain data and against common attack schemes.

Adversarial Text Language Modeling +1

Data-Efficient French Language Modeling with CamemBERTa

no code implementations 2 Jun 2023 Wissam Antoun, Benoît Sagot, Djamé Seddah

In this paper, we introduce CamemBERTa, a French DeBERTa model that builds upon the DeBERTaV3 architecture and training objective.

Dependency Parsing FLUE +6
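
CamemBERTa is likewise released as a standard transformers checkpoint. The sketch below attaches a sequence-classification head for downstream fine-tuning; the model identifier almanach/camemberta-base and the two-label setup are assumptions for illustration.

```python
# Sketch: CamemBERTa with a (randomly initialised) classification head.
# Hub ID and label count are assumptions; fine-tune before relying on outputs.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "almanach/camemberta-base"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

batch = tokenizer(
    ["Ce film est excellent.", "Quelle déception."],
    padding=True,
    return_tensors="pt",
)
outputs = model(**batch)
print(outputs.logits.shape)  # (2, 2): one score per label for each sentence
```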

Empathetic BERT2BERT Conversational Model: Learning Arabic Language Generation with Little Data

1 code implementation EACL (WANLP) 2021 Tarek Naous, Wissam Antoun, Reem A. Mahmoud, Hazem Hajj

The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable to train NLG models such as conversational agents.

Decoder Empathetic Response Generation +4
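
The BERT2BERT setup can be approximated with transformers' EncoderDecoderModel by warm-starting both the encoder and the decoder from an AraBERT checkpoint. The sketch below illustrates that warm-start only; it is not the authors' training code, and the checkpoint ID is an assumption.

```python
# Sketch: warm-start an encoder-decoder ("BERT2BERT") model from AraBERT.
# Checkpoint ID and token-ID wiring are assumptions for illustration; the
# resulting model must be fine-tuned on conversational data before use.
from transformers import AutoTokenizer, EncoderDecoderModel

ckpt = "aubmindlab/bert-base-arabert"  # assumed AraBERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = EncoderDecoderModel.from_encoder_decoder_pretrained(ckpt, ckpt)

# Generation needs these special-token IDs set explicitly.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.sep_token_id

inputs = tokenizer("كيف حالك اليوم؟", return_tensors="pt")
reply_ids = model.generate(**inputs, max_length=32)
print(tokenizer.decode(reply_ids[0], skip_special_tokens=True))
```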

AraGPT2: Pre-Trained Transformer for Arabic Language Generation

1 code implementation EACL (WANLP) 2021 Wissam Antoun, Fady Baly, Hazem Hajj

In this paper, we develop the first advanced Arabic language generation model, AraGPT2, trained from scratch on a large Arabic corpus of internet text and news articles.

Language Modeling Language Modelling +2
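
The released AraGPT2 checkpoints can be queried directly for open-ended generation. In the sketch below, the base model identifier aubmindlab/aragpt2-base is an assumption, and the larger variants may require the authors' custom model code.

```python
# Sketch: open-ended Arabic generation with an AraGPT2 base checkpoint.
# Model ID is an assumption; sampling settings are arbitrary illustrations.
from transformers import pipeline

generator = pipeline("text-generation", model="aubmindlab/aragpt2-base")
prompt = "يعتبر الذكاء الاصطناعي"  # "Artificial intelligence is considered ..."
outputs = generator(prompt, max_new_tokens=40, do_sample=True, top_p=0.95)
print(outputs[0]["generated_text"])
```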

AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding

1 code implementation EACL (WANLP) 2021 Wissam Antoun, Fady Baly, Hazem Hajj

Advances in English language representation enabled a more sample-efficient pre-training task by Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA).

Language Modeling Language Modelling +6
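
ELECTRA's replaced-token-detection objective produces a discriminator that scores every token as original or replaced, and that head can be probed directly. In the sketch below, the checkpoint identifier aubmindlab/araelectra-base-discriminator is an assumption.

```python
# Sketch: per-token "replaced vs. original" scores from the AraELECTRA
# discriminator head. The checkpoint ID is an assumption.
import torch
from transformers import AutoTokenizer, ElectraForPreTraining

ckpt = "aubmindlab/araelectra-base-discriminator"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = ElectraForPreTraining.from_pretrained(ckpt)

inputs = tokenizer("عاصمة لبنان هي بيروت", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # one logit per token; > 0 suggests "replaced"

for token, score in zip(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]), logits[0]):
    print(f"{token}\t{score.item():+.2f}")
```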

Multi-Task Learning using AraBert for Offensive Language Detection

no code implementations LREC 2020 Marc Djandji, Fady Baly, Wissam Antoun, Hazem Hajj

The shared task on Offensive Language Detection at OSACT4 has aimed at achieving state-of-the-art profane language detection methods for Arabic social media.

Language Modeling Language Modelling +1

AraBERT: Transformer-based Model for Arabic Language Understanding

3 code implementations LREC 2020 Wissam Antoun, Fady Baly, Hazem Hajj

Recently, with the surge of transformer-based models, language-specific BERT-based models have proven to be very efficient at language understanding, provided they are pre-trained on a very large corpus.

model named-entity-recognition +5
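
AraBERT is published as a standard BERT checkpoint, so a fill-mask query is enough to sanity-check it. The model identifier aubmindlab/bert-base-arabertv02 is an assumption, and the authors' repository additionally provides a dedicated Arabic text preprocessor that this sketch omits.

```python
# Sketch: fill-mask query against an AraBERT checkpoint (model ID assumed).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="aubmindlab/bert-base-arabertv02")
for prediction in fill_mask("عاصمة فرنسا هي [MASK]."):  # "The capital of France is [MASK]."
    print(prediction["token_str"], round(prediction["score"], 3))
```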

hULMonA: The Universal Language Model in Arabic

1 code implementation WS 2019 Obeida ElJundi, Wissam Antoun, Nour El Droubi, Hazem Hajj, Wassim El-Hajj, Khaled Shaban

Experiment results show that the developed hULMonA and the multi-lingual ULM are able to generalize well to multiple Arabic datasets and achieve new state-of-the-art results in Arabic Sentiment Analysis for some of the tested sets.

Arabic Sentiment Analysis General Classification +6
